Version 2.0 of the NVM Express specification has been released, keeping up the roughly two-year cadence for the storage interface that is now a decade old. Like other NVMe spec updates, version 2.0 comes with a variety of new features and functionality for drives to implement (usually as optional features). But the most significant change, and the reason this is called version 2.0 instead of 1.5, is that the spec has been drastically reorganized to better fit the broad scope of features that NVMe now encompasses. From its humble beginnings as a block storage protocol operating over PCI Express, NVMe has grown into one of the most important networked storage protocols, and it now supports storage paradigms that are entirely different from the hard drive-like block storage abstraction it originally provided.

Instead of a base specification for typical PCIe SSDs and a separate NVMe over Fabrics spec, version 2.0 is designed to be a more modular specification and has been split into several documents. The base specification now covers both locally-attached devices and NVMeoF, but more abstractly—enough has been moved out of the base spec that it is no longer sufficient to define all of the functionality needed to implement a simple SSD. Real devices will also need to refer to at least one Transport spec and at least one Command Set spec. For typical consumer SSDs, that means using the PCIe transport spec and the block storage command set. Other transport options currently include networked NVMe over Fabrics using either TCP or RDMA. Other command set options include Zoned Namespace and Key-Value command sets. We already covered Zoned Namespaces in depth when it was approved for inclusion last year. The three standardized command sets (block, zoned, key-value) cover different points along the spectrum from simple SSDs with thin abstractions over the underlying flash, to relatively complicated, smart drives that take on some of the storage management tasks that would have traditionally been handled by software on the host system.
 
Many of the new features in NVMe 2.0 are minor extensions to existing functionality, making those features more useful and more broadly usable. For example, partitioning a device's storage into NVM Sets and Endurance Groups was introduced in NVMe 1.4, but the spec didn't say how those divisions would be created; that configuration would either need to be hard-coded by the drive's firmware, or handled with vendor-specific commands. NVMe 2.0 adds a standard capacity management mechanism for endurance groups and NVM sets to be allocated, and also adds another layer of partitioning (Domains) for the sake of massive NVMeoF storage appliances that needed more tools for slicing up their pool of available storage, or isolating the performance impacts of different users on shared drives or arrays.
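
The nesting described above can be pictured with a small toy model. This is only a sketch of the idea of carving capacity out of a parent pool: the class names, the gigabyte units, and the `create_*` methods are assumptions made for illustration, and they do not mirror the actual capacity management command or its data structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NvmSet:
    set_id: int
    capacity_gb: int


@dataclass
class EnduranceGroup:
    group_id: int
    capacity_gb: int
    nvm_sets: List[NvmSet] = field(default_factory=list)

    def unallocated_gb(self) -> int:
        # Capacity in this group not yet handed out to an NVM set.
        return self.capacity_gb - sum(s.capacity_gb for s in self.nvm_sets)

    def create_nvm_set(self, set_id: int, capacity_gb: int) -> NvmSet:
        if capacity_gb > self.unallocated_gb():
            raise ValueError("not enough unallocated capacity in endurance group")
        nvm_set = NvmSet(set_id, capacity_gb)
        self.nvm_sets.append(nvm_set)
        return nvm_set


@dataclass
class Domain:
    domain_id: int
    capacity_gb: int
    endurance_groups: List[EnduranceGroup] = field(default_factory=list)

    def unallocated_gb(self) -> int:
        # Capacity in this domain not yet handed out to an endurance group.
        return self.capacity_gb - sum(g.capacity_gb for g in self.endurance_groups)

    def create_endurance_group(self, group_id: int, capacity_gb: int) -> EnduranceGroup:
        if capacity_gb > self.unallocated_gb():
            raise ValueError("not enough unallocated capacity in domain")
        group = EnduranceGroup(group_id, capacity_gb)
        self.endurance_groups.append(group)
        return group


# Hypothetical 1000 GB domain, split into one 600 GB endurance group
# that in turn hosts one 400 GB NVM set.
domain = Domain(domain_id=1, capacity_gb=1000)
group = domain.create_endurance_group(group_id=1, capacity_gb=600)
group.create_nvm_set(set_id=1, capacity_gb=400)
```

In this model, a request that exceeds what the parent still has free is rejected, which is the kind of bookkeeping that previously had to live in firmware defaults or vendor-specific tooling.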

The NVMe spec originally anticipated the possibility of multiple command sets beyond the base block storage command set. But the original mechanism included for supporting multiple command sets is not adequate for today's use cases: a handful of reserved bits in the controller capabilities data structure are not enough to encompass all the possibilities for what today's SSDs might implement. In particular, the new system for handling multiple command sets now makes it possible for different namespaces behind the same controller to support different command sets, rather than requiring all namespaces to support all of the command sets their parent controller supports.
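
As a rough illustration of per-namespace command sets, the sketch below tags each namespace with a single command set identifier and has the controller check it against its own supported list. The `Controller` class and its methods are hypothetical, and the numeric identifier values are given as I understand them from the 2.0 documents, so they should be checked against the spec before being relied on.

```python
from enum import IntEnum
from typing import Dict, FrozenSet


class CommandSet(IntEnum):
    # Identifier values are an assumption; verify against the specification.
    NVM = 0x00        # traditional block storage
    KEY_VALUE = 0x01
    ZONED = 0x02


class Controller:
    """Toy controller: supports several command sets, one per namespace."""

    def __init__(self, supported: FrozenSet[CommandSet]) -> None:
        self.supported = supported
        self.namespaces: Dict[int, CommandSet] = {}

    def attach_namespace(self, nsid: int, command_set: CommandSet) -> None:
        # A namespace only needs ONE of the controller's command sets,
        # not all of them.
        if command_set not in self.supported:
            raise ValueError(f"controller does not support {command_set.name}")
        self.namespaces[nsid] = command_set


ctrl = Controller(frozenset({CommandSet.NVM, CommandSet.ZONED}))
ctrl.attach_namespace(1, CommandSet.NVM)    # ordinary block namespace
ctrl.attach_namespace(2, CommandSet.ZONED)  # zoned namespace on the same controller
```

The point of the design is visible in the last two lines: a block namespace and a zoned namespace coexist behind one controller without either having to implement the other's command set.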

Zoned and key-value command sets were already on the radar when NVMe 1.4 was completed, and now those technologies have been incorporated into 2.0 with equal status to the original block storage command set. Future command sets, such as one for computational storage drives, are still works in progress and not ready for standardization, but the NVMe spec can now incorporate such developments more easily. NVMe could in principle also add an Open Channel command set that exposes most or all of the raw details of managing NAND flash memory (pages, erase blocks, defect management, etc.), but the general industry consensus is that the zoned storage paradigm strikes a more reasonable balance, and interest in Open Channel SSDs is waning in favor of Zoned Namespaces.

For enterprise use cases, NVMe inherited Protection Information support from SCSI/SAS: some extra information is associated with each logical block and used to verify end-to-end data integrity. NVMe 2.0 extends Protection Information from 16-bit CRCs to also cover 32-bit and 64-bit CRCs, allowing for more robust data protection in large-scale storage systems.
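
To make the 16-bit case concrete, here is a minimal sketch of the classic 8-byte tuple inherited from SCSI: a 16-bit guard CRC over the block data, a 16-bit application tag, and a 32-bit reference tag (commonly derived from the block address). The guard uses the T10-DIF polynomial 0x8BB7. The helper names `make_pi` and `verify_pi` are made up for this example, and the newer 32-bit and 64-bit guard formats are not shown.

```python
import struct

T10_DIF_POLY = 0x8BB7  # polynomial of the 16-bit guard CRC used by T10 DIF


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF: MSB-first, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_pi(block: bytes, app_tag: int, ref_tag: int) -> bytes:
    """Build the 8-byte tuple: guard (2) | application tag (2) | reference tag (4)."""
    return struct.pack(">HHI", crc16_t10dif(block), app_tag, ref_tag)


def verify_pi(block: bytes, pi: bytes, expected_ref_tag: int) -> bool:
    """Check that the data matches the guard and the block is the one expected."""
    guard, _app_tag, ref_tag = struct.unpack(">HHI", pi)
    return guard == crc16_t10dif(block) and ref_tag == expected_ref_tag


# Hypothetical 512-byte logical block at block address 42.
block = bytes(range(256)) * 2
pi = make_pi(block, app_tag=0, ref_tag=42)

# Flip a single bit to simulate corruption somewhere along the data path.
corrupted = bytes([block[0] ^ 0x01]) + block[1:]
```

Here `verify_pi(block, pi, 42)` succeeds, while the corrupted block or a mismatched reference tag fails the check. A wider guard field lowers the chance that a corrupted block happens to produce a matching CRC, which matters more as the amount of data being protected grows.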

NVMe 2.0 introduces a significant new security feature: command group control, configured using a new Lockdown command. NVMe 1.4 added a namespace write protect capability that allows the host system to put namespaces into a write-protect mode until explicitly unlocked or until the drive is power cycled. NVMe 2.0's Lockdown allows similar control to disallow other commands. This can be used to put a drive in a state where both ordinary reads and writes are allowed, but various admin commands are locked out so the drive's other features cannot be reconfigured. As with the previous write protect feature, this command group control supports setting these restrictions until they are explicitly removed, or until a power cycle.
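
The two lifetimes for a restriction can be sketched with a toy state machine. This is not the Lockdown command's real interface: the `LockdownState` class, its method names, and the command labels used as dictionary keys are all assumptions chosen to illustrate the behavior described above.

```python
from typing import Dict


class LockdownState:
    """Toy model of which commands a drive currently refuses."""

    def __init__(self) -> None:
        # command label -> True if the restriction survives a power cycle
        self._prohibited: Dict[str, bool] = {}

    def prohibit(self, command: str, survives_power_cycle: bool) -> None:
        self._prohibited[command] = survives_power_cycle

    def allow(self, command: str) -> None:
        # Explicit removal of a restriction.
        self._prohibited.pop(command, None)

    def power_cycle(self) -> None:
        # Only restrictions marked as persistent remain in force.
        self._prohibited = {c: True for c, keep in self._prohibited.items() if keep}

    def is_allowed(self, command: str) -> bool:
        return command not in self._prohibited


state = LockdownState()
state.prohibit("format_nvm", survives_power_cycle=True)        # until explicitly removed
state.prohibit("firmware_commit", survives_power_cycle=False)  # until next power cycle
allowed_before = {c: state.is_allowed(c) for c in ("read", "format_nvm", "firmware_commit")}

state.power_cycle()
allowed_after = {c: state.is_allowed(c) for c in ("read", "format_nvm", "firmware_commit")}
```

In this sketch, reads stay available throughout, the temporary restriction disappears after the power cycle, and the persistent one stays until `allow` is called.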

For NVMe over Fabrics use cases, NVMe 2.0 clarifies how to handle firmware updates and safe device shutdown in scenarios where the shared storage is accessible through multiple controllers. There's also now explicit support for hard drives. Even though it's unlikely that hard drives will switch anytime soon to natively use PCIe connections instead of SAS or SATA, supporting rotational media means enterprises can unify their storage networking with NVMe over Fabrics and drop older protocols like iSCSI.

Overall, NVMe 2.0 doesn't bring as much in the way of new functionality as some of the previous updates. In particular, nothing in this update stands out as being relevant to client/consumer SSDs. But the spec reorganization should make it easier to iterate and experiment with new functionality, and the next several years will hopefully see more frequent updates with smaller changes rather than bundling up two or three years of work for big spec updates.

 

Source: NVM Express

24 Comments

  • jeremyshaw - Thursday, June 3, 2021

    3 specs that went from 1.4 to 2.0: HDMI, DP and now NVMe :D
  • Tomatotech - Thursday, June 3, 2021

    - What’s that Dad?

    - Son, it’s a NVMe HDD. A spinning platter of glass or rust attached to a multi-gigabit NVMe port.

    - Dad, I don’t understand?

    - We had reasons, sonny, reasons.
  • igor velky - Thursday, June 3, 2021

    TSMC is building processors with direct die connect of NVME flash
    and you're telling your son about spinning disks? WT..
  • baka_toroi - Thursday, June 3, 2021

    Pay more attention. His comment was tongue in cheek because NVMe 2.0 added support for rotational media.
  • sharathc - Friday, June 4, 2021

    He won't. Maybe he is just a "troll and forget" person.
  • mode_13h - Monday, June 7, 2021

    The news commenting system does tend to encourage that, mostly to its credit.
  • ET - Friday, June 4, 2021

    Well, son, glass is just another form of sand, like the sand used in SSDs.
  • DanNeely - Friday, June 4, 2021

    All joking aside, if HDDs remain relevant for bulk/lower availability storage long enough into the future, they'll eventually hit the capacity limits of current SATA/SAS connections. Especially if they scale from the 2 groups of read/write heads Seagate is currently doing toward each platter being read/writable independently and in parallel.

    Being able to switch to NVME instead of needing to update the existing HDD interconnects would save a lot of work at the standardization committee level.
  • Nexing - Friday, June 4, 2021

    If "HDDs remain relevant for bulk/lower availability storage long enough into the future"...

    So far SSDs are useless as long-term and even mid-term storage data. The information just fades out.
    So much over my grieving feelings, HDDs remain the sole large capacity (over the near eternal but only 35GB or 50GB capacity optical M-discs) retrieval medium. To have an interconnect that acknowledges this is essential.
  • Nexing - Friday, June 4, 2021

    *25GB or 50GB capacity optical M-discs
