Enmotus MiDrive: Rethinking SLC Caching For QLC SSDs
by Billy Tallis on January 30, 2020 8:00 AM EST
For consumer storage, CES 2020 brought a new wave of competition for PCIe 4.0 SSDs and the promise of faster portable SSDs, but the most intriguing product demo was from Enmotus. They are planning a profound change to how consumer SSDs work, ditching drive-managed SLC caching in favor of host-managed tiered storage.
Enmotus is a well-established provider of storage management software. Their most familiar product to consumers is probably FuzeDrive, a limited edition of which is bundled with recent generations of AMD motherboards as AMD StoreMI. This serves as AMD's answer to Intel's Smart Response Technology (SRT) and Optane Memory storage caching systems. Enmotus also has enterprise-oriented products in the same vein. Their new MiDrive technology builds on their existing tiering software to manage a combination of SLC and QLC NAND on a single consumer SSD.
Caching and Tiering Challenges
All software-driven caching or tiering solutions tend to have limited consumer appeal due to the complexity of setting up the system. At least two physical drives are required, and the OS needs to load an extra driver to manage data placement. Any compatibility issue or other glitch can easily render a PC unbootable, and data recovery isn't as straightforward as for a single drive. These hurdles don't scare off enthusiasts and power users, but PC OEMs aren't eager to market and support these configurations.
But without some form of caching or tiering, consumer SSDs would be limited to the raw performance of TLC or QLC NAND. SLC caching managed transparently by the SSD's firmware has been adopted by almost all consumer SSDs in order to improve burst performance, and it has proven to be very effective for consumer workloads. The fundamental limitation of this strategy is that the SSD must work with limited information about the nature and purpose of the user data it is reading and writing.
Most SSDs rely on fairly simple procedures for managing their SLC caches: sending all writes to the cache unless it's full, and using idle time to fold data from SLC into more compact TLC representation, freeing up cache space for future bursts of writes. There are still some choices to be made in implementing SLC caching for consumer SSDs: whether to use a fixed-size or dynamically sized cache, and whether to stall when the cache fills up or divert writes straight to TLC/QLC. As QLC drives become more common, we're also seeing drives that prefer to keep data in the SLC cache long-term until the drive starts to fill up, so that the cache can help with read performance in addition to write performance.
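To make that procedure concrete, here is a minimal toy model of a drive-managed SLC write cache. It is only a sketch: the class name, the 12 GB cache size and the choice to divert overflow straight to TLC/QLC (rather than stall) are assumptions for illustration, not a description of any real SSD firmware.

```python
from dataclasses import dataclass


@dataclass
class SlcCachedDrive:
    """Toy model of a drive-managed SLC write cache (all sizes in GB)."""
    cache_size: float           # hypothetical fixed-size SLC cache
    cache_used: float = 0.0     # data currently waiting in SLC
    direct_writes: float = 0.0  # data that bypassed the full cache
    folded: float = 0.0         # data folded from SLC into TLC/QLC while idle

    def write(self, amount: float) -> None:
        """Send a burst to SLC and divert any overflow straight to TLC/QLC."""
        to_cache = min(amount, self.cache_size - self.cache_used)
        self.cache_used += to_cache
        self.direct_writes += amount - to_cache

    def idle(self, fold_budget: float) -> None:
        """Use idle time to fold up to fold_budget GB out of the cache."""
        moved = min(fold_budget, self.cache_used)
        self.cache_used -= moved
        self.folded += moved


drive = SlcCachedDrive(cache_size=12.0)
drive.write(10.0)  # fits entirely in the cache
drive.write(5.0)   # only 2 GB of cache is left, so 3 GB goes direct
drive.idle(8.0)    # idle time folds 8 GB and frees that much cache
print(drive.cache_used, drive.direct_writes, drive.folded)  # 4.0 3.0 8.0
```

Note that every cached gigabyte is eventually written twice, once to SLC and once more when it is folded, and the drive has no way to tell which of that data deserved the fast tier.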
Enmotus FuzeDrive manual data placement controls
Host-managed caching or tiering opens the door to more intelligent management of data placement, since the host OS has better information about which chunks of data belong to which file, and about the processes and users that interact with those files. It is easier for the host OS to accurately track the history of access patterns for hot vs. cold files. It is also possible to expose manual control of data placement directly to the user.
Two Drives In One
The Enmotus MiDrive technology allows one SSD to present the host with access to two separate pools of flash storage: QLC and SLC managed by the same SSD controller. To implement this, they have partnered with Phison to modify SSD controller firmware. For server products, a single NVMe SSD would expose two separate NVMe namespaces that Linux treats as different block devices. But for consumers, Enmotus has chosen to maximize backwards compatibility by having the MiDrive present itself as a single block device, with the first 32 or 64 GB initially mapped to SLC NAND and the rest of the drive mapped to QLC NAND. This makes it possible (and fast!) to install an OS to a MiDrive without needing any special Enmotus software or drivers. Once the Enmotus driver has been loaded, it takes over the management of data placement using vendor-specific commands that instruct the SSD to promote or demote ranges of Logical Block Addresses (LBAs) between the QLC and SLC pools of flash. (The initialization process for this tiering currently takes about a quarter of a second, because very little data needs to be moved until there's history indicating what should be in QLC vs SLC.)
Enmotus MiDrive 800GB appearing as a single device
This is a lot simpler for the host side than the strategy Intel uses for their Optane Memory H10, which is two separate PCIe devices on one M.2 card and requires special motherboard support to properly detect both halves before the caching software can even get involved. Enmotus is working to make MiDrive even simpler by having Microsoft distribute the Enmotus driver with Windows, so that a MiDrive will be automatically detected and managed by the Enmotus software without requiring any user intervention. For now, Windows will default to using its standard NVMe driver for a MiDrive, but that should change by the time products hit the shelves.
Example of how MiDrive LBA allocation will change with use
(for illustration purposes only, not based on real testing)
Enmotus supports assigning data to SLC or QLC in 4MB chunks, which is probably the size of a single NAND flash erase block in SLC mode, and thus the smallest chunk size that can easily be remapped between the QLC and SLC portions of the drive without contributing to unnecessary write amplification. That 4MB block size means that a small file moved to SLC is likely to bring along other nearby files, which will often contain related data that may also benefit from being in SLC. It also means that large files can be partially resident in SLC and partially in QLC. Since this process doesn't change the logical block addresses a file occupies, Enmotus MiDrive doesn't need to change anything about how NTFS organizes data, and it doesn't need to behave like an advanced disk defragmenter that tries to move important data toward the beginning of the disk. The MiDrive software only needs to look up what LBAs are used by a file and tell the SSD whether to move that data to SLC or QLC blocks. The only side-effect visible to the rest of the OS is a change in the performance characteristics for accessing that part of the SSD.
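The lookup described above can be sketched in a few lines: given the LBA extents a file occupies, work out which 4MB chunks would have to move. This assumes 512-byte logical sectors, chunks aligned to the start of the drive, and 4MB meaning 4 MiB; the function name and the example extents are hypothetical, and the real Enmotus driver and its vendor-specific commands are not public.

```python
SECTOR_BYTES = 512                               # assumed logical sector size
CHUNK_BYTES = 4 * 1024 ** 2                      # 4MB tiering granularity, taken as 4 MiB
SECTORS_PER_CHUNK = CHUNK_BYTES // SECTOR_BYTES  # 8192 sectors per chunk


def chunks_for_extents(extents):
    """Return sorted chunk indices touched by (start_lba, sector_count) extents."""
    chunks = set()
    for start_lba, count in extents:
        if count <= 0:
            continue  # ignore empty extents
        first = start_lba // SECTORS_PER_CHUNK
        last = (start_lba + count - 1) // SECTORS_PER_CHUNK
        chunks.update(range(first, last + 1))
    return sorted(chunks)


# A hypothetical small file: one 16 KiB fragment straddling a chunk boundary
# and a second 4 KiB fragment elsewhere on the drive.
file_extents = [(8176, 32), (40960, 8)]
print(chunks_for_extents(file_extents))  # [0, 1, 5]
```

Promoting this 20 KiB file would pull three whole chunks, or 12 MiB, into SLC, which is the "bringing along nearby files" effect in action.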
The SLC portion of an Enmotus MiDrive differs from a traditional SLC cache not only by being host-managed, but also in how the SSD treats it for wear leveling purposes. A typical SSD's SLC cache may have a static or dynamic size, but in either case when new write commands arrive the SSD will write the data to whatever NAND flash block is currently empty. When the cache is flushed, data from several SLC blocks will be rewritten in TLC or QLC mode to a different empty block, and the SLC blocks are then free to be erased and put back into the pool of available blocks. Managing just one pool of empty blocks means that the actual physical location of the SLC cache can move around over time, and a block that was last used as TLC might end up being used as SLC the next time data is written to it.
By contrast, Enmotus MiDrive technology has the SSD track two entirely separate pools. When the drive is manufactured, the SLC portion is permanently allocated for the lifetime of the drive. Any physical NAND pages and blocks that are used as SLC will always be treated as SLC for the lifetime of the drive, and the same for the QLC portion. The two pools of flash are subject to completely independent wear leveling, even though SLC and QLC portions will exist side by side on each physical flash chip on the drive. This means that the QLC blocks will never be subjected to the short-term Program/Erase cycles of SLC cache filling and flushing. For the SLC blocks, the error correction can be tuned specifically to SLC usage, and that allows Enmotus to achieve around 30k Program/Erase cycles for the SLC portion of the drive (based on Micron QLC NAND). MiDrives will expose separate SMART indicators for the SLC and QLC portions of the drive, so monitoring software will need to be updated to properly interpret this information.
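For a rough sense of what 30k Program/Erase cycles means, the arithmetic below multiplies the SLC capacity by the cycle count. It is a back-of-the-envelope sketch that assumes decimal units and ignores write amplification and over-provisioning, so real-world figures would be lower; the 32GB and 64GB sizes are the SLC allocations mentioned for the consumer MiDrive.

```python
def slc_raw_endurance_tb(slc_gb: float, pe_cycles: int = 30_000) -> float:
    """Raw SLC write volume in TB (capacity x cycles), ignoring write amplification."""
    return slc_gb * pe_cycles / 1000


for size_gb in (32, 64):
    print(size_gb, slc_raw_endurance_tb(size_gb))
# 32 960.0
# 64 1920.0
```

Even the smaller SLC pool works out to roughly 960 TB of raw writes before reaching its rated cycle count, under those simplifying assumptions.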
In principle, it would be possible for either the SLC or QLC portion of the drive to be worn out prematurely, but in practice Enmotus is confident that their tiered storage management software will lead to longer overall drive lifespans than drive-managed SLC caching. Files that are known to be frequently modified will permanently reside on SLC and not be automatically flushed out to QLC during idle time. If the Enmotus software is smart enough, it will also be able to determine which files should skip the SLC and go straight to QLC until it becomes clear that a file is frequently accessed. For example, a file download coming into the machine over gigabit Ethernet will not initially need SLC performance because raw QLC can generally handle sequential writes at that speed (especially with no background SLC cache flushing to slow things down). And if that file is a movie which is infrequently accessed and only read sequentially, there's no reason for it to ever be promoted up to SLC. In general, the tiered storage management done by Enmotus should result in less data movement between SLC and QLC, rather than the increased write amplification that traditional SLC caching causes.
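The gigabit Ethernet example comes down to simple arithmetic plus a placement rule, sketched below. The rule itself is a hypothetical illustration of the idea rather than Enmotus's actual policy, and the 200 MB/s QLC sequential write speed is a placeholder, not a measured figure.

```python
def line_rate_mb_s(link_bits_per_s: float) -> float:
    """Upper bound on payload rate in MB/s, ignoring protocol overhead."""
    return link_bits_per_s / 8 / 1e6


def pick_initial_tier(known_hot: bool, incoming_mb_s: float, qlc_write_mb_s: float) -> str:
    """Send new data to SLC only if it is known hot or QLC cannot keep up."""
    if known_hot or incoming_mb_s > qlc_write_mb_s:
        return "SLC"
    return "QLC"


gigabit = line_rate_mb_s(1e9)
print(gigabit)  # 125.0
# Placeholder QLC speed of 200 MB/s: the download can go straight to QLC.
print(pick_initial_tier(known_hot=False, incoming_mb_s=gigabit, qlc_write_mb_s=200.0))  # QLC
# A file already known to be frequently modified stays on SLC regardless.
print(pick_initial_tier(known_hot=True, incoming_mb_s=gigabit, qlc_write_mb_s=200.0))   # SLC
```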
Since the SLC portion of an Enmotus MiDrive is a slice carved out of regular QLC NAND, it cannot offer all the benefits of specialized low-latency SLC NAND like Samsung's Z-NAND or Kioxia/Toshiba XL-Flash. The SLC portion of a MiDrive won't be appreciably faster than the SLC cache of a traditional consumer SSD, but that performance will be more consistent and predictable when working with files that are kept entirely on the SLC portion of the drive.
The Business Model
Enmotus MiDrive is currently implemented as a combination of Windows driver software and custom SSD firmware for Phison NVMe controllers, but it does not require any custom hardware. This means that any vendor currently selling Phison E12 NVMe SSDs can make a MiDrive-based product by licensing and shipping Enmotus firmware. PC OEMs can adopt MiDrives by switching to drives with Enmotus firmware and either including the Enmotus drivers in their Windows images or relying on them to be distributed through Windows Update. No motherboard firmware or hardware modifications are required, nor are any changes to the process of provisioning a machine and preparing it for delivery to the end user. Enmotus is engaging with both PC OEMs and vendors of retail SSDs, so we can expect pre-built systems with Enmotus MiDrive technology as well as upgrade options usable on any Windows 10 PC that already supports standard M.2 NVMe SSDs. Enmotus is optimistic about uptake from PC OEMs, expecting MiDrive to get a much better reception than Intel's Optane H10 did.
The basic MiDrive products will be fully automatic, with the Enmotus driver pre-installed or installed automatically when a MiDrive is detected. Data placement decisions will be completely behind-the-scenes. For enthusiasts, there will also be a premium tier similar to their current FuzeDrive software, which includes Windows Explorer shell integration so that individual files can be manually promoted or demoted, either permanently or for a limited period of time. Enmotus will also be providing a drive health monitoring tool that will include their estimate for how much extra drive lifetime has been won by using their tiering instead of ordinary SLC caching.
Mockup of Enmotus MiDrive SSD health monitoring tool
Enmotus expects SSDs with MiDrive technology to mostly use either 32GB or 64GB SLC portions and offer total capacities from about 400GB up to around 2TB, but the exact configurations will be determined by what their partners want to bring to market. Enmotus is also planning enthusiast-oriented solutions supporting RAID-0 style striping across multiple physical drives, and solutions for single-package BGA SSDs that go into small form factor and embedded devices.
Enmotus MiDrive technology will add to the price of SSDs, but since we're talking about QLC storage that's only relative to the cheapest NVMe SSDs available, and the final sticker prices will still be competitive for consumer SSDs. In return for that, users should get better real-world performance and enough effective write endurance to justify a 5-year warranty. We're looking forward to testing out this technology later this year, even though it will further complicate our benchmarking process. Enmotus is already sampling to interested OEMs.
sheh - Friday, January 31, 2020
QLC isn't for long-term storage, due to bad retention.
PaulHoule - Thursday, January 30, 2020
It might work out.
It might not. When you try to make something fast out of slow components it is basically a losing game. If the OS and the drivers are smart about telling the drive where to locate files it would be better, if they make bad decisions it could be horribly worse.
In real life Windows machines run virus checking and often have processes running on them that do a lot of I/O. For instance, my work computer has a client on it that scans the disk periodically looking for bank account numbers, credit card numbers, and PDF files that have lists of coordinates that look like "#### #### #### ####". If those programs interact badly with the disk, then you'll have problems.
What's guaranteed to work is something that is fast all the time, and most of all you need to pay attention to 95-99% latency because what drives you nuts as a computer user is not the median case, it's the occasional slow case that has you looking at a spinner.
Billy Tallis - Thursday, January 30, 2020
Background file scanning software like you describe shouldn't cause too much interference with tiered storage like what Enmotus does. It should be reading most files with about the same frequency, and thus wouldn't have much impact on which files are identified as hot data. The Enmotus tiering software is deliberately conservative about automatically promoting data to the fast tier, to avoid cache thrashing.
Drkrieger01 - Thursday, January 30, 2020
Heh, reading this MLC/TLC/QLC debate is funny...
At the end of the day, do you have the data from your MLC/TLC/QLC drive backed up anywhere else? If you do, then get the drive that fits your usage scenario.
extide - Thursday, January 30, 2020
BINGO this is the answer -- get what best meets your I/O to dollar capability
DanNeely - Thursday, January 30, 2020
I'm intrigued by the concept and hope you can get a device to test in the near future. Anything that can make QLC less terrible under any but very light loads is a good thing; and there doesn't seem to be any reason this couldn't also be used to enhance the cache on TLC drives too.
29a - Thursday, January 30, 2020
Have any of you people even used a QLC drive, they are nowhere near as bad as you people are making them and they are fucking awesome compared to a hard drive.
PeachNCream - Thursday, January 30, 2020
Yes, I have. Compared to a hard drive, they are quicker. Compared to TLC, MLC, and SLC they are slower. Perception of performance has a lot to do with the sort of comparisons you make, but you should already know that without someone having to tell you.
trparky - Thursday, January 30, 2020
QLC may be slower but to the average user, they wouldn't be able to tell the difference between it and TLC. Other than showing benchmark numbers, real-world performance isn't going to show much difference.
name99 - Thursday, January 30, 2020
This looks, I gotta say, like a copy of Apple's Fusion drive along pretty much every dimension, from the static partitioning to the host side decisions to the simplified setup. Main thing missing is the tight integration with the file system.
Which is fine --- if you're going to copy, copy from the best!
(I do wonder if, at some point, Apple will offer an SLC+QLC Fusion solution for low-end Macs. Honestly it makes more sense than continuing with a hard drive.
Maybe it's one more thing waiting for the grand realignment once the ARM mac arrives, which presumably will include more Apple control over the flash controller?)