Caching And Tiering: Intel Optane Memory H20 and Enmotus FuzeDrive SSD Reviewed
by Billy Tallis on May 18, 2021 2:00 PM EST
An Alternative: Enmotus FuzeDrive SSD
Enmotus is a well-established commercial vendor of storage management software. Their existing FuzeDrive software is a hardware-independent competitor to Intel's RST and Optane Memory software. At CES 2020, Enmotus announced their first hardware product: MiDrive, an SSD combining QLC and SLC using Enmotus FuzeDrive software. This eventually made it to market as the FuzeDrive SSD, more closely matching the branding of their software products.
Like the Intel Optane Memory H20, the FuzeDrive SSD is almost two drives in one: a small SLC SSD and a large QLC SSD. But Enmotus implements it in a way that avoids all the compatibility limitations of the Optane Memory H20. The hardware is that of a standard 1 or 2 TB QLC SSD using the Phison E12S controller, the same as a Sabrent Rocket Q or Corsair MP400. The SSD's firmware does some very non-standard things behind the scenes: a fixed portion of the drive's NAND is set aside to permanently operate as SLC. This pool of NAND is wear-leveled independently from the QLC portion of the drive. The host system sees a single device with one pool of storage, but the first 24GB or 128GB of logical block addresses are mapped to the SLC part of the drive and the rest is the QLC portion. The Enmotus FuzeDrive software abstracts over this to move data in and out of the SLC portion.
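To make the address layout concrete, here is a minimal Python sketch of how a logical block address could be classified as landing in the SLC or QLC region. The function name `tier_for_lba` and the 512-byte logical block size are assumptions for illustration; this is not the drive's firmware logic or the Enmotus driver, just the arithmetic implied by "the first N GiB of LBAs are SLC".

```python
GIB = 1024 ** 3
SECTOR_BYTES = 512  # assumption: 512-byte logical blocks


def tier_for_lba(lba: int, slc_gib: int, sector_bytes: int = SECTOR_BYTES) -> str:
    """Return which region a logical block address falls in.

    Hypothetical helper: the first `slc_gib` GiB of LBAs are treated as
    the static SLC region, everything after that as QLC.
    """
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    slc_sectors = slc_gib * GIB // sector_bytes
    return "SLC" if lba < slc_sectors else "QLC"


# First LBA past a 128 GiB SLC region, under the 512-byte assumption.
boundary = 128 * GIB // SECTOR_BYTES

print(tier_for_lba(0, 128))             # first block of the drive
print(tier_for_lba(boundary - 1, 128))  # last block of the SLC region
print(tier_for_lba(boundary, 128))      # first block of the QLC region
```

Because the split is a fixed LBA boundary, the host software only needs to choose which addresses to write in order to control which kind of NAND holds the data.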
Enmotus FuzeDrive does tiered storage rather than caching: the faster SLC portion adds to the total usable capacity of the volume, rather than just being a temporary home for data that will eventually be copied to the slower device. By contrast, putting a cache drive in front of a slower device using Intel's caching solution doesn't increase usable capacity; it just improves performance.
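The capacity difference between the two approaches comes down to simple arithmetic, sketched below in Python. The function `usable_capacity_gib` is a hypothetical helper, and the figures are the SLC and QLC capacities listed for the smaller FuzeDrive model in this article's specifications table; applying the "caching" mode to those same numbers is purely illustrative, since the FuzeDrive SSD is not sold as a cache.

```python
def usable_capacity_gib(fast_gib: int, slow_gib: int, mode: str) -> int:
    """Usable volume capacity for a fast + slow device pair.

    Hypothetical helper for illustration:
    - "tiering": each block lives on exactly one device, so capacities add.
    - "caching": the fast device only holds copies of data that also lives
      on the slow device, so it adds no capacity.
    """
    if mode == "tiering":
        return fast_gib + slow_gib
    if mode == "caching":
        return slow_gib
    raise ValueError(f"unknown mode: {mode!r}")


# Smaller FuzeDrive model: 24 GiB SLC + 814 GiB QLC.
print(usable_capacity_gib(24, 814, "tiering"))  # 838
print(usable_capacity_gib(24, 814, "caching"))  # 814
```

The tiered result matches the 838 GiB total usable capacity quoted for that model.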
As an extra complication to the FuzeDrive SSD, the QLC portion of the drive operates exactly like a regular consumer QLC SSD, albeit with an unusual capacity. That means the QLC portion has its own drive-managed dynamic SLC caching that is entirely separate from the static SLC portion at the beginning of the drive.
Boot support is achieved by installing a UEFI driver module that the motherboard firmware loads and uses to access the tiered storage volume where the OS resides. Intel ships a comparable UEFI implementation of their caching system as part of the motherboard firmware, whereas Enmotus needs to install it separately to the SSD's EFI System Partition. Some NVMe RAID solutions, such as those from HighPoint, put their UEFI driver in an option ROM.
|Enmotus FuzeDrive P200 SSD Specifications|
|Form Factor||double-sided M.2 2280|
|NAND Flash||Micron 96L 1Tbit QLC|
|QLC NAND Capacity||814 GiB||1316 GiB|
|Fixed SLC NAND Capacity||24 GiB||128 GiB|
|Total Usable Capacity||838 GiB
|Sequential Read||3470 MB/s|
|Sequential Write||2000 MB/s||3000 MB/s|
|Retail Price||$199.99 (22¢/GB)||$349.99 (23¢/GB)|
The performance specs for the Enmotus FuzeDrive P200 are nothing special; after all, advertised performance for ordinary consumer SSDs is already based on the peak performance attainable from the drive-managed SLC cache. The FuzeDrive SSD can't really aspire to offer much better peak performance than mainstream NVMe SSDs. Rather, the host-managed tiering instead of drive-managed caching changes the dynamics of when and how long a real-world workload will experience that peak performance from SLC NAND. The ability to manually mark certain files as permanently resident in the SLC portion of the drive means most of the unpredictability of consumer SSD performance can be eliminated. This is possible with both the Enmotus FuzeDrive SSD and with Intel Optane Memory caching (when configured in the right mode), but the larger FuzeDrive SSD model's 128GB SLC portion can accommodate a much wider range of applications and datasets than a 32GB Optane cache, even if the latter does have lower latency.
Write endurance is a bit more complicated when there are two drives in one, because in principle it is possible to wear out one section of the drive before the other. Intel sidesteps this question by making the Optane Memory H20 an OEM-only drive, so the warranty is whatever the PC vendor feels like offering for the system as a whole. Enmotus is selling their drive direct to consumers, so they need to be a bit more clear about warranty terms. Ultimately, the drive's own SMART indicators for wear are what determines whether the FuzeDrive SSD is considered to have reached its end of life. Enmotus has tested the FuzeDrive SSD and their software against the JEDEC-standard workloads used for determining write endurance, and from that they've extrapolated the above estimated write endurance numbers that should be roughly comparable to what applies to traditional consumer SSDs. Unusual workloads or bypassing the Enmotus tiering software could violate the above assumptions and lead to a different total lifespan for the drive.
The estimated write endurance figures for the FuzeDrive SSD look great for a QLC drive, and getting more than 1 DWPD as on the 1.6TB model is good even by the standards of high-end consumer SSDs. The tiering strategy used by FuzeDrive will tend to produce less data movement than caching as done by Intel's Optane Memory, and the SLC portion of the FuzeDrive SSD is rated for 30k P/E cycles. So it really is plausible that the 1.6TB model could last for 3.6 PB of carefully-placed writes, despite using QLC NAND for the bulk of the storage.
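A quick back-of-the-envelope check of these numbers, as a Python sketch. The 128 GiB SLC size, 30k P/E cycles, and 3.6 PB rating come from the text; the 5-year rating period used for the DWPD figure is an assumption not stated here, and the SLC budget ignores write amplification, so treat both results as rough estimates rather than vendor figures.

```python
GIB = 1024 ** 3
GB = 10 ** 9
TB = 10 ** 12
PB = 10 ** 15


def slc_write_budget_bytes(slc_gib: int, pe_cycles: int) -> int:
    """Raw bytes the SLC region can absorb, ignoring write amplification."""
    return slc_gib * GIB * pe_cycles


def dwpd(endurance_bytes: int, capacity_bytes: int, years: int) -> float:
    """Drive writes per day implied by an endurance rating over `years`."""
    return endurance_bytes / (capacity_bytes * 365 * years)


slc_budget = slc_write_budget_bytes(128, 30_000)
rated = 3600 * TB          # 3.6 PB rating for the 1.6TB model
capacity = 1600 * GB       # nominal 1.6TB
ASSUMED_YEARS = 5          # assumption: rating period is not given in the text

print(f"SLC-only write budget: {slc_budget / PB:.2f} PB")            # 4.12 PB
print(f"Implied DWPD: {dwpd(rated, capacity, ASSUMED_YEARS):.2f}")   # 1.23
```

Under these assumptions the static SLC region alone could absorb more raw writes than the whole drive is rated for, which is why the rating hinges on the tiering software keeping most writes out of the QLC region.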
Caching makes storage benchmarking harder, by making current performance depend highly on previous usage patterns. Our usual SSD test suite is designed to account for ordinary drive-managed SLC caching, and includes tests intended to stress just a drive's cache as well as tests designed to go beyond the cache and reveal the performance of the slower storage behind the cache.
Software-managed caching and tiering make things even harder. The Intel Optane Memory and Enmotus FuzeDrive software is Windows-only, but large parts of our test suite use Linux for better control and lower overhead. There are SSD caching software solutions for Linux, but they come with their own data placement algorithms and heuristics that are entirely different from what Intel and Enmotus have implemented in their respective drivers, so testing bcache or lvmcache on Linux would not provide useful information about how the Intel and Enmotus drivers behave.
Our ATSB IO trace tests bypass the filesystem layer and deal directly with block devices, so caching/tiering software cannot do file-level tracking of hot data during those tests. Even if we could get these tests to run on top of software-managed caching or tiering, we'd be robbing the software of valuable information it could use to make smarter decisions than a purely drive-managed cache.
All of our regular SSD test suite is set up to have the drive under test as a secondary drive, with the testbed's OS and benchmarking software running off a separate boot drive. For the Optane Memory H20, Intel has provided a laptop that only has one M.2 slot, so testing the H20 as a secondary drive would be a bit inconvenient.
For all of these reasons, we're using a slightly different testing strategy and mix of benchmarks for this review. Where possible, we've tested the individual components on our regular test suite without the caching/tiering software. Our regular AMD Ryzen testbed detects the Optane side of the H20 and H10 when they are installed into the M.2 slots, so we've tested those with our usual synthetic tests to assess how much extra performance Intel is really getting out of the newer Optane device. The FuzeDrive SSD was partitioned and the SLC and QLC partitions tested independently. Our power measurements for these tests are still for the whole M.2 card even when only using part of the hardware.
The synthetic benchmarks tell us the performance characteristics of the fast and slow devices that the storage management software has to work with, but we need other tests to show how the combination behaves with the vendor-provided caching or tiering software. For this, we're using two suites of application benchmarks: BAPCo SYSmark 25 and UL PCMark 10. These tests cover common consumer PC usage scenarios and the scores are intended to reflect overall system performance. Since most consumer workloads are relatively lightweight from a storage perspective, there isn't much opportunity for faster storage to bring a big change in these scores. (Ironically, the process of installing SYSmark 25 would make for a much more strenuous storage benchmark than actually running it, but the installer unfortunately does not have a benchmark mode.)
To look a bit closer at storage performance specifically while using the caching or tiering software, we turn to the PCMark 10 Storage tests. These are IO trace based tests like our ATSB tests, but they can be run on an ordinary filesystem and don't bypass or interfere with caching or tiering software.
Since the Intel Optane Memory H20 is only compatible with select Intel platforms, we're using Intel-provided systems for most of the testing in this review. Intel shipped our H20 sample preinstalled in an HP Spectre x360 15-inch notebook, equipped with a Tiger Lake processor, 16GB of RAM and a 4k display. We're using the OS image Intel preloaded, which included their drivers plus PCMark and a variety of other software and data to get the drive roughly half full, so that not everything can fit in the cache. For other drives, we cloned that image, so software versions and configurations match. We have also run some tests on the Whiskey Lake notebook Intel provided for the Optane Memory H10 review in 2019.
|Optane Memory Review Systems|
|Platform||Tiger Lake||Whiskey Lake|
|CPU||Intel Core i7-1165G7||Intel Core i7-8565U|
|Motherboard||HP Spectre x360 15.6"||HP Spectre x360 13t|
|Memory||16GB DDR4-2666||16GB DDR4-2400|
|Power Supply||HP 90W||HP 65W USB-C|
|OS||Windows 10 20H2, 64-bit|
A few things are worth noting about this Tiger Lake notebook: While the CPU provides some PCIe 4.0 lanes, this machine doesn't let them run beyond PCIe 3.0 speed. The DRAM running at just DDR4-2666 also falls far short of what the CPU should be capable of (DDR4-3200 or LPDDR4-4266). The price of this system as configured is about $1400, which really should be enough to get a machine that comes closer to using the full capabilities of its main components. There's also a ridiculous amount of coil whine while it's booting.
powerarmour - Tuesday, May 18, 2021
QLC garbage again, I can hardly contain myself.
Samus - Wednesday, May 19, 2021
Understanding QLC's place in the market (cheap bulk flash storage) I'm also struggling to understand who these premium-priced QLC products are for. Seriously, who is going to pay 23-25¢/GB for something like this when its only crutch is high read throughput that has zero real world advantage for virtually all PC users?
Wereweeb - Wednesday, May 19, 2021
These products are both proofs of concept, and an advertisement for the importance of Caching/Tiering.
Enmotus managed to get 3600 TBW out of a 2TB QLC SSD by reducing its available capacity by a bit and using their software.
philehidiot - Wednesday, May 19, 2021
There is definitely the endurance advantage, but you don't need a commercial product for proof of concept. Indeed, I'd say releasing a commercial product just to prove it can be done where there is no real use for it is a bit daft. Unless they plan to inflict it upon customers in a data collection exercise, using their muscle to force it into laptops. We have already seen the advantages of this kind of tech when smaller SSDs were placed as a cache / tier into HDDs.
If their plan is to build this into an industrial product, their proof of concept should be a bunch of engineering samples tested for endurance, not a bodged consumer grade product which seems as though it's going to do more to show you can have a very complex and bodged product and it just about compete with what's already established on the market.
As for advertising, I'd say this is a pretty poor advert. Someone mentioned that Intel's storage division has been held back and it strikes me this is the case. This isn't a new and exciting product, it's two technologies being put together with an inadequate hardware interface and terrible software.
It has potential, but the people who will accept QLC NAND won't know or care what this is and the people who might benefit from the high DWPD won't touch it with a barge pole.
This should have stayed in R&D until it could add something to the market.
Samus - Thursday, May 20, 2021
I'll believe it when it's independently tested. No level of software trickery will enable massive gains in TBW. If you fully write to a drive, the physical cells are fully utilized. Sure you can mask this with a large spare area and aggressive wear leveling but even a 2TB QLC SSD with 4TB of physical NAND (so 2TB spare area) will only yield 4x the endurance and that's best case scenario.
Enmotus can't break the laws of physics with intelligent software unless they've come up with some revolutionary hardware deduplication/compression algorithm that is limiting physical changes to NAND by many orders of magnitude, while also eliminating write amplification that is essential to modern ECC for data integrity.
Billy Tallis - Thursday, May 20, 2021
The key advantage the Enmotus drive has over regular QLC drives is that the static SLC portion can be used for far more P/E cycles. On a regular QLC drive, which blocks are used for the dynamic SLC cache is constantly changing, and the fact that a block that's currently operating as SLC may soon be repurposed as QLC effectively prevents it from being rated for more P/E cycles than QLC usage can permit. But with a large pool of permanent SLC, the drive can safely re-use those cells long past the point where they would be unusable as QLC. 128GiB at 30k P/E cycles can on its own handle more total writes than the drive as a whole is rated for.
As long as the tiering software does a good job of preventing most writes and write amplification from ever getting to the QLC part of the drive, the endurance rating is completely realistic. The tiering software won't be able to keep the wear confined to the SLC if you are using the drive as a giant circular buffer for video recording or something else that keeps the drive full and constantly modifies all of the data. But most real consumer workloads have a small amount of hot data that's frequently changing and a large amount of cold data that doesn't get rewritten often enough to pose a problem for QLC.
Spunjji - Wednesday, May 19, 2021
Agreed - this would really need to show a serious performance benefit at a similar cost to a TLC drive, or lower cost and similar performance. As it is, it does neither. I'm sure OEMs will lap it up at whatever knockdown price Intel offers it to them to clear the shelves.
Spunjji - Wednesday, May 19, 2021
Derped there and confused the price of the Enmotus with the H20... the Enmotus product really does seem to be in a bad place for price vs. consumer appeal without the benefit of Intel's cosy relationship with OEMs.
Morawka - Friday, May 21, 2021
The Enmotus product is perfect for Chia miners. Plotting on Chia absolutely destroys consumer-grade SSDs. A 980 Pro will get smoked in around 3 months, whereas this Enmotus drive, even though it's pricier, will last 3-5x longer.
Billy Tallis - Friday, May 21, 2021
I think Chia plotting requires more space than the SLC portion of the Enmotus drive, and plotting is an example of the kinds of workloads that would not be handled well by the Enmotus tiering software unless the plotting could fit entirely in the SLC tier.