Disk strategies

With magnetic disks, there are two strategies for getting good OLTP or mail server performance. The "traditional way" is to combine a number of 15000RPM SAS "spindles", all working in parallel. The more "rebellious way", or "Google way", is to use a vast number of cheaper SATA drives. The latter strategy is based on the observation that although SATA drives come with higher access times, you can buy more SATA spindles than SAS spindles for the same price. While Google opted for desktop drives, we worked with what we had in the lab: 16 enterprise 1TB Western Digital drives. Since these are among the fastest 7200RPM drives on the market, they should give you a good idea of what an array with lots of SATA drives can do compared to one with fewer, faster spinning SAS drives.

SSDs add a new strategy: if capacity is not your primary problem, you can trade storage space for huge numbers of random I/O operations per second, requiring fewer but far more expensive drives to obtain the same performance. SSDs offer superb read access times but slightly less impressive write access times.

As Anand has pointed out, a cheap SSD controller can really wreak havoc on write performance, especially in a server environment where many requests are issued in parallel. EMC solved this with their high-end Enterprise Flash Disks, produced by STEC, which can store up to 400GB and come with a controller with excellent SRAM caches and a super capacitor. The super capacitor enables the controller to empty the relatively large DRAM caches and write the data to the flash storage in the event of a sudden power failure.

Intel went for the midrange market and gave its controller less cache (16MB). The controller is still intelligent and powerful enough to crush the competition built around the cheap JMicron JMF602 controllers. We examine the SLC version, the Intel X25-E SLC 32GB.

The newest Intel solid state disks, with their 0.075 ms access times and 0.15W power consumption, could change the storage market for OLTP databases. However, the SLC drives have a few disadvantages compared to the best SAS drives out there:

  • No dual ports
  • The price per GB is roughly 13 times higher

You can see the summary in the table below.

Enterprise Drive Pricing
Drive                    | Interface | Capacity | Pricing   | Price per GB
Intel X25-E SLC          | SATA      | 32GB     | $415-$470 | $13
Intel X25-E SLC          | SATA      | 64GB     | $795-$900 | $12
Seagate Cheetah 15000RPM | SAS       | 300GB    | $270-$300 | $0.90
Western Digital 1000FYPS | SATA      | 1000GB   | $190-$200 | $0.19
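
As a quick sanity check on the last column, the small Python sketch below recomputes price per GB from the low end of each listed price range (which appears to be how the table's figures were rounded); the capacities and prices are taken straight from the table above.

    # Recompute the "Price per GB" column from the low end of each listed
    # price range; capacities and prices come straight from the table above.
    drives = [
        ("Intel X25-E SLC 32GB",           32,   415),
        ("Intel X25-E SLC 64GB",           64,   795),
        ("Seagate Cheetah 15000RPM 300GB", 300,  270),
        ("Western Digital 1000FYPS 1TB",   1000, 190),
    ]

    for name, capacity_gb, price_usd in drives:
        print(f"{name:32s} ${price_usd / capacity_gb:5.2f} per GB")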

If you really need capacity, SATA or even SAS drives are probably the best choice. On the other hand, if you need spindles to get more I/O operations per second, it will be interesting to see how a number of SAS or SATA drives compares to the SLC drives. The most striking advantages of the Intel X25-E SLC drive are its extremely low random access times, almost nonexistent power consumption at idle, low power consumption at full load, and high reliability.

Enterprise Drive Specifications
Drive                    | Read Access Time | Write Access Time | Idle Power | Full Power | MTBF (hours)
Intel X25-E SLC 32GB     | 0.075 ms         | 0.085 ms          | 0.06 W     | 2.4 W      | 2 million
Intel X25-E SLC 64GB     | 0.075 ms         | 0.085 ms          | 0.06 W     | 2.6 W      | 2 million
Seagate Cheetah 15000RPM | 5.5 ms (*)       | 6 ms              | 14.3 W     | 17 W       | 1.4 million
Western Digital 1000FYPS | 13 ms (**)       | n/a               | 4 W        | 7.4 W      | 1.2 million

(*) 5.5 ms = 3.5 ms average seek time + 2 ms average rotational latency
(**) 13 ms = 8.9 ms average seek time + 4.1 ms average rotational latency
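
The footnotes follow the standard rule of thumb for spinning disks: average access time is roughly the average seek time plus half a rotation. The rough Python sketch below applies that rule and turns it into a ballpark random IOPS figure per spindle, which illustrates why many SATA or SAS spindles are needed to approach a single X25-E; the IOPS estimate is a simplification that ignores transfer time and command queuing.

    # Average access time of a spinning disk: average seek time plus the time
    # for half a revolution (average rotational latency). A very rough upper
    # bound on random IOPS per spindle is then 1000 ms / access time.
    def access_time_ms(seek_ms: float, rpm: int) -> float:
        rotational_latency_ms = 60_000.0 / rpm / 2   # half a revolution, in ms
        return seek_ms + rotational_latency_ms

    for name, seek_ms, rpm in [
        ("Seagate Cheetah 15000RPM", 3.5, 15000),   # footnote (*)
        ("Western Digital 1000FYPS", 8.9, 7200),    # footnote (**)
    ]:
        t = access_time_ms(seek_ms, rpm)
        print(f"{name}: ~{t:.1f} ms access time, ~{1000 / t:.0f} random IOPS per spindle")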

Reliability testing is outside the scope of this document, but if only half of Intel's claims are true, the X25-E SLC drives will outlive the vast majority of magnetic disks. First, there is the 2 million hour MTBF specification, far better than the best SAS disks on the market (1.6 million hours). Intel also guarantees that if the X25-E performs 7,000 random 8KB accesses per second, consisting of 66% reads and 33% writes, the drive will keep doing so for five years. That is 2.9TB of written data per day, sustained for about 1800 days. That is simply breathtaking, as no drive has to sustain that kind of IOPS load 24 hours per day for such a long period.
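
As a back-of-envelope illustration (a minimal sketch, not part of Intel's specification), the Python snippet below converts 7,000 random 8KB operations per second into sustained throughput and total data moved per day; the written portion is some fraction of that total, depending on how the 66/33 mix and duty cycle are tallied.

    # Translate the endurance claim (7,000 random 8KB operations per second,
    # sustained around the clock for 5 years) into aggregate data volumes.
    # Decimal units are used throughout (1 KB = 1,000 bytes).
    iops = 7_000
    block_bytes = 8_000          # 8KB transfer size
    seconds_per_day = 86_400

    throughput_mb_s = iops * block_bytes / 1e6
    total_per_day_tb = iops * block_bytes * seconds_per_day / 1e12
    days = 5 * 365

    print(f"Sustained throughput:  ~{throughput_mb_s:.0f} MB/s")
    print(f"Data moved per day:    ~{total_per_day_tb:.1f} TB (reads + writes)")
    print(f"Duration of the claim: ~{days} days")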

Comments

  • Rasterman - Monday, March 23, 2009 - link

    Since the controller is the bottleneck for SSDs and you have very fast CPUs, did you try testing a full software RAID array, just leaving the controllers out of it altogether?
  • Snarks - Sunday, March 22, 2009 - link

    reading the comments made my brain asplode D:!

    Damn it, it's way too late for this!
  • pablo906 - Saturday, March 21, 2009 - link

    I've loved the stuff you put out for a long, long time. This is another piece of quality work, and I definitely appreciate the effort you put into it. I was thinking about how I was going to build the storage back end for a small/medium virtualization platform, and this is definitely swaying some of my previous ideas. It really seems like an EMC enclosure may be in our future instead of something built by me on a 24 port Areca card.

    I don't know what all the hubbub was about at the beginning of the article, but I can tell you that I got what I needed. I'd like to see some follow-ups on server storage and definitely more RAID 6 info. Any chance you can do some serious RAID card testing? That enclosure you have is perfect for it (I've built some pretty serious storage solutions out of those and 24 port Areca cards), and I'd really like to see different cards, different configurations, numbers of drives, array types, etc. tested.
  • rbarone69 - Friday, March 20, 2009 - link

    Great work on these benchmarks. I have found very few other sources that provided me with answers to my questions regarding exactly what you tested here (DETAILED ENOUGH FOR ME). This report will be referenced when we size some of our smaller (~40-50GB but heavily read) central databases we run within our enterprise.

    It saddens me to see people that simply will NEVER be happy, no matter what you publish for them at no cost. Fanatics have their place, but generally cost organizations much more than open-minded employees willing to work with what they have available.
  • JohanAnandtech - Saturday, March 21, 2009 - link

    Thanks for your post. A "thumbs up" post like yours is the fuel that Tijl and I need to keep going :-). Definitely appreciated!

  • classy - Friday, March 20, 2009 - link

    Nice work, and no question SSDs are truly great performers, but I don't see them being mainstream in the enterprise world for several more years. One, no one knows how reliable they are; they are not tried and tested. Two and three go hand in hand: capacity and cost. With the need for more and more storage, the cost of SSDs makes them somewhat of a one trick pony: a lot of speed, but cost prohibitive. Just at our company we are looking at a separate data domain just for storage. When you start talking about the need for several terabytes, SSDs just aren't going to be considered. They are the future, but until they drastically drop in cost and increase in capacity, their adoption will be minimal at best. I don't think speed trumps capacity in the enterprise world right now.
  • virtualgeek - Friday, March 27, 2009 - link

    They are well past being "untried" in the enterprise - and we are now shipping 400GB SLC drives.
  • gwolfman - Friday, March 20, 2009 - link

    [quote]Our Adaptec controller is clearly not taking full advantage of the SLC SSD's bandwidth: we only see a very small improvement going from four to eight disks. We assume that this is a SATA related issue, as eight SAS disks have no trouble reaching almost 1GB/s. This is the first sign of a RAID controller bottleneck.[/quote]
    I have an Adaptec 3805 (the previous generation to the one you used) that I used to test four of OCZ's first SSDs when they came out, and I noticed the same issue. I went through a lengthy support ticket cycle and got little help and no explanation. I was left thinking it was the firmware, as 2 SAS drives had higher throughput than the 4 SSDs.
  • supremelaw - Friday, March 20, 2009 - link

    For the sake of scientific inquiry primarily, but not exclusively, another experimental "permutation" I would also like to see is a comparison of:

    (1) 1 x8 hardware RAID controller in a PCI-E 2.0 x16 slot
    (2) 1 x8 hardware RAID controller in a PCI-E 1.0 x16 slot
    (3) 2 x4 hardware RAID controllers in a PCI-E 2.0 x16 slot
    (4) 2 x4 hardware RAID controllers in a PCI-E 1.0 x16 slot
    (5) 2 x4 hardware RAID controllers in a PCI-E 2.0 x4 slot
    (6) 2 x4 hardware RAID controllers in a PCI-E 1.0 x4 slot
    (7) 4 x1 hardware RAID controllers in a PCI-E 2.0 x1 slot
    (8) 4 x1 hardware RAID controllers in a PCI-E 1.0 x1 slot


    * If x1 hardware RAID controllers are not available, then substitute x1 software RAID controllers instead, to complete the experimental matrix.


    If the controllers are confirmed to be the bottlenecks for certain benchmarks, the presence of multiple I/O processors -- all other things being more or less equal -- should tell us that IOPs generally need more horsepower, particularly when solid-state storage is being tested.

    Another limitation to face is that x1 PCI-E RAID controllers may not work in multiples installed in the same motherboard; e.g. see Highpoint's product here:

    http://www.newegg.com/Product/Product.aspx?Item=N8...


    Now, add different motherboards to the experimental matrix above, because different chipsets are known to allocate fewer PCI-E lanes than the slots mechanically provide, e.g. only x4 lanes actually assigned to an x16 PCI-E slot.


    MRFS


  • supremelaw - Friday, March 20, 2009 - link

    More complete experimental matrix (see shorter matrix above):

    (1) 1 x8 hardware RAID controller in a PCI-E 2.0 x16 slot
    (2) 1 x8 hardware RAID controller in a PCI-E 1.0 x16 slot
    (3) 2 x4 hardware RAID controllers in a PCI-E 2.0 x16 slot
    (4) 2 x4 hardware RAID controllers in a PCI-E 1.0 x16 slot
    (5) 1 x8 hardware RAID controller in a PCI-E 2.0 x8 slot
    (6) 1 x8 hardware RAID controller in a PCI-E 1.0 x8 slot
    (7) 2 x4 hardware RAID controllers in a PCI-E 2.0 x8 slot
    (8) 2 x4 hardware RAID controllers in a PCI-E 1.0 x8 slot
    (9) 2 x4 hardware RAID controllers in a PCI-E 2.0 x4 slot
    (10) 2 x4 hardware RAID controllers in a PCI-E 1.0 x4 slot
    (11) 4 x1 hardware RAID controllers in a PCI-E 2.0 x1 slot
    (12) 4 x1 hardware RAID controllers in a PCI-E 1.0 x1 slot


    MRFS
