The Real Issue

While I was covering MWC a real issue with OCZ's SSDs erupted back home: OCZ aggressively moved to high density 25nm IMFT NAND and as a result was shipping product under the Vertex 2 name that was significantly slower than it used to be. Storage Review did a great job jumping on the issue right away.

Let's look at what caused the issue first.

When IMFT announced the move to 25nm it mentioned a doubling in NAND capacity per die. At 25nm you could now fit 64Gbit of MLC NAND (8GB) on a single die, twice what you could get at 34nm. With twice the density in the same die area, costs could come down considerably.


An IMFT 25nm 64Gbit (8GB) MLC NAND die

Remember NAND manufacturing is no different than microprocessor manufacturing. Cost savings aren't realized on day one because yields are usually higher on the older process. Newer wafers are usually more expensive as well. So although you get ~2x density improvement going to 25nm, your yields are lower and wafers are more expensive than they were at 34nm. Even Intel was only able to get a maximum of $110 decrease in price when going from the X25-M G2 to the SSD 320.

OCZ was eager to shift to 25nm. Last year SandForce was the first company to demonstrate 25nm Intel NAND on an SSD at IDF, so the controller support was clearly there. As soon as it had the opportunity, OCZ began migrating the Vertex 2 to 25nm NAND.

SSDs are a lot like GPUs: they are very wide, parallel beasts. While a GPU has a huge array of parallel cores, SSDs are made up of arrays of NAND die working in parallel. Most controllers have 8 channels they can use to talk to NAND devices in parallel, and each channel can often have multiple NAND die active at once.


A Corsair Force F120 using 34nm IMFT NAND

Double the NAND density per die and you can guess what happened next: performance went down considerably at certain capacity points. The most impacted were the smaller capacity drives, e.g. the 60GB Vertex 2. Remember the SF-1200 is only an 8-channel controller, so it technically only needs eight NAND devices to be fully populated. However, multiple die within a single NAND device can be active concurrently, and in the first 25nm 60GB Vertex 2s there was only one die per NAND package, which left the controller with less parallelism to exploit. The end result was significantly reduced performance in some cases, yet OCZ failed to change the speed ratings on the drives themselves.
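To put rough numbers on the die-count problem, here is a back-of-the-envelope sketch in Python. It assumes a 60GB SandForce drive carries 64GB of raw NAND and that a 34nm die holds 32Gbit (half the 25nm figure quoted earlier). The Layout class and nand_layout function are names made up for this illustration, not anything from OCZ or SandForce.

```python
from dataclasses import dataclass

CHANNELS = 8  # SF-1200 channel count, as stated in the article


@dataclass
class Layout:
    die_count: int
    dies_per_channel: float


def nand_layout(raw_capacity_gb: int, die_gbit: int, channels: int = CHANNELS) -> Layout:
    """Return how many NAND die a drive needs and how they spread over the channels."""
    die_gb = die_gbit // 8                   # convert Gbit per die to GB per die
    die_count = raw_capacity_gb // die_gb    # die needed to reach the raw capacity
    return Layout(die_count, die_count / channels)


# Assumption: 64GB of raw NAND behind a 60GB drive.
old = nand_layout(64, 32)  # 34nm, assumed 32Gbit (4GB) per die
new = nand_layout(64, 64)  # 25nm, 64Gbit (8GB) per die

print("34nm:", old)
print("25nm:", new)
```

Under those assumptions the 34nm drive has two die per channel to keep busy, while the 25nm drive has only one, which is consistent with the performance drop described above.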

The matter is complicated by the way SandForce's NAND redundancy works. The SF-1000 series controllers have a feature called RAISE that allows your drive to keep working even if a single NAND die fails. The controller accomplishes this redundancy by writing parity data across all NAND devices in the SSD. Should one die fail, the lost data is reconstructed from the remaining data + parity and mapped to a new location in NAND. As a result, total drive capacity is reduced by the size of a single NAND die. With twice the density per NAND die in these early 25nm drives, usable capacity was also reduced when OCZ made the switch with Vertex 2.
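SandForce does not publish how RAISE works internally, so the following is only a sketch of the general idea using plain single-parity XOR as a stand-in, not SandForce's actual algorithm. The function names and the tiny two-byte "die" contents are hypothetical. It also shows why the reserved space doubles when the die doubles, assuming a 32Gbit die at 34nm.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, failed_index):
    """Rebuild the block at failed_index from the surviving blocks (data + parity)."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


def reserved_gb(die_gbit):
    """Capacity given up to redundancy: the size of one die, in GB."""
    return die_gbit // 8


# Three hypothetical data blocks, each standing in for one die's contents.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(data)
stripe = data + [parity]

# Pretend the second die failed and rebuild its contents from the rest.
recovered = reconstruct(stripe, 1)
print(recovered == data[1])

# Assumed 32Gbit die at 34nm versus the 64Gbit die at 25nm.
print(reserved_gb(32), reserved_gb(64))
```

The cost of this kind of scheme is one die's worth of space for parity, so moving from an assumed 4GB die to an 8GB die doubles what the user gives up.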

The end result was that you could buy a 60GB Vertex 2 with lower performance and less available space without even knowing it.


A 120GB Vertex 2 using 25nm Micron NAND

After a dose of public retribution OCZ agreed to allow end users to swap 25nm Vertex 2s for 34nm drives; they would simply have to pay the difference in cost. OCZ realized that was yet another mistake and eventually allowed the swap for free (thankfully no one was ever charged), which is what should have been done from the start. OCZ went one step further and stopped using 64Gbit NAND in the 60GB Vertex 2, although drives still exist in the channel since no recall was issued.

OCZ ultimately took care of those users who were left with a drive that was slower (and had less capacity) than they thought they were getting. But the problem was far from over.

153 Comments

  • Xcellere - Wednesday, April 6, 2011

    It's too bad the lower capacity drives aren't performing as well as the 240 GB version. I don't have a need for a single high capacity drive so the expenditure in added space is unnecessary for me. Oh well, that's what you get for wanting bleeding-edge tech all the time.
  • Kepe - Wednesday, April 6, 2011

    If I've understood correctly, they're using 1/2 of the NAND devices to cut drive capacity from 240 GB to 120 GB.
    My question is: why don't they use the same number of NAND devices with 1/2 the capacity instead? Again, if I have understood correctly, that way the performance would be identical to the higher capacity model.
    Is NAND produced in packages of only one capacity, or is there some other reason not to use NAND devices of differing capacities?
  • dagamer34 - Wednesday, April 6, 2011

    Because price scaling makes it more cost-effective to use fewer, more dense chips than separate smaller, less dense chips: the more chips are made, the cheaper they eventually become.

    Like Anand said, this is why you can't just ask for a 90nm CPU today; it's just too old and not worth making anymore. This is also why older memory gets more expensive when it's not mass produced anymore.
  • Kepe - Wednesday, April 6, 2011

    But couldn't they just make smaller dies? Just like there are different sized CPU/GPU dies for different amounts of performance. Cut the die size in half, fit 2x the dies per wafer, sell for 50% less per die than the large dies (i.e. get the same amount of money per wafer).
  • A5 - Wednesday, April 6, 2011

    No reason for IMFT to make smaller dies - they sell all of the large dies coming out of the fab (whether to themselves or 3rd parties), so why bother making a smaller one?
  • vol7ron - Wednesday, April 6, 2011

    You're missing the point on economies of scale.

    Having one size means you don't have leftover parts, or have to pay for a completely different process (which includes quality control).

    These things are already expensive; adding the logistical complexity would only drive the prices up, especially since there are noticeable differences in the manufacturing process.

    I guess they could take the poorer performing silicon and re-market it, like how Anand mentioned that they take poorer performing GPUs and just sell them at a lower clockrate/memory capacity, but it could be that NAND production is more refined and doesn't have that large of a difference.

    Regardless, I think you mentioned the big point: inner RAIDs improve performance. Why 8 chips, why not more? Perhaps heat has something to do with it, and (of course) power would be the other reason, but it would be nice to see higher performing, more power-hungry SSDs. There may also be a performance benefit in larger chips too, though, sort of like DRAM where 1x2GB may perform better than 2x1GB (not interlaced).

    I'm still waiting for the manufacturers to get fancy, perhaps with multiple controllers and speedier DRAM. Where's the Vertex3 Colossus?
  • marraco - Tuesday, April 12, 2011

    Smaller dies would improve yields, and since they could enable full speed, it would be more competitive.

    A flaw may invalidate an entire large die, but if the die were divided into two smaller ones, part of it could be recovered.

    On the other hand, yields are probably not as big a problem, since the controller can replace bad sectors with good ones.
  • Kepe - Wednesday, April 6, 2011

    Anand, I'd like to thank you on behalf of pretty much every single person on the planet. You're doing an amazing job with making companies actually care about their customers and do what is right.
    Thank you so much, and keep up the amazing work.

    - Kepe
  • dustofnations - Wednesday, April 6, 2011

    Thank God for a consumer advocate with enough clout for someone important to listen to them.

    All too often valid and important complaints fall at the first hurdle due to dumb PR/CS people who filter out useful information. Maybe this is because they assume their customers are idiots, or that it is too much hassle, or perhaps don't have the requisite technical knowledge to act sensibly upon complex complaints.
  • Kepe - Wednesday, April 6, 2011

    I'd say the reason is usually that when a company has sold you its product, they suddenly lose all interest in you until they come up with a new product to sell. Apple used to be a very good example with its battery policy. "So, your battery died? We don't sell new or replace dead batteries, but you can always buy the new, better iPod."
    It's this kind of ignorance towards the consumers that is absolutely appalling, and Anand is doing a great job at fighting for the consumer's rights. He should get some sort of an award for all he has done.
