Conclusion

When selecting a memory kit for a new system, the market is littered with choices ranging in speed, heatsink design, RGB or no RGB, and capacity. In terms of DDR5 memory, the only platform that can make use of it at the moment is Intel's 12th Gen Core series, with its premier offerings coming in conjunction with the Z690 chipset. This is likely to change later this year if AMD's Zen 4 architecture launches, but right now, the DDR5 market and the Alder Lake market are one and the same.

For today's article, we focused on the performance differences (or lack thereof) of DDR5 in different rank and DIMM-per-channel configurations. While these elements are smaller factors in DDR5 performance than frequency and timings, as we have found, they do have a meaningful impact on memory subsystem performance – and thus an impact on overall system performance.

Samsung DDR5-4800B: 1Rx8 (1DPC/2DPC) versus 2Rx8 (1DPC)

In testing Samsung's 2 x 32 GB (2Rx8) kit directly against a 4 x 16 GB (1Rx8) kit, we got some interesting results running at JEDEC speeds with Intel's Core i9-12900K processor. 

WinRAR 5.90 Test, 3477 files, 1.96 GB

Looking at situations where the differences were evident in our benchmark results, in our WinRAR 5.90 test, which is very sensitive to memory performance and throughput, the Samsung DDR5-4800B 4 x 16 GB kit performed around 9% worse than its higher density 2 x 32 GB counterpart, which is quite a drop, and an indication that 1DPC yields better performance in memory-sensitive applications than 2DPC. And even in a 1DPC configuration, the 2 x 16 GB kit, with its single rank of memory, operates at a deficit versus the dual-rank kits. Meanwhile, the Samsung DDR5-4800B 2 x 32 GB configuration performed within a solid margin of error against the SK Hynix and Micron kits.

Grand Theft Auto V - 4K Low - Average FPS

It was much the same in some of our game testing, with the Samsung 4 x 16 GB kit being outperformed by the 2 x 32 GB kits, and even by the Samsung 2 x 16 GB kit, which uses the same single-rank UDIMMs as the 4 x 16 GB combination. While the performance hit was only around 2-3% in Grand Theft Auto V at 4K low settings, from a performance point of view, Intel's Alder Lake seems to fare better with two sticks of memory than with four.

Throughout most of our testing, it was clear that having two higher density 2Rx8 sticks in a 1DPC configuration is better for overall performance than the same capacity spread across four sticks (1Rx8 at 2DPC). And even looking at just 1DPC configurations, going dual-rank is still better, though to a smaller degree.

Going under the hood for an explanation of these results, the main reason that 2Rx8 is better than 1Rx8 comes down to the fact that the integrated memory controller can only access one rank at a time. So in a dual-rank DIMM, rank interleaving can be employed, which allows the second rank of memory chips to be ready for immediate access. While the differences are minimal even on a theoretical basis, as we have seen they are not zero: rank interleaving helps hide refresh cycles and shortens response times, which can mean more performance in latency-sensitive applications, or when an application is able to push DDR5 to its overall bandwidth limits.
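To make the mechanism a bit more concrete, here is a minimal, purely illustrative Python sketch of low-order rank interleaving. The constants and address map below are assumptions chosen for the example, not a description of Alder Lake's actual memory controller:

```python
# Illustrative sketch of rank interleaving (assumed mapping, not Alder Lake's).
# With two ranks, consecutive 64-byte bursts can alternate between ranks, so
# the controller can address one rank while the other is busy, e.g. refreshing
# or recovering from a previous access.

BURST_BYTES = 64              # one DDR5 burst: BL16 on a 32-bit subchannel
NUM_RANKS = 2                 # a 2Rx8 DIMM
DIMM_BYTES = 32 * 1024**3     # 32 GB module

def rank_of(addr: int, interleaved: bool) -> int:
    """Return the rank a physical address lands on."""
    if interleaved:
        # Low-order interleave: adjacent bursts alternate between ranks.
        return (addr // BURST_BYTES) % NUM_RANKS
    # Without interleaving, each rank holds one contiguous half of the module.
    return 0 if addr < DIMM_BYTES // NUM_RANKS else 1

# A streaming access pattern touches both ranks when interleaved, which is
# what lets the controller overlap activity across them.
for addr in range(0, 4 * BURST_BYTES, BURST_BYTES):
    print(f"0x{addr:04x} -> rank {rank_of(addr, interleaved=True)}")
```

With a single-rank DIMM there is no second rank to fall back on, which is broadly where the small deficit we measured for 1Rx8 comes from.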

Samsung vs SK Hynix vs Micron 32GB DDR5-4800B

Looking at the performance of the 2 x 32 GB kits running at DDR5-4800B from Samsung, SK Hynix, and Micron, the difference was for all practical purposes non-existent. We did not find any meaningful performance difference in our testing, which means that performance isn't a differentiating factor between the three memory manufacturers – at least at JEDEC settings with Alder Lake. Which, given the identical timings and capacities, is not unexpected. This is essentially the null hypothesis of our testing, showcasing that at least from a performance standpoint at fully qualified clockspeeds, there's no innate performance difference from one DRAM manufacturer to another.

Consequently, it's pretty easy here to recommend that if users are planning to build an Intel 12th Gen Core series setup with JEDEC-rated DDR5 memory, they should opt for the cheapest option among proven DIMM vendors. For desktop purposes, the DIMMs are functionally equal, and right now DDR5 memory itself is still rather expensive. There's much more stock available than there was last year, but it's still a relatively new platform, and that also adds to the cost.

Final Thoughts: 64 GB of 2Rx8 with 1DPC is Better Than 64 GB of 1Rx8 with 2DPC

One of the biggest things to note from this article is that there isn't really any difference in performance between Samsung, SK Hynix, or Micron-based 2 x 32 GB DDR5-4800B memory kits. Despite using different memory ICs from each of the vendors, all of these kits performed equivalently, and all of them demonstrated that 2Rx8 DDR5 memory performs better than 1Rx8 DDR5.

The only aspect we didn't test was overclocking headroom with the JEDEC-rated kits, which wasn't really an angle we wanted to base an article around. Given the lottery-like results of overclocking any given DIMM, we'd be testing our own luck more than we'd be testing the hardware. In these cases a large sample size is required to get useful data, and that's where the dedicated memory vendors come in with their binning processes.

Taking a more meta overview of the state of the DDR5 market, we already know from vendors such as ADATA, G.Skill, and TeamGroup that Samsung and SK Hynix's current generation parts show greater frequency and latency headroom when running above DDR5's nominal voltage of 1.1 V, which is why DDR5-6000 (and beyond) kits aren't using Micron chips. Though this may change in the future, as all three companies are pushing their manufacturing processes forward, including with EUV lithography.


2 x 32 GB kits of DDR5-4800B memory outperform 4 x 16 GB kits at the same frequency/latencies

As for the matter of today's tests, our results are very clear: dual-rank memory is the way to go, as is sticking to a single DIMM per channel when possible.

The most significant performance differences in our testing are found comparing two of Samsung's 1Rx8 DDR5-4800B memory sticks in a 1DPC configuration against four of the same sticks in a 2DPC configuration. There we find that the 1DPC configuration is consistently equal or better in every scenario. Using four sticks means data has to travel further along the memory traces, which, combined with the overhead of communicating with two DIMMs per channel, results in both a drop in memory performance and a slight increase in latency.
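For a rough sense of scale, here's a back-of-the-envelope sketch comparing a few centimeters of extra trace length against DDR5-4800's unit interval. The extra trace length and signal velocity are illustrative assumptions rather than measurements of any particular board:

```python
# Back-of-the-envelope: how much timing margin does extra trace length consume?
# All values below are assumptions for illustration only.
C = 299_792_458               # speed of light, m/s
V_TRACE = 0.5 * C             # rough signal velocity in a PCB trace (~half c)
DATA_RATE = 4.8e9             # DDR5-4800 transfers 4800 MT/s
UI = 1 / DATA_RATE            # unit interval per transfer (~208 ps)

EXTRA_TRACE_M = 0.03          # assume ~3 cm of extra routing to a second slot
extra_delay = EXTRA_TRACE_M / V_TRACE

print(f"Unit interval:     {UI * 1e12:.0f} ps")
print(f"Extra flight time: {extra_delay * 1e12:.0f} ps "
      f"(~{extra_delay / UI:.0%} of a unit interval)")
```

Even if the real figures differ, the scale is the point: at DDR5 transfer rates, a few centimeters of routing eats a meaningful fraction of a bit period, which is part of why a 2DPC layout gives up a little performance even at the same rated speed and timings.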

And while the differences between 1Rx8 and 2Rx8 are not as large, we find that there is still a difference, and it's in favor of the dual rank memory. Thanks to rank interleaving, single rank memory finds itself at a slight disadvantage versus dual rank memory, at least on today's Alder Lake systems.

Based on this data, we recommend that users looking for 64 GB of DDR5 memory opt for 2 x 32 GB rather than a 4 x 16 GB configuration. Besides providing the best performance, the 2 x 32 GB route also leaves room for users to add capacity as needed down the line. Plus, for anyone who wants to overclock, driving four sticks of memory is notoriously stressful for the processor's IMC – and DDR5 only makes this worse.

Otherwise, choosing between 2 x 32 GB DDR5-4800B kits from Micron, SK Hynix, and Samsung primarily comes down to availability and price. DRAM is a true commodity product, in every sense of the word, so for these JEDEC-standard kits, there's not much to compete on except pricing.

Comments

  • repoman27 - Thursday, April 14, 2022 - link

    But what if you left Chrome running with more than say 4 tabs open while you're gaming?

    No, I totally get what you're saying, and I'm fine with the gaming focus in general. But I'm sure there are plenty of regular visitors to this site that are more likely to be running a bunch of VMs or some other workload that might be memory bound in ways that differ from gaming scenarios.
  • RSAUser - Tuesday, April 19, 2022 - link

    In a case where you care about this, you're probably a power user, and at that point it would make sense to also test 64GB/memory exhaustion, as people are not carrying old sticks over with DDR5; they'd buy as much as they need outright.

    I can't run my work stack on 32GB RAM, and at home I often enough hit 32GB if I work on a hobby project as I like running my entire stack at once.
  • Jp7188 - Wednesday, April 13, 2022 - link

    4x16 (64GB) performed worse in every test vs. 32GB. That's reasonable assurance mem exhaustion wasn't much of a factor.
  • Dolda2000 - Thursday, April 7, 2022 - link

    I have to admit I don't quite understand the results. I'd expect the disadvantage of 2DPC to be that they may not be able to sustain the same frequencies as 1DPC, but clearly that's not the case here since all kits are in fact running at the same frequency. That being the case, I would expect 1R, 2DPC memory to behave functionally identically to 2R, 1DPC memory, since, at least in my understanding, that's basically the same thing as far as the memory controller is concerned.

    What would account for the differences? Were the secondary and/or tertiary timings controlled for?
  • MrCommunistGen - Thursday, April 7, 2022 - link

    I've seen passing comments that running 2DPC really messes with signal integrity on current setups but didn't read into it any further. Since DDR5 has SOME built in error handling, even on non-ECC chips, it could be that signal losses are causing transmission retries which slow things down.

    Assuming that signal integrity is the issue, I'm wondering if rev2 or next gen DDR5 motherboards will try to improve the DDR5 memory traces to combat this or if it's something that needs to happen on the memory controller side.

    Also, even though the clockspeeds and primary timings are listed as being the same, the motherboard may be automatically adjusting some of the tertiary timings behind the scenes when using 2DPC, which can also have a measurable impact.
  • Dolda2000 - Thursday, April 7, 2022 - link

    >Since DDR5 has SOME built in error handling, even on non-ECC chips, it could be that signal losses are causing transmission retries which slow things down.
    I had that thought as well, but as far as I understand, DDR5's builtin error-handling is limited entirely to what happens on the die. I don't think there are any error-handling mechanisms on the wire that would allow the memory system to detect errors in transfer and retransmit.
  • thomasg - Thursday, April 7, 2022 - link

    As far as I know, there are no error correction techniques (such as forward error correction) used for the transmission paths of DDR ram, apart from ECC, thus there are no automatic retransmissions.

    The reason why frequencies or timings will suffer for multiple DIMMs per channel may be as simple as signal runtime.

    Electrical signals theoretically travel at the speed of light, but high frequency signals exhibit significant propagation delay, depending on trace design and PCB material. About half the speed of light (~150,000 km/s) is a fair assumption for typical PCB traces with DIMM sockets.

    With DDR5-4800, we're talking about clock cycles of 2400 MHz, which translates to 1 cycle per 400 femtoseconds.
    In 400 femtoseconds, the electrical high-frequency signal can travel 6 centimeters.
    Thus, with 3 centimeters longer traces between DIMM_A and DIMM_B their signals would be 180° out of phase.
    Since we're talking DDR, the rising and falling edge of the clock is used to transmit data, which means the signal timings need to be a lot tighter than 180°, likely below 90°, which limits the difference to 1.5 cm.

    It's not hard to imagine that this is a significant constraint to PCB layout.
    Traces can be length matched, but with wide parallel channels (64/72 traces), this is very tricky and cannot be done exactly, as it would be for narrower channels (i.e. 4 or 8 traces).

    As you might have noticed, I'm a radio guy and don't have the slightest clue about DDR memory, so take this with a grain of salt.
  • repoman27 - Friday, April 8, 2022 - link

    Just to add a few grains of salt...

    DDR5 actually does support cyclical redundancy check (CRC) for read and write operations.

    Depending on the material used for the PCB, the signal speed for microstrips might be slightly better than 1/2 c, maybe closer to 1/1.7 c or 58.5% of the speed of light.

    And according to my calculator at least, 1 ÷ 2,400,000,000 = 0.000000000416667 = 416.667 picoseconds for the unit interval.

    And not to downplay the issues you point out in designing DDR5 memory systems, but Alder Lake also supports PCI Express Gen5, which involves routing 64 traces operating at 16.0 GHz for an x16 slot. Serial point-to-point using differential signaling, so not the same thing, but still bonkers nonetheless.
  • Jp7188 - Wednesday, April 13, 2022 - link

    Correct me if I'm wrong, but crc without fec = chance of retransmission = increased latency?
  • repoman27 - Thursday, April 14, 2022 - link

    Yes, but if your BER is even close to reasonable, the additional latency from retries should be negligible. And it's not like ECC or FEC are exactly free. You want to do whatever you can to keep the error rate within acceptable tolerances before resorting to the additional overhead / complexity of error correction.
