Testing PCIe 4.0

It's been over a year since the first consumer CPUs and SSDs supporting PCIe 4.0 hit the market, so we're a bit overdue for a testbed upgrade. Our Skylake system was adequate even for the fastest PCIe gen3 drives, but it has finally become a serious bottleneck.

We have years of archived results from the old testbed, which are still relevant to the vast majority of SSDs and computers out there that do not yet support PCIe gen4. We're not ready to throw out all that work quite yet; we will still be adding new test results measured on the old system until PCIe gen4 support is more widespread, or my office gets too crowded with computers—whichever happens first. (Side note: some rackmount cases for all these test systems would be greatly appreciated.)

AnandTech 2017-2020 Skylake Consumer SSD Testbed
CPU: Intel Xeon E3 1240 v5
Motherboard: ASRock Fatal1ty E3V5 Performance Gaming/OC
Chipset: Intel C232
Memory: 4x 8GB G.SKILL Ripjaws DDR4-2400 CL15
Software: Windows 10 x64, version 1709; Linux kernel version 4.14, fio version 3.6; Spectre/Meltdown microcode and OS patches current as of May 2018

Since introducing the Skylake SSD testbed in 2017, we have made a few changes to our testing configurations and procedures. In December 2017, we started using a Quarch XLC programmable power module (PPM), providing far more detailed and accurate power measurements than our old multimeter setup. In May 2019, we upgraded to a Quarch HD PPM, which can automatically compensate for voltage drop along the power cable to the drive. This allowed us to measure M.2 PCIe SSD power more directly: these drives can pull well over 2A from the 3.3V supply, which can easily lead to more than the 5% supply voltage drop that drives are supposed to tolerate. At the same time, we introduced a new set of idle power measurements conducted on a newer Coffee Lake system. This is our first (and for the moment, only) SSD testbed capable of using the full range of PCIe power management features without crashes or other bugs. It allowed us to start reporting idle power levels for typical desktop and best-case laptop configurations.
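To put that cable-drop concern in concrete numbers, here is a minimal sketch of the Ohm's law arithmetic involved; the cable resistance is an assumed round figure for illustration, not a measured value from our setup:

```python
# Illustrative numbers only: the cable resistance below is an assumed
# round figure, not a measured value from our Quarch setup.
SUPPLY_V = 3.3      # nominal M.2 3.3V rail
TOLERANCE = 0.05    # drives are expected to tolerate a 5% supply droop
CABLE_R = 0.08      # assumed round-trip cable/connector resistance, ohms

for current_a in (0.5, 1.0, 2.0, 2.5):
    drop_v = current_a * CABLE_R            # Ohm's law: V = I * R
    pct = drop_v / SUPPLY_V * 100
    status = "within spec" if pct <= TOLERANCE * 100 else "OUT OF SPEC"
    print(f"{current_a:.1f} A draw: {drop_v:.3f} V drop "
          f"({pct:.1f}% of 3.3 V) -> {status}")
```

With these assumed figures, a 2A draw already eats almost the entire 5% budget before the drive's own transients are counted, which is why the HD PPM's automatic compensation matters for M.2 testing.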

Coffee Lake SSD Testbed for Idle Power
CPU: Intel Core i7-8700K
Motherboard: Gigabyte Aorus H370 Gaming 3 WiFi
Memory: 2x 8GB Kingston DDR4-2666

On the software side, the disclosure of the Meltdown and Spectre CPU vulnerabilities at the beginning of 2018 led to numerous mitigations that affected overall system performance. The most severe effects were to system call overhead, which has a measurable impact on high-IOPS synthetic benchmarks. In May 2018, after the dust started to settle from the first round of vulnerability disclosures, we updated the firmware, microcode and operating systems on our testbed and took the opportunity to slightly tweak some of our synthetic benchmarks. Our pre-Spectre results are archived in the SSD 2017 section of our Bench database, while the current post-Spectre results are in the SSD 2018 section. Of course, since May 2018 many further related CPU security vulnerabilities have been found, and the mitigation techniques have changed many times. Our SSD testing has not tracked those software and microcode updates, to avoid invalidating previous scores yet again. However, our new gen4-capable Ryzen test system is fully up to date with the latest firmware, microcode and OS versions.
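On Linux, the kernel reports exactly which mitigations are active, which makes this easy to audit when setting up a testbed. A short sketch like this, reading the standard sysfs interface, shows the status of each known vulnerability:

```python
# Minimal sketch: print the kernel's reported status for each CPU
# vulnerability, using the standard Linux sysfs interface.
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")

for entry in sorted(VULN_DIR.iterdir()):
    print(f"{entry.name:28s} {entry.read_text().strip()}")
```

On a fully patched system, each line should report an active mitigation rather than "Vulnerable".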

AnandTech Ryzen PCIe 4.0 Consumer SSD Testbed
CPU: AMD Ryzen 5 3600X
Motherboard: ASRock B550 Pro
Memory: 2x 16GB Mushkin DDR4-3600
Software: Linux kernel version 5.8, fio version 3.23

Our new PCIe 4.0 test system uses an AMD Ryzen 5 3600X processor and an ASRock B550 motherboard. This provides PCIe 4.0 lanes from the CPU but not from the chipset. Whenever possible, we test NVMe SSDs with CPU-provided PCIe lanes rather than going through the chipset, so the lack of PCIe gen4 from the chipset isn't an issue. (We had a similar situation back when we were using a Haswell system that supported gen3 on the CPU lanes but only gen2 on the chipset.) Going with B550 instead of X570 also avoids the potential noise of a chipset fan. The DDR4-3600 is a big jump compared to our previous testbed, but it is a fairly typical speed for current desktop builds and a reasonable overclock. We're using the stock Wraith Spire cooler; our current SSD tests are mostly single-threaded, so there's no need for a bigger heatsink.

For now, we are still using the same test scripts to generate the same workloads as on our older Skylake testbed. We haven't tried to control for all possible factors that could lead to different scores between the two testbeds. For this review, we have re-tested several drives on the new testbed to illustrate the scale of these effects. In future reviews, we will be rolling out new synthetic benchmarks that will not be directly comparable to the tests in this review and past reviews. Several of our older benchmarks do a poor job of capturing the behavior of the increasingly common QLC SSDs, but that's not important for today's review. The performance differences between new and old testbeds should be minor, except where the CPU speed is a bottleneck. This mostly happens when testing random IO at high queue depths.

More important for today is the fact that our old benchmarks only test queue depths up to 32 (the limit for SATA drives), and that's not always enough to extract the full theoretical performance of a high-end NVMe drive, especially since our old tests only use one CPU core to stress the SSD. We'll be introducing a few new tests to better show these theoretical limits, but unfortunately the changes required to measure those advertised speeds also make the tests much less representative of real desktop workloads, so we'll continue to emphasize the more relevant low queue depth performance.
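As a rough illustration of the gap between the two approaches, here is a hypothetical sketch of the kind of fio invocations involved; the job parameters are illustrative, not copied from our actual test scripts:

```python
# Hypothetical sketch: contrast a SATA-style QD32 single-worker random
# read test with a multi-worker test that can approach a high-end NVMe
# drive's advertised IOPS. Reads the raw block device, so it needs root.
import subprocess

def fio_randread(iodepth, numjobs, seconds=30, dev="/dev/nvme0n1"):
    """Run 4kB random reads at the given queue depth and worker count."""
    cmd = [
        "fio", "--name=randread", f"--filename={dev}",
        "--direct=1", "--rw=randread", "--bs=4k",
        "--ioengine=libaio", f"--iodepth={iodepth}",
        f"--numjobs={numjobs}", "--group_reporting",
        "--time_based", f"--runtime={seconds}",
    ]
    return subprocess.run(cmd, capture_output=True, text=True)

# Old-style test: one worker at QD32, the SATA ceiling.
legacy = fio_randread(iodepth=32, numjobs=1)
# Saturation test: four workers at QD32 each (QD128 in aggregate),
# spreading submission overhead across cores instead of one.
saturation = fio_randread(iodepth=32, numjobs=4)
```

The multi-worker variant is what it takes to reach advertised numbers, but spreading I/O submission across several cores at extreme queue depths is also exactly what makes it unrepresentative of desktop use.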

Comments

  • Luckz - Thursday, September 24, 2020 - link

    At reasonable things like 4K random IOPS, the 1TB P31 seems to crush the 2TB Evo Plus.
  • Notmyusualid - Tuesday, October 6, 2020 - link

    @ Hifi.. - yes totally agree on the latency.

    That is why TODAY I just received my 1TB 970 Pro for my laptop. Even choosing it over the 980's... it was the Avg write latency table that sealed the deal for me. (See ATSB Heavy / Write)

    My Toshiba X5GP 2TB (supposedly enterprise class ssd) is not able to keep up with the huge writes my system endures most days. My write performance drops by about 10x, and when I replay my data, there are clear drop-outs.

    The loss of capacity will be a pain, but I'll push old data to the 2TB, as reads on that disk are normal, and if I need to work on a data set, I'll just have to pull it across to the 970 Pro again.

    My 2c.
  • romrunning - Tuesday, September 22, 2020 - link

    What this review has done for me is to whet my appetite for an Optane drive. I'm looking forward to seeing how the new AlderStream Optane drives perform!
  • viktorp - Wednesday, September 23, 2020 - link

    Right here with you. Goodbye Samsung, nice knowing you.
    Will advise all my clients to stay away from Samsung for mission critical storage.
    Wish we had a choice of selecting SLC, MLC, TLC, trading capacity for reliability, if desired.
  • _Rain - Wednesday, September 23, 2020 - link

    For the sake of your clients, please advise them to use enterprise drives for mission critical storage.
    Those Qvos, Evos and Pros are just client storage drives and not meant for anything critical.
    And of course you can limit the drive's capacity to a lesser value in order to gain some spare endurance. For example, a quota of 384GB on a 512GB drive will definitely double your endurance.
  • FunBunny2 - Wednesday, September 23, 2020 - link

    "please advice them to use enterprise drives for mission critical storage."

    does anyone, besides me of course, remember when STEC made 'the best enterprise' SSD? anybody even know about STEC? or any of the other niche 'enterprise' SSD vendors?
  • XabanakFanatik - Tuesday, September 22, 2020 - link

    It's almost like my comment on the previous article, about the AnandTech Bench showing the 970 Pro is still faster due to the move to TLC, was accurate.

    On the random tests, when the 980 beats the 970 Pro, it's by the smallest margin.

    Samsung has really let down the professionals like myself who bought Pro-series drives exclusively.

    Not to mention it arrives over 2 years after the 970 Pro and is only marginally faster sometimes, outside raw burst sequential read/write.
  • Jorgp2 - Tuesday, September 22, 2020 - link

    Don't all GPUs already decompress textures?

    And the consoles only have hardware decompression to get the most out of their CPUs, same for their audio hardware, video decoders, and hardware scalers.

    There's plenty of efficient software compression techniques; Windows 10 even added new ones that can be applied at the filesystem level. They have good compression, and very little overhead to decompress in real time.
    The only downside is that it's a Windows 10 feature, which means it's half-baked: setting the compression flag is ignored by Windows, so you have to compress manually every time.
  • Ryan Smith - Tuesday, September 22, 2020 - link

    "Don't all GPUs already decompress textures?"

    Traditional lossy texture compression is closer to throwing data out at a fixed ratio than it is compression in the lossless sense. Compressed textures don't get decompressed so much as texture units interpolate the missing data on the fly.

    This is opposed to lossless compression, which is closer to ZIP file compression. No data is lost, but it has to be explicitly unpacked/decompressed before it can be used. Certain lossless algorithms work on compressed textures, so games store texture data with lossless compression to further keep game install sizes down. The trade-off is that all of this data then needs to be decompressed before the GPU can work on it, and that is a job traditionally done by the CPU.
  • jordanclock - Thursday, September 24, 2020 - link

    This fast of a drive combined with DirectStorage has me very excited for this particular reason. Though, as I understand it, DirectStorage requires the game to explicitly call the API and thus needs to be built into the game, as opposed to a passive boost to every game.
