Investigating Performance of Multi-Threading on Zen 3 and AMD Ryzen 5000
by Dr. Ian Cutress on December 3, 2020 10:00 AM EST- Posted in
- CPUs
- AMD
- Zen 3
- X570
- Ryzen 5000
- Ryzen 9 5950X
- SMT
- Multi-Threading
Gaming Performance (Discrete GPU)
For our gaming tests, we are using our AMD Ryzen 9 5950X paired with an NVIDIA RTX 2080 Ti graphics card. Our standard test suite consists of 12 titles, tested at four configurations:
- Stage 1: Actual Gaming (1080p Maximum Quality, or equivalent)
- Stage 2: All About Pixels (‘4K Minimum’ Quality)
- Stage 3: Medium Low (‘1440p Minimum’)
- Stage 4: Lowest Lows (720p Minimum or lower)
The final three settings are a set of CPU-limited gaming, and help find the limit of where we move from CPU limited to GPU limited. Some users baulk at this testing finding it irrelevant, however these configurations have been widely requested over the years. The contraire to this testing is the first setting, at 1080p Maximum: this being requested given that 1080p is the most popular gaming resolution, and Maximum Quality because this graphics card should be able to handle almost everything at that resolution at very playable framerates.
All the details for our gaming tests can be found in our #CPUOverload article.
Stage 1: Actual Gaming AMD Ryzen 9 5950X, SMT On vs SMT Off |
|||
AnandTech | Settings | Average FPS |
95th Percentile |
Chernobylite | 1080p Max | 100% | - |
Civilization 6 | 1080p Max | 103% | - |
Deus Ex: MD | 1080p Max | 99% | 100% |
Final Fantasy 14 | 1080p Max | 102% | - |
Final Fantasy 15 | 8K Standard | 100% | 99% |
World of Tanks | 1080p Max | 100% | 102% |
World of Tanks | 4K Max | 103% | 102% |
Borderlands 3 | 1080p Max | 101% | 103% |
F1 2019 | 1080p Ultra | 103% | 106% |
Far Cry 5 | 1080p Ultra | 104% | 104% |
GTA V | 1080p Max | 99% | 100% |
RDR 2 | 1080p Max | 100% | 100% |
Strange Brigate | 1080p Ultra | 101% | 101% |
In real-world gaming situations, there’s very little to pick between having SMT enabled or disabled. Almost universally it is either beneficial or a smidgen better to have it enabled, with F1 2019, Civilization 6, and Far Cry 5 seemingly the best recipients. I’ve also added in the Stage 3 result from World of Tanks, just because that benchmark doesn’t really have a proper settings menu.
Stage 2: All About Pixels AMD Ryzen 9 5950X, SMT On vs SMT Off |
|||
AnandTech | Settings | Average FPS |
95th Percentile |
Chernobylite | 4K Low | 99% | - |
Civilization 6 | 4K Min | 105% | - |
Deus Ex: MD | 4K Min | 98% | 100% |
Final Fantasy 14 | 4K Min | 102% | - |
Final Fantasy 15 | 4K Standard | 100% | 100% |
Borderlands 3 | 4K Very Low | 101% | 104% |
F1 2019 | 4K Ultra Low | 100% | 100% |
Far Cry 5 | 4K Low | 101% | 100% |
GTA V | 4K Low | 100% | 101% |
RDR 2 | 8K Min | 100% | 100% |
Strange Brigate | 4K Low | 100% | 100% |
With our high resolution settings with minimal quality, there is only one outlier in Civilization 6 on the average frame rates, which seem to be a bit higher when SMT is enabled.
Stage 3: Medium Low AMD Ryzen 9 5950X, SMT On vs SMT Off |
|||
AnandTech | Settings | Average FPS |
95th Percentile |
Chernobylite | 1440p Low | 100% | - |
Civilization 6 | 1440p Min | 105% | - |
Deus Ex: MD | 1440p Min | 97% | 96% |
Final Fantasy 14 | 1440p Min | 102% | - |
Final Fantasy 15 | 1080p Standard | 101% | 105% |
World of Tanks | 1080p Standard | 101% | 101% |
Borderlands 3 | 1440p Very Low | 103% | 105% |
F1 2019 | 1440p Ultra Low | 99% | 99% |
Far Cry 5 | 1440p Low | 99% | 99% |
GTA V | 1440p Low | 100% | 99% |
RDR 2 | 1440p Low | 100% | 100% |
Strange Brigate | 1440p Low | 100% | 100% |
At the more medium settings, we’re starting to see some more variation (Borderlands gets a few percent from SMT). We’re starting to see Deus Ex:MD drop off a bit with SMT enabled.
Stage 4: Lowest Lows AMD Ryzen 9 5950X, SMT On vs SMT Off |
|||
AnandTech | Settings | Average FPS |
95th Percentile |
Chernobylite | 360p Low | 106% | - |
Civilization 6 | 480p Min | 102% | - |
Deus Ex: MD | 600p Min | 91% | 91% |
Final Fantasy 14 | 768p Min | 102% | - |
Final Fantasy 15 | 720p Standard | 99% | 102% |
World of Tanks | 768p Min | 101% | 100% |
Borderlands 3 | 360p Very Low | 108% | 110% |
F1 2019 | 768p Ultra Low | 102% | 105% |
Far Cry 5 | 720p Low | 100% | 101% |
GTA V | 720p Low | 99% | 98% |
RDR 2 | 384p Low | 100% | 103% |
Strange Brigate | 720p Low | 95% | 95% |
This is perhaps our most varied set of results, with Deus Ex:MD showing an almost 10% drop with SMT enabled. DEMD is usually considered a CPU title, but so is Chernobylite, which sees a 6% gain. Borderlands is +8-10% with SMT enabled, which is more of a modern game. However, I doubt anyone is playing at these resolutions.
Overall Gaming Performance
If we take full averages from all the data points, then we’re seeing a rough +1% gain in performance in the more complex scenarios across the board.
Resolution Average Comparison AMD Ryzen 9 5950X, SMT On vs SMT Off |
||||
AnandTech | Setting | aka | Average FPS |
95th Percentile |
Stage 1 | 1080p Max | Actual Gaming | 101% | 101% |
Stage 2 | 4K+ Min | All About Pixels | 101% | 101% |
Stage 3 | 1440p Min | Medium Lows | 101% | 101% |
Stage 4 | < 768p Min | Lowest Lows | 100% | 101% |
In reality, any loss or gain is highly dependent on the title in question, and can swing from one side of the line to the other. It’s clear that Deus Ex prefers SMT off, and F1 2019 or Borderlands prefers SMT on, but we are talking fine margins here.
126 Comments
View All Comments
MrSpadge - Friday, December 4, 2020 - link
That doesn't mean SMT is mainly responsible for that. The x86 decoders are a lot more complex. And at the top end you get diminishing performance returns for additional die area.Wilco1 - Friday, December 4, 2020 - link
I didn't say all the difference comes from SMT, but it can't be the x86 decoders either. A Zen 2 without L2 is ~2.9 times the size of a Neoverse N1 core in 7nm. That's a huge factor. So 2 N1 cores are smaller and significantly faster than 1 SMT2 Zen 2 core. Not exactly an advertisement for SMT, is it?Dolda2000 - Friday, December 4, 2020 - link
>Graviton 2 gives 75-80% of the performance of the fastest Rome at less than a third of the areaTo be honest, it wouldn't surprise me one bit if 90% of the area gives 10% of the performance. Wringing out that extra 1% single-threaded performance here or there is the name of the game nowadays.
Also there are many other differences that probably cost a fair bit of silicon, like wider vector units (NEON is still 128-bit, and exceedingly few ARM cores implement SVE yet).
Wilco1 - Saturday, December 5, 2020 - link
It's nowhere near as bad with Arm designs showing large gains every year. Next generation Neoverse has 40+% higher IPC.Yes Arm designs opt for smaller SIMD units. It's all about getting the most efficient use out of transistors. Having huge 512-bit SIMD units add a lot of area, power and complexity with little performance gain in typical code. That's why 512-bit SVE is used in HPC and nowhere else.
So with Arm you get many different designs that target specific markets. That's more efficient than one big complex design that needs to address every market and isn't optimal in any.
whatthe123 - Saturday, December 5, 2020 - link
The extra 20% of performance is difficult to achieve. You can already see it on zen CPUs, where 16 core designs are dramatically more efficient per core in multithread running at around 3.4ghz, vs 8 core designs running at 4.8ghz. I've always hated these comparisons with ARM for this reason... you need a part with 1:1 watt parity to make a fair comparison, otherwise 80% performance at half the power can also be accomplished even on x86 by just reducing frequency and upping core count.Wilco1 - Saturday, December 5, 2020 - link
Graviton clocks low to conserve power, and still gets close to Rome. You can easily clock it higher - Ampere Altra does clock the same N1 core 32% higher. So that 20-25% gap is already gone. We also know about the next generation (Neoverse N2 and V1) which have 40+% higher IPC.Yes adding more cores and clocking a bit lower is more efficient. But that's only feasible when your core is small! Altra Max has 128 cores on a single die, and I don't think we'll see AMD getting anywhere near that in the next few years even with chiplets.
peevee - Monday, December 7, 2020 - link
It is obviously a lot LESS than 5%. Nothing that matters in terms of transistors (caches and vector units) increases. Even doubling of registers would add a few hundreds/thousands of transistors on a chip with tens of billions of transistors, less than 0.000001%.They can double all scalar units and it still would be below 1% increase.
Kangal - Friday, December 4, 2020 - link
I agree.Adding SMT/HT requires something like a +10% increase in the Silicon Budget, and a +5% increase in power draw but increases performance by +30%, speaking in general. So it's worth the trade-off for daily tasks, and those on a budget.
What I was curious to see, is if you disabled SMT on the 5950X, which has lots of cores. Leaving each thread with slightly more resources. And use the extra thermals to overclock the processor. How would that affect games?
My hunch?
Thread-happy games like Ashes of Singularity would perform worse, since it is optimised and can take advantage of the SMT. Unoptimized games like Fallout 76 should see an increase in performance. Whereas actually optimised games like Metro Exodus they should be roughly equal between OC versus SMT.
Dolda2000 - Friday, December 4, 2020 - link
>What I was curious to see, is if you disabled SMT on the 5950X, which has lots of cores.That is exactly what he did in this article, though.
Kangal - Saturday, December 5, 2020 - link
I guess you didn't understand my point.Think of a modern game which is well optimised, is both GPU intensive and CPU intensive. Such as Far Cry V or Metro Exodus. These games scale well anywhere from 4-physical-core to 8-physical-cores.
So using the 5950X with its 16-physical-cores, you really don't need extra threads. In fact, it's possible to see a performance uplift without SMT, dropping it from 32-shared-threads down to 16-full-threads, as each core gets better utilisation. Now add to that some overclocking (+0.2GHz ?) due to the extra thermal headroom, and you may legitimately get more performance from these titles. Though I suspect they wouldn't see any substantial increases or decreases in frame rates.
In horribly optimised games, like Fallout 76, Mafia 3, or even AC Odyssey, anything could happen (though probably they would see some increases). Whereas we already know that in games that aren't GPU intensive, but CPU intensive (eg practically all RTS games), these were designed to scale up much much better. So even with the full-cores and overclock, we know these games will actually show a decrease in performance from losing those extra threads/SMT.