What Is New: Zen+ Updates

Across the new Ryzen Threadripper 2000 series, the cores use the ‘Zen+’ revision, which brings four main new features, identical to those in the Ryzen 2000-series.

First up are the faster caches – as we saw in our Ryzen 7 2700X review, the L1 and L2 caches are slightly faster, the L3 cache gets a boost too, and the main memory support goes up from DDR4-2666 to DDR4-2933. All this accounts for a 3% IPC increase, and is the result of better understanding the design and tweaking the internal dials to extract best performance.

Second, the Zen+ cores also take advantage of GlobalFoundries' 12nm process, an enhanced version of the 14nm process used for Threadripper-1000. While not an optical shrink, it does allow AMD to extract higher frequencies and reduce voltage at the same time. Combined with the new turbo methodology and the 3% IPC gain from the caches, this resulted in an overall 10% performance gain in the Ryzen 2000-series processors.
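As a rough sanity check on those two figures, the short sketch below works out how much of the 10% would have to come from frequency and turbo behavior once the 3% IPC gain is taken out. It assumes the gains compound multiplicatively, which is our simplification and not something AMD has stated; the function name is ours.

```python
def implied_remaining_gain(total_gain: float, ipc_gain: float) -> float:
    """Gain left over for everything except IPC, assuming gains multiply."""
    return (1.0 + total_gain) / (1.0 + ipc_gain) - 1.0


# Figures quoted in the article: 10% overall, 3% from IPC.
remaining = implied_remaining_gain(0.10, 0.03)
print(f"{remaining:.1%}")  # prints 6.8%
```

Under that assumption, a little under 7% is left for the process and turbo changes.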

Third is Precision Boost 2, which manages how the CPU implements its turbo depending on workload. Rather than referring to a fixed turbo table that relates the number of active cores to a given frequency, PB2 uses the internal sensors to gauge how much power and temperature headroom is still available, and prompts the CPU to increase frequency until it hits that barrier. Thanks to the 25 MHz granularity of the multiplier, the processor can boost as far as that headroom allows. We saw this on the Ryzen 2000-series processors and it worked really well, although it is worth noting that it does increase power consumption for variable threaded workloads.
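To make the idea concrete, here is a minimal sketch of a headroom-driven boost loop. It is not AMD's algorithm: the linear power and temperature models, the limits, and every name in it are hypothetical, chosen only to show frequency rising in 25 MHz steps until a limit would be crossed.

```python
STEP_MHZ = 25  # multiplier granularity quoted in the article


def boost_frequency(base_mhz, max_mhz, power_at, temp_at,
                    power_limit_w, temp_limit_c):
    """Step up in 25 MHz increments until the next step would break a limit."""
    freq = base_mhz
    while freq + STEP_MHZ <= max_mhz:
        candidate = freq + STEP_MHZ
        if power_at(candidate) > power_limit_w:
            break  # out of power headroom
        if temp_at(candidate) > temp_limit_c:
            break  # out of thermal headroom
        freq = candidate
    return freq


def toy_power(freq_mhz):
    """Hypothetical model: 45 mW per MHz."""
    return freq_mhz * 45 / 1000


def toy_temp(ambient_c):
    """Hypothetical model: die temperature is ambient plus 1 C per 100 MHz."""
    return lambda freq_mhz: ambient_c + freq_mhz / 100


cool_room = boost_frequency(3000, 4200, toy_power, toy_temp(20), 180, 68)
hot_room = boost_frequency(3000, 4200, toy_power, toy_temp(32), 180, 68)
print(cool_room, hot_room)  # prints 4000 3600
```

In this toy model the 20C room is limited by power and the 32C room by temperature, which is the kind of difference that cooling and ambient conditions can make to a sensor-driven turbo.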

Fourth is XFR2, or ‘eXtended Frequency Range’. This is essentially the ‘temperature’ bit of Precision Boost 2, but uses the benefits of a cooler ambient temperature and better cooling to push the processor frequency. For the mainstream Ryzen 2000-series processors, this afforded up to a 10-15% performance increase. For today’s announcement, as this is not the embargo for performance numbers, we can’t give you hard data. However, AMD included both the Wraith Ripper (a 250W-rated air cooler) and the Enermax Liqtech 240 (a 500W-rated liquid cooler) in our press kits for exactly this reason.

(A note here: we’re currently going through a heat wave in Europe, one of the biggest ever, and home air conditioning does not really exist in the UK. As a result, AMD has hit a spot of potential bad luck, as it means a lot of reviewers will be hampered by the super-high ambient (32C+) ‘home office’ temperatures. I have lucked out – Intel invited me to an event in San Francisco this week, so despite having to cart 30kg of kit 5500 miles away, I am currently testing in a thermally controlled 20C hotel room while on the road. All this being said, it would be interesting if European reviewers who are struggling in the heat this week were to re-test in a few months, when ambient temperatures are back to being reasonably cool. As for Americans, we all know you lot love your AC, especially in AMD's home state of Texas.)

Sweet Memories

One of the big questions when AMD initially announced the second generation of Threadripper was around the memory configuration. In the first generation, the two active dies on the chip each used two memory channels giving a total of four. For the second generation, with four active dies, we now have a non-uniform memory design: two dies have access to two memory channels each, while the other two dies have zero memory channels directly connected, meaning that memory accesses require a hop.
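The layout described above can be written down as a small data model. This is only an illustration: the die labels, and which two dies carry the memory channels, are placeholders of ours and not AMD's numbering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Die:
    name: str             # hypothetical label, not AMD's numbering
    memory_channels: int  # channels wired directly to this die


def second_gen_layout():
    """Four active dies: two with two channels each, two with none."""
    return [Die("die0", 2), Die("die1", 0), Die("die2", 2), Die("die3", 0)]


def total_channels(dies):
    return sum(d.memory_channels for d in dies)


def dies_needing_hop(dies):
    """Dies with no local channel must reach memory through another die."""
    return [d.name for d in dies if d.memory_channels == 0]


layout = second_gen_layout()
print(total_channels(layout))    # prints 4
print(dies_needing_hop(layout))  # prints ['die1', 'die3']
```

The total stays at four channels, as in the first generation, but half of the dies now reach memory only through a neighbor.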

To clarify, as people were speculating, the design is not one memory channel per die. While not impossible, doing it that way would require adjustment of the pin-out arrangement and Threadripper firmware. This is only designed to be a mid-generation microarchitecture refresh, not a full update. One of the benefits is that these processors should go straight into all motherboards currently on the market without a BIOS flash, although once installed, an updated BIOS is recommended for enhanced memory and feature support.

When discussing the matter with AMD, they noted that this memory configuration means that the scheduler in the operating system will aim to fill in the cores directly attached to a memory controller first. However, it will not be a simple case of filling up 16 cores across the two directly connected dies first: after the first few threads are allocated, new threads will enter a round-robin mode, where the ‘value’ of a thread landing on a core changes based on how the other cores are loaded. If it makes sense for power and temperature reasons, threads will spawn on the silicon not directly attached to memory, for example. So it is something to note, as Threadripper 2 core scheduling isn't going to be as simple as it may initially appear.
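A toy model helps show why the fill order is not simply "16 cores first". The sketch below is emphatically not AMD's or any operating system's scheduler: the scoring rule, the weights, and all names are invented for illustration. Each new thread goes to the die with the highest ‘value’, where a memory-attached die earns a fixed bonus and every die loses value as it fills up.

```python
def die_value(has_memory, load, cores_per_die, memory_bonus, load_penalty):
    """Hypothetical 'value' of placing one more thread on a die."""
    if load >= cores_per_die:
        return float("-inf")  # die is full
    return (memory_bonus if has_memory else 0.0) - load_penalty * load


def place_threads(n_threads, dies, cores_per_die=8,
                  memory_bonus=2.0, load_penalty=1.0):
    """dies maps a die label to True if it has local memory channels.

    Returns a dict mapping each die label to its thread count.
    """
    if n_threads > cores_per_die * len(dies):
        raise ValueError("more threads than cores in this toy model")
    load = {name: 0 for name in dies}
    for _ in range(n_threads):
        best = max(load, key=lambda name: die_value(
            dies[name], load[name], cores_per_die, memory_bonus, load_penalty))
        load[best] += 1
    return load


# Hypothetical labels: die0 and die2 have memory attached, die1 and die3 do not.
dies = {"die0": True, "die1": False, "die2": True, "die3": False}
print(place_threads(4, dies))  # {'die0': 2, 'die1': 0, 'die2': 2, 'die3': 0}
print(place_threads(8, dies))  # {'die0': 3, 'die1': 1, 'die2': 3, 'die3': 1}
```

With these made-up weights, the first four threads land on the memory-attached dies, but by the eighth thread the other two dies are already in use, well before the first 16 cores are full.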

While users were speculating on a fairer memory distribution, almost no-one touched upon the PCIe situation. As with the memory, the PCIe lanes will also only come from two of the silicon dies, rather than split between all four. Most if not all motherboards should support multiple graphics cards and other add-in devices as a result.


104 Comments


  • iwod - Tuesday, August 7, 2018 - link

    1. Multiple Thread application are INSANELY hard to write CORRECTLY. ( That is why we have RUST )

    2. There are still a lot of performance to be squeezed out from parallelism. As proved by Servo.

    3. Because Software has to care about the lowest common denominator, that is why no one is optimising for 8 Core yet.

    If we could push the bottom market to 8 Core, middle market to 16 and top end market to 32 Core, and each segment is then differentiated by its Full Core Speed. We may see software optimise for Multiple Core sooner.

    The only problem is 1. There is no incentive for them to do so and 2. The computer we have today are fast enough for majority of use case.
  • npz - Tuesday, August 7, 2018 - link

    Finely threaded single applications isn't the only use. In guyr's example, one that I also use, it's multi-process. Even compilation is multi process. VM is inherently multi-threaded and multi-process. And there are many highly parallel applications that don't require much synchronization and fine grained locking which is where the difficulty comes from, so generalizing "Multiple Thread application are INSANELY hard to write CORRECTLY" is not accurate.
  • Foeketijn - Tuesday, August 7, 2018 - link

    I'm now regularly waiting for Excel to do some number crunching. 3 to 4 minutes 100% on all 8 threads (Xeon E3 1240). I am wondering if such a Threadripper would make that 20 to 30 seconds. If a 2700X would halve that time, I am going to hit myself in the head for not going the Threadripper route.
  • BigDH01 - Tuesday, August 7, 2018 - link

    Depending on the nature of your formula graph in Excel the problem may not be easy to parallelize. Excel performs some tricks to try and determine if formulas can be calculated concurrently but they can and do fall victim to fragile nodes in their directed cyclic graphs. Even if your graph is very flat, they don't always get parallelism correct as maintaining those facts is either 1) hard to determine in a scalable manner or 2) pushes a lot of state handling to the graph editing side of things, which can cause massive slowdowns in user experience when making simple edits. Unfortunately, a lot of programs we use on the desktop aren't just hard to parallelize, but don't parallelize very well (far less than linear scaling). Traversing your graph while tracking state (because Excel keeps track of circular dependencies) in the correct order is just a hard problem and even though they can pound your CPU by speculatively executing, you probably won't see a huge speedup unless you've taken steps to make your graph as flat as humanly possible. And if you are doing the latter, why not just use Access?
  • Cooe - Monday, August 6, 2018 - link

    *facepalm*
    Then you obviously aren't the target market.
  • cerealspiller - Monday, August 6, 2018 - link

    Legitimate is overrated :-)
  • eastcoast_pete - Monday, August 6, 2018 - link

    Go AMD, keep holding chipzilla's feet to the fire and their pricing honest (Intel just reported new record earnings, so there is room there).
    Unrelated, while I assume that the inactive dies in the cheaper TRs may well be dies that binned too low or are just defective, and are locked down better than Fort Knox, just out of interest: Has anybody tried and succeeded to bring back the dead, i.e. reactivate the inactive ones? Anybody? Even trying would, of course, immediately void your warranty, but maybe, just maybe, somebody tried. Would love to hear what happened, successful or not.
  • drajitshnew - Monday, August 6, 2018 - link

    I have been thinking about the same thing since it was revealed that the inactive dies have also been etched by derbauer-- are not just pieces of silicon.
    I would like to read that review too.
  • Da W - Monday, August 6, 2018 - link

    And then somehow, you'll see on Tomshardware ''We tested the new CPU with our 1995 suite of games, Intel has superior IPC and shows a 2% advantage on single threaded games, so Intel is better, buy Intel.'' :)
  • Da W - Monday, August 6, 2018 - link

    Seriously though i've been waiting for this AMD for almost 2 decades. Good job!
