What Is New: Zen+ Updates

For all of the new Ryzen Threadripper 2000-series parts, the cores inside are ‘Zen+’ cores, which bring four main new features, identical to those of the Ryzen 2000-series.

First up are the faster caches – as we saw in our Ryzen 7 2700X review, the L1 and L2 caches are slightly faster, the L3 cache gets a boost too, and the main memory support goes up from DDR4-2666 to DDR4-2933. All this accounts for a 3% IPC increase, and is the result of better understanding the design and tweaking the internal dials to extract the best performance.

Second, the Zen+ cores also take advantage of GlobalFoundries' 12nm process, an enhanced version of the 14nm process used for Threadripper-1000. While not an optical shrink, it does allow AMD to extract higher frequencies and reduce voltage at the same time. Combined with the new turbo methodology and the 3% IPC gain from the caches, this resulted in an overall 10% performance gain in the Ryzen 2000-series processors.
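To put those two figures in context, here is a back-of-the-envelope calculation. It assumes a first-order model in which performance scales as IPC multiplied by effective frequency; that model and the function name are our own illustration, not AMD's accounting of the gain.

```python
# Assumption: performance ~ IPC x effective frequency (a first-order model only).
def implied_frequency_gain(total_gain: float, ipc_gain: float) -> float:
    """Fractional frequency gain needed to reach total_gain given ipc_gain."""
    return (1.0 + total_gain) / (1.0 + ipc_gain) - 1.0


# Figures quoted in the article: 10% overall gain, 3% from IPC.
freq_gain = implied_frequency_gain(total_gain=0.10, ipc_gain=0.03)
print(f"Implied effective frequency gain: {freq_gain:.1%}")
```

Under that assumption, the process and turbo changes together would need to supply a little under 7% in effective clock speed to account for the rest of the gain.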

Third is Precision Boost 2, which manages how the CPU implements its turbo depending on workload. Rather than referring to a fixed turbo table that relates the number of active cores to a given frequency, PB2 uses the internal sensors to gauge how much power and thermal headroom is still available, and prompts the CPU to increase frequency until it hits that limit. Thanks to the 25 MHz granularity of the multiplier, this allows the processor to boost as much as possible for performance. We saw this on the Ryzen 2000-series processors and it worked really well, although it is worth noting that it does increase power consumption for variable-threaded workloads.
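The idea of boosting in 25 MHz steps until a limit is reached can be sketched as a simple loop. Everything here other than the 25 MHz step is an assumption for illustration: the linear power and temperature models, the limits, and the function names are hypothetical and do not describe AMD's firmware.

```python
STEP_MHZ = 25  # multiplier granularity quoted in the article


def boost_frequency(base_mhz, max_mhz, power_limit_w, temp_limit_c,
                    estimate_power, estimate_temp):
    """Raise frequency in 25 MHz steps until power or temperature would be exceeded."""
    freq = base_mhz
    while freq + STEP_MHZ <= max_mhz:
        candidate = freq + STEP_MHZ
        over_power = estimate_power(candidate) > power_limit_w
        over_temp = estimate_temp(candidate) > temp_limit_c
        if over_power or over_temp:
            break  # no headroom left for another step
        freq = candidate
    return freq


# Hypothetical toy models: power and temperature rise linearly with frequency.
def toy_power(mhz):
    return mhz / 20  # watts


def toy_temp(ambient_c):
    return lambda mhz: ambient_c + mhz / 80  # degrees C


cool_room = boost_frequency(3000, 4200, 200, 85, toy_power, toy_temp(20))
hot_room = boost_frequency(3000, 4200, 200, 85, toy_power, toy_temp(40))
print(cool_room, hot_room)
```

In this toy setup the cooler room is limited by power and the hotter room by temperature, so the same chip settles at a lower frequency when the ambient temperature is higher.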

Fourth is XFR2, or ‘eXtended Frequency Range’. This is essentially the ‘temperature’ bit of Precision Boost 2, but uses the benefits of a cooler ambient temperature and better cooling to push the processor frequency. For the mainstream Ryzen 2000-series processors, this afforded up to a 10-15% performance increase. For today’s announcement, as this is not the embargo for performance numbers, we can’t give you hard data. However, AMD included both the Wraith Ripper (a 250W-rated air cooler) and the Enermax Liqtech 240 (a 500W-rated liquid cooler) in our press kits for exactly this reason.

(A note here: we’re currently going through a heat wave in Europe, one of the biggest ever, and home air conditioning does not really exist in the UK. As a result, AMD has hit a spot of bad luck, as a lot of reviewers will be hampered by the super-high ambient (32C+) ‘home office’ temperatures. I have lucked out – Intel invited me to an event in San Francisco this week, so despite having to cart 30kg of kit 5500 miles away, I am currently testing in a thermally controlled 20C hotel room while on the road. All this being said, it would be interesting if European reviewers who are struggling in the heat this week were to re-test in a few months, when ambient temperatures are back to being reasonably cool. As for Americans, we all know you lot love your AC, especially in AMD's home state of Texas.)

Sweet Memories

One of the big questions when AMD initially announced the second generation of Threadripper was around the memory configuration. In the first generation, the two active dies on the chip each used two memory channels giving a total of four. For the second generation, with four active dies, we now have a non-uniform memory design: two dies have access to two memory channels each, while the other two dies have zero memory channels directly connected, meaning that memory accesses require a hop.
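The two layouts described above can be written down as a small data model. This is only an illustrative sketch: the `Die` type, the die indices, and the choice of which indices carry the memory channels are arbitrary assumptions, and "one hop" simply mirrors the article's wording rather than a measured latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Die:
    index: int            # arbitrary label, not a physical position
    memory_channels: int  # channels wired directly to this die


def hops_to_memory(die: Die) -> int:
    """0 if the die has its own memory channels, 1 if it must go via another die."""
    return 0 if die.memory_channels > 0 else 1


# First generation: two active dies, two channels each.
first_gen = [Die(0, 2), Die(1, 2)]
# Second generation (32-core): four active dies, only two with memory attached.
second_gen = [Die(0, 2), Die(1, 2), Die(2, 0), Die(3, 0)]

for name, dies in (("first gen", first_gen), ("second gen", second_gen)):
    total = sum(d.memory_channels for d in dies)
    hops = [hops_to_memory(d) for d in dies]
    print(f"{name}: {total} channels, hops per die = {hops}")
```

Both generations end up with four channels in total; the difference is that half of the second-generation dies have to reach memory through a neighbour.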

To clarify, as people were speculating, the design is not one memory channel per die. While not impossible, doing it that way would require adjustment of the pin-out arrangement and Threadripper firmware. This is only designed to be a mid-generation microarchitecture refresh, not a full update. One of the benefits is that these processors should go straight into all motherboards currently on the market without a BIOS flash, although once installed, an updated BIOS is recommended for enhanced memory and feature support.

When discussing the matter with AMD, they noted that this memory configuration means that the scheduler in the operating system will aim to fill in the cores directly attached to a memory controller first. However, it will not be a simple case of filling up 16 cores across the two directly connected dies first: after the first few threads are allocated, new threads will enter a round-robin mode, where the ‘value’ of a thread landing on a core changes based on how the other cores are loaded. If it makes sense for power and temperature reasons, threads will spawn on the silicon not directly attached to memory, for example. So it is something to note, as Threadripper 2 core scheduling isn't going to be as simple as it may initially appear.
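To illustrate how a thread's ‘value’ on a given core can shift as other cores fill up, here is a toy placement policy. It is not the operating system's scheduler or anything AMD described in detail: the cost function, the `hop_penalty` value, and the eight-cores-per-die figure are all hypothetical choices made for the sketch.

```python
def place_threads(n_threads, has_memory, cores_per_die=8, hop_penalty=0.5):
    """Greedily place threads on the die with the lowest cost.

    has_memory: list of bools, True if that die has local memory channels.
    Cost = fraction of the die already loaded + a penalty for the memory hop.
    Returns the number of threads placed on each die.
    """
    if n_threads > cores_per_die * len(has_memory):
        raise ValueError("more threads than cores in this toy model")
    load = [0] * len(has_memory)
    for _ in range(n_threads):
        candidates = [d for d in range(len(load)) if load[d] < cores_per_die]
        best = min(
            candidates,
            key=lambda d: (load[d] / cores_per_die
                           + (0.0 if has_memory[d] else hop_penalty), d),
        )
        load[best] += 1
    return load


dies = [True, True, False, False]  # two dies with memory, two without
for n in (8, 12, 32):
    print(n, place_threads(n, dies))
```

With these made-up weights, the first eight threads land on the memory-attached dies, but later threads start spilling onto the other two dies well before the first 16 cores are full, which is the general behaviour the paragraph above describes.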

While users were speculating on a fairer memory distribution, almost no-one touched upon the PCIe situation. As with the memory, the PCIe lanes will also only come from two of the silicon dies, rather than split between all four. Most if not all motherboards should support multiple graphics cards and other add-in devices as a result.

104 Comments

  • evernessince - Wednesday, August 8, 2018

    Seriously. Tom's Hardware has some crazy single-threaded benchmarks. I stopped reading them when they refused to remove Project Cars, which was heavily optimized for Nvidia, from their benchmark suite. It's like they don't realize what an outlier is.
  • SetiroN - Monday, August 6, 2018

    The memory configuration is going to be a huge bottleneck.

    Just try to use a 32-core Epyc with only 4 channels populated: performance is hindered so badly you end up making very little use of the additional cores unless you're not accessing memory at all.

    This all feels like an afterthought.
  • artk2219 - Monday, August 6, 2018

    So you're telling me AMD is shaving off features from their more expensive server parts so that there's some market differentiation? For shame! Seriously though, it is annoying that TR4 and SP3 are "2 different sockets"; it would have been nice to be able to use Epycs in TR4.
  • drajitshnew - Monday, August 6, 2018

    My "guess" is that while TR4 (SP3R2) and SP3 are both 4094 pins, in TR4 the pins leading to the 2nd 2 processors are just that-- pins. They are just for physical support & are not electrically connected to anything. Hence, to maintain backwards compatibility, AMD disabled the memory & PCIe of the second pair of dies.
  • eastcoast_pete - Monday, August 6, 2018

    While I also believe that there is no such thing as too much computing power, the 32 (and 24?) core TRs are the CPU equivalents of a 1,000 HP engine in a car: great for bragging rights, but only useful in very specific situations, and otherwise not faster than mere 8 core chips. In this case, the applications where 32 cores can make a difference are those that are not that dependent on memory speed/access. I would love to see some benchmarks for compiling and complex CAD situations.

    Overall, the question is/remains how well AMD executed on this second round of "NUMA on a chip".
    Lastly, about EPYC vs. TR: AMD learned from the master (Intel). It's not about not letting people run server chips on desktop boards, it's about blocking people from doing the opposite: using much less expensive desktop CPUs in server boards and for server applications. That is also why desktop CPUs and chipsets basically never support ECC RAM, which is a requirement for many servers. TR is almost "EPYC", but just not quite, so you still have to buy EPYC and pay epic prices for your servers. But then, Intel does the same, and gouges us even worse.
  • mapesdhs - Monday, August 6, 2018

    Not sure how these are about blocking people from doing the opposite, since they do support ECC, so surely one could use these CPUs just as they are with a good quality consumer mbd and they'd do just fine for a wide range of server tasks, using ECC memory if desired. If companies cared about cost that much then this is an option. Most though won't do that. There's a belief that companies will cram a consumer chip onto a pro board if they can, but really that's very rare as most bulk buyers of workstations and servers get them from OEMs, very few build their own.

    Nobody's gouging anyone btw, it's still a free market choice whether to buy Intel or not.
  • smilingcrow - Monday, August 6, 2018

    In theory TR boards can support ECC but I've heard reports that validation of ECC RAM is not exactly a priority and with all the work Ryzen boards required regarding RAM that's not a surprise.
    So anybody here built a TR ECC system and how did you get on? 1st hand reports are always better.
  • Oxford Guy - Tuesday, August 7, 2018

    ECC RAM is sold at slower speeds than typical enthusiast RAM. I fail to see why validation would be necessary. The fastest ECC RAM I know of is only 2666. If there is anything faster it should still fit within the TR2 spec.
  • imaheadcase - Monday, August 6, 2018

    So why has the CPU race slowed to a crawl for years now? Have we actually reached a "safe" limit for CPUs until some new tech can make them faster? I know the need isn't as great as it used to be, but remember the days when CPU speed leaped so much each generation... like 500 MHz jumps with each new CPU, it seemed. Now we are seeing boosts, which is basically like saying "We can go this high, but it's just a limit because we're not sure of ourselves".
  • DigitalFreak - Monday, August 6, 2018

    Two reasons come to mind - technology and competition. It's becoming increasingly difficult to go to smaller process nodes (see Intel 10nm) which are necessary to make faster chips. As to competition, Intel hasn't had any until AMD's Zen architecture. They're not going to put a lot of money into R&D if they don't have to. Unfortunately for them, AMD caught them with their pants down, and their 10nm process has had nothing but problems.
