Power and Battery Life

Earlier in the year AMD was keen to promote that in Renoir it has made significant advances in how power is managed across the APU, leading to increased performance and better battery life. The two key figures here were ‘20% reduced SoC power’ and ‘5x reduction in power gating latency’ (better described as an 80% reduction, because you can’t have a 5x reduction of a time). We now have some details.
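The conversion between the two ways of quoting the latency figure is simple arithmetic. As a minimal sketch (the function name is ours, not AMD's), a latency that is N times shorter corresponds to a reduction of (N - 1) / N:

```python
def percent_reduction(factor: float) -> float:
    """Convert an 'N times lower' claim into a percentage reduction.

    A value that is `factor` times smaller is 1/factor of the original,
    so the reduction is (factor - 1) / factor.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 100.0 * (factor - 1.0) / factor


# AMD's '5x' power gating latency claim expressed as a percentage
print(percent_reduction(5))  # 80.0
```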

First up it should be mentioned that 7nm helps a lot here. The smaller process node, with smaller transistors (assuming they’ve been laid out correctly), will require a lower voltage. That lower voltage directly translates into lower power, and we’ve seen how well AMD has pushed the 7nm designs on the desktop and in the enterprise space to know that compared to previous process nodes, there is a lot of power to save here. That being said, the design choices and features matter too.
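To put a rough number on why voltage matters so much, the textbook approximation for switching (dynamic) power in CMOS logic is P ≈ C × V² × f, so power scales with the square of voltage. The sketch below uses that general model only; the 10% voltage drop is an illustrative assumption, not a figure from AMD, and leakage power is ignored.

```python
def relative_dynamic_power(voltage_ratio: float, frequency_ratio: float = 1.0) -> float:
    """Relative switching power under the textbook model P ~ C * V^2 * f.

    Both arguments are ratios against a baseline design (1.0 = unchanged).
    Capacitance is assumed constant and leakage is ignored.
    """
    return voltage_ratio ** 2 * frequency_ratio


# Illustrative assumption: 10% lower voltage at the same frequency
ratio = relative_dynamic_power(0.9)
print(f"Dynamic power is about {ratio:.0%} of the baseline")
```

Under this model a 10% voltage drop alone trims switching power by roughly a fifth, which is why voltage is usually the first lever on a new node.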

AMD’s power management all goes through a system-level management controller. For this generation, AMD has re-written the firmware with speed in mind (they claim 33% faster), but also made other improvements, such as aggressive clock gating of the L3 cache when not needed, and using power optimized circuits for IO features such as for the embedded display controller and PCIe physical layers.

The updated system management controller (SMC) is built around user preference. If the user tells the OS they want more performance, or more battery life, the SMC can take everything in the system into consideration and plan accordingly. If the OS can provide guidance about an upcoming workload, the SMC is built to act on it, whether by adjusting voltages and frequencies or by putting unused parts of the chip into idle.

Ultimately there are many sensors around the APU, monitoring how much activity, and what type of activity, is going on in each region, even down to the types of instructions being used. The SoC is a lot more dynamic in its clock control, allowing different clock domains in various parts of the SoC to be adjusted depending not only on the activity of the region but also on thermal limits, system limits, and other items that might affect performance. This is especially useful for powering down parts of the SoC that are not in use, which underpins AMD’s efficiency claims, and for performance guarantees such as maintaining a specific bandwidth across an interconnect (quality of service). The thresholds for these activity monitors can be set by the OS and by the user. The SMC also takes into account the power source (battery vs power supply) and connected hardware (displays, power over USB).

For the power gating latency, AMD has doubled the width of the save and restore bus between the buffers and the cores, allowing a system to resume faster from a CPUOFF state. On top of this, AMD is using the ACPI 6.3 specification to offer multiple C states to the OS.

One of the issues with the previous generation of Picasso APUs is that there was only a single set of states that the processor could be in. This means that at any time, the CPU could fall from a power state (a P state) into a lower power state, or an idle state, or an off state. If the CPU went too far down this stack, it would save power, but each hop down the rabbit hole meant a longer time to get back out, hurting performance and latency and requiring more power changes at the silicon level. Each hop in its own right requires additional power.

With the new Renoir designs, a system can take advantage of multiple sets of states. This means that the CPU can’t go down too low when the system is in use: the OS or system controller can’t put parts of the core into the deepest low power states because those states are not available. Even if the system goes into the lowest power mode available to it while it is still being used, there are fewer jumps to get back up to high speed.

As the system becomes less used, known as ‘increased idle duration’, it gains access to sets of states that allow parts of the APU to enter deeper idle states. This means that the system can only enter a low frequency domain if that part of the core has been sufficiently idle, or user interaction has willed it.
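The trade-off described here, where deeper states save more power but take longer to leave, can be sketched as a simple selection rule. The code below is a generic illustration and not AMD's firmware or the ACPI tables themselves: the state names, latencies, and residency thresholds are all hypothetical, and the rule is simply "take the deepest state that the expected idle time and the tolerated wake-up latency both allow".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    exit_latency_us: int   # time needed to get back to running
    min_residency_us: int  # idle time needed before entering pays off


# Hypothetical states, ordered from shallowest to deepest
STATES = [
    IdleState("shallow", exit_latency_us=2, min_residency_us=10),
    IdleState("medium", exit_latency_us=50, min_residency_us=500),
    IdleState("deep", exit_latency_us=400, min_residency_us=5000),
]


def pick_idle_state(predicted_idle_us: int, latency_limit_us: int,
                    states=STATES) -> IdleState:
    """Return the deepest state that fits both constraints.

    Falls back to the shallowest state when nothing deeper qualifies.
    """
    chosen = states[0]
    for state in states:
        fits_idle = state.min_residency_us <= predicted_idle_us
        fits_latency = state.exit_latency_us <= latency_limit_us
        if fits_idle and fits_latency:
            chosen = state
    return chosen


# A system in active use: short idle gaps keep the core in a shallow state
print(pick_idle_state(predicted_idle_us=100, latency_limit_us=1000).name)
# A long idle period unlocks the deepest state
print(pick_idle_state(predicted_idle_us=10_000, latency_limit_us=1000).name)
```

Restricting which states are visible while the system is in use has the same effect as lowering `latency_limit_us` in this sketch: the deepest entries are never chosen, so there are fewer hops on the way back up.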

This is all part of the ACPI 6.3 standard, and AMD states that this combined with the reduced SoC power gives both better battery life and better immediate performance for the user. To show this in action, AMD pinpointed a common activity that most users might be familiar with: opening applications.

In this case, AMD took the start of the PCMark 10 Application Loading benchmark. In this benchmark a number of applications are loaded, and the requirements are often more IO driven than CPU driven. A good CPU with a fast reaction time will keep its power and frequency low while the IO requests are being done, and speed up one or two threads when the CPU needs to get involved.

In AMD’s benchmark, where they are using frequency as a proxy for power, they show that in the initial 5 seconds of the test, the new Ryzen 4000 CPU hovers at an idle frequency, whereas the older Ryzen 3000 CPU flutters around, even peaking near 4.0 GHz when it doesn’t need to. This allows parts of the new CPU to be powered down for longer periods of time, even when the system is actually in use.

When I asked AMD’s executives where they stand on battery life, one of them hinted that the difference between themselves and the competition (in similar designs) should be on the order of minutes rather than dozens of minutes. Specifically AMD sees itself as better than the competition in productivity/web browsing workloads, graphics workloads, and video playback, and noted that most battery benchmarks don’t take into account a good mix of what ‘the average user’ does. A number of the media responded that our benchmarks are often geared towards different types of users, commensurate with our audience, such as gamers or content creators. Ultimately we will see what the results are when we have hardware on hand.


95 Comments


  • eek2121 - Monday, March 16, 2020

    This is an intriguing part. I am hoping for laptop designs with a 4800U and 5600M, but also desktop APUs. Hopefully AMD can bring some of the new stuff forward to desktop Zen 3 as well.
  • heffeque - Monday, March 16, 2020

    It would be interesting to see these in fan and fanless AMD versions of Surface Pro versus fan and fanless Intel versions of Surface Pro.

    I'm especially interested in battery life, since AMD 3780U Surface Pro has horrible battery life compared to its Intel counterpart.
  • The_Assimilator - Monday, March 16, 2020

    The fact that OEMs are willing to make custom designs for AMD is already a good sign that they're confident in the product. Lisa Su certainly has the right stuff.
  • Khenglish - Monday, March 16, 2020

    I'm pretty unimpressed by the GPU vs the Vega 11 in APU desktops. The only major advantage Renoir has is higher clocks on the GPU core and higher officially supported memory speeds. They likely got the 56% performance per core improvement by comparing to a Zen+ with Vega 11, which will be severely clock constrained on 12nm with a bigger core, where Renoir gets an even higher clock advantage not just from the nominal clock, but also from Picasso APUs hitting their TDP limit hard in a 25W or 35W environment.

    On desktop with much higher TDPs I expect Renoir to slightly beat the 3400G at stock clocks, but lose when comparing overclocked results. Picasso easily overclocks up to 1700-1800 MHz from the measly 1240 MHz stock clock. I would guess Renoir would hit around 2000, not enough to compensate for the smaller core.
  • eek2121 - Tuesday, March 17, 2020

    There are a lot of problems with your comment, but let’s start with the obvious: The TDP of the part you mentioned is at least triple that of the 4800U. Depending on how the chip is configured it is quadruple.

    These are laptop parts, we haven’t seen desktop APUs. AMD could add 3X as many Vega cores and still hit a 45-65 watt TDP or they can go aggressive on the CPU clocks like they did with the 4900H.
  • Spunjji - Tuesday, March 17, 2020

    I'm pretty sure the desktop APU won't have more Vega CUs.
  • tygrus - Tuesday, March 17, 2020

    These days doubling the GPU cores/units and running half the speed is more energy efficient. Uses more die space but I don't understand the focus on GPU MHz over energy efficiency.
  • Spunjji - Tuesday, March 17, 2020

    1) Not sure the evidence I've seen bears out that a 1700-1800 MHz GPU overclock is "easy". That sounds like the higher end of what you can expect. Would welcome evidence to the contrary, as I'm still considering picking one up.
    2) RAM speed is the big difference here. The desktop APU should get much higher memory speeds than the 3400G due to the improved Zen 2 memory controller, which ought to relieve a significant bottleneck. GPU core overclocks weren't actually the best route to wringing performance out of the 3400G.
  • Fataliity - Monday, March 16, 2020

    @Ian Cuttress, did they say what version of N7 they used for this? The density looks like either a HPC + variant, or a N7 mobile variant from what I can tell?

    Thank you!
  • abufrejoval - Monday, March 16, 2020

    What I want is choice. And flexibility to enable it.

    15 Watt TDP typically isn’t a hard limit nor is 35 or 45 Watt for that matter: It’s mostly about what can be *sustained* for more than a second or two. Vendors have allowed bursting at twice or more TDP because that’s what often defines ‘user experience’ and sells like hotcakes on mobile i7’s.

    We all know the silicon is the same. Yes, there be binning but a 15 Watt part sure won’t die at 35, 45 or even 65 or 95 Watts for that matter: It will just need more juice and cooling. And of course, a design for comfortable cooling of 15 Watts won’t take 25 or 35 Watts without a bit of ‘screaming’.

    But why not give a choice, when noise matters less than a deadline and you don’t want to buy a distinct machine for a temporary project?

    I admit to having run machine-learning on Nvidia equipped 15.4” slim-line notebooks for days if not weeks, and having to hide them in a closet, because nobody in the office could tolerate the noise they produced at >100 Watts of CPU and GPU power consumption: That’s fine, really, when you can choose what to do where and when.

    Renoir has a huge range of load vs. power consumption: Please, please, PLEASE ensure that in all form factors users can make a choice of power consumption vs. battery life or cooling by setting max and sustained Wattage preferably at run-time and not hard-wiring this into distinct SKUs. I’d want a 15 Watt ultrabook to sustain a 35 Watt workload screaming its head off, just like I’d like a 90 Watt desktop or a 60 Watt NUC to calm down to 45/35/25 Watt sustained for night-long batches in the living room or bed-side—if that’s what suits my needs: It’s not a matter of technology, just a matter of ‘product placement’.
