How Much Power?

All this hardcore testing just made us more curious. Would we be able to determine how much power the PCU of Nehalem actually saves? Let's add a little more machine code to our hardware C-state scripts. The MSR 3FCh contains the info we need. We test once again with two active chess threads.
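
We won't reproduce our full measurement scripts here, but the sketch below shows the idea (a minimal illustration of our own, assuming a Linux machine with the msr driver loaded and root access; the core count, names, and the 10-second interval are our choices, and with Hyper-Threading enabled the logical CPU numbering may differ). It samples the TSC and the C3/C6 residency counters, MSR 3FCh and 3FDh, before and after an interval and expresses the deltas as a percentage of the clockticks, which is exactly how the table below is built.

    /* Minimal sketch: per-core C-state residency via /dev/cpu/<n>/msr.
     * MSR 3FCh = C3 residency, 3FDh = C6 residency, 10h = TSC (clockticks).
     * Error handling is kept to a minimum for brevity. */
    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define MSR_TSC     0x010
    #define MSR_CORE_C3 0x3FC
    #define MSR_CORE_C6 0x3FD

    static uint64_t rd(int cpu, uint32_t reg)
    {
        char path[32];
        uint64_t v = 0;
        snprintf(path, sizeof path, "/dev/cpu/%d/msr", cpu);
        int fd = open(path, O_RDONLY);
        if (fd >= 0) {
            pread(fd, &v, sizeof v, reg);   /* the file offset selects the MSR */
            close(fd);
        }
        return v;
    }

    int main(void)
    {
        enum { CORES = 4 };
        uint64_t tsc[CORES], c3[CORES], c6[CORES];

        for (int c = 0; c < CORES; c++) {   /* snapshot before the interval */
            tsc[c] = rd(c, MSR_TSC);
            c3[c]  = rd(c, MSR_CORE_C3);
            c6[c]  = rd(c, MSR_CORE_C6);
        }
        sleep(10);                          /* let the benchmark run */
        for (int c = 0; c < CORES; c++) {   /* deltas as a % of clockticks */
            double t  = (double)(rd(c, MSR_TSC)     - tsc[c]);
            double d3 = (double)(rd(c, MSR_CORE_C3) - c3[c]);
            double d6 = (double)(rd(c, MSR_CORE_C6) - c6[c]);
            printf("Core %d: %5.2f%% in C3, %5.2f%% in C6\n",
                   c + 1, 100.0 * d3 / t, 100.0 * d6 / t);
        }
        return 0;
    }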

PCU Sleep State Comparison

           Clockticks   Ticks in C3   Ticks in C6   % in C3   % in C6
Core 1     2961889630       3497984      71450624     0.12%     2.41%
Core 2     2989850634       4128768     768581632     0.14%    25.71%
Core 3     3022277437     186195968    1032536064     6.16%    34.16%
Core 4     3033988899     171286528     387645440     5.65%    12.78%
Average                                                3.02%    18.76%

At first sight these measurements may seem to contradict our previous ones, even though they were taken under the same circumstances (two active threads plus one measurement thread). But if you calculate how much time the cores spend in C6 on average, you get about 19%, in the same ballpark as our previous measurement (21%). Also notice that the PCU forces the Xeon cores to move quickly from C3 into the deeper C6 sleep: only 3% (!) of the time is spent in C3.

So the time reported as ACPI C2 actually consists of 13.85% hardware C3 and 86.15% hardware C6 (18.76 / (3.02 + 18.76)). Let's take the ACPI readings again.

ACPI C-State Comparison

                % idle    C1    C2    C3
Opteron 2435     86      100     0     0
Xeon L3426       81        7    93     0
Opteron 2389     72.44   100     0     0

So now we can calculate how much time the CPU actually spent in the real hardware C-states.

% time spent in C1 = 7% of 81% idle ≈ 5.7%

The "software" ACPI C2 state is mapped by the Xeon to two "hardware" C-states:

  1. % time spent in C3 = 13.85% of the 93% ACPI C2 time, at 81% idle ≈ 10.3%
  2. % time spent in C6 = 86.15% of the 93% ACPI C2 time, at 81% idle ≈ 65%

So our two threads of Chess caused the L3426 cores to spend:

  • 19% in C0
  • 5.7% in C1
  • 10.3% in C3
  • 65% (!) in C6

…on average.
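
To make the bookkeeping explicit, here is a small sketch (our own illustration of the arithmetic above, not part of the measurement scripts) that reproduces these percentages from the ACPI readings and the C3/C6 split measured through the MSRs:

    /* Split the Xeon L3426's idle time into hardware C-states.
     * All inputs are the figures from the two tables above. */
    #include <stdio.h>

    int main(void)
    {
        double idle    = 0.81;    /* 81% idle (ACPI)                      */
        double acpi_c1 = 0.07;    /* 7% of idle time reported as ACPI C1  */
        double acpi_c2 = 0.93;    /* 93% of idle time reported as ACPI C2 */
        double hw_c3   = 0.0302;  /* average hardware C3 residency (MSR)  */
        double hw_c6   = 0.1876;  /* average hardware C6 residency (MSR)  */

        double c3_share = hw_c3 / (hw_c3 + hw_c6);   /* ~13.9% of ACPI C2 */
        double c6_share = hw_c6 / (hw_c3 + hw_c6);   /* ~86.1% of ACPI C2 */

        printf("C0: %4.1f%%\n", 100.0 * (1.0 - idle));               /* ~19%  */
        printf("C1: %4.1f%%\n", 100.0 * idle * acpi_c1);             /* ~5.7% */
        printf("C3: %4.1f%%\n", 100.0 * idle * acpi_c2 * c3_share);  /* ~10%  */
        printf("C6: %4.1f%%\n", 100.0 * idle * acpi_c2 * c6_share);  /* ~65%  */
        return 0;
    }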

What effect would this have on the power consumption of the chip? Intel gives us a good idea of what each C-state consumes with the Xeon X3400 series: in the thermal specifications and design guidelines [6] we find per-C-state power figures, of which the C3 (17W) and C6 (4W) values show up in our calculation below.

Intel does not give us a C1 power figure, but let's assume it is 25W on the L3426; our industry sources tell us this should be close enough. If the complex circuitry of the PCU were not available, a core could only fall back to the C1 state to save power while other cores are still working; the deeper C-states would only be reachable once all cores, and thus the system, were idle. We also assume that C0 consumes 45W, the L3426's rated TDP, which should not be far from the truth either.

Total power without PCU
= 45W * 19% (C0) + 25W * 81% (C1)
= 28.8W

Total power with PCU
= 45W * 19% (C0) + 25W * 5.7% (C1) + 17W * 10.3% (C3) + 4W * 65% (C6)
≈ 14.3W
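
The same weighted average in code, in case you want to plug in your own residency numbers or power assumptions (keep in mind that the 45W and 25W figures are our assumptions; only the C3 and C6 power comes from Intel's guidelines):

    /* Weighted average power over the C-states, with and without the PCU. */
    #include <stdio.h>

    int main(void)
    {
        /* per-state power in watts: C0, C1, C3, C6 */
        double watts[4]       = { 45.0, 25.0, 17.0, 4.0 };
        /* measured residency with the PCU doing its job */
        double with_pcu[4]    = { 0.19, 0.057, 0.103, 0.65 };
        /* without the PCU, C1 is the deepest state a busy chip can reach */
        double without_pcu[4] = { 0.19, 0.81, 0.0, 0.0 };

        double p_with = 0.0, p_without = 0.0;
        for (int i = 0; i < 4; i++) {
            p_with    += watts[i] * with_pcu[i];
            p_without += watts[i] * without_pcu[i];
        }
        printf("without PCU: %.1fW  with PCU: %.1fW\n", p_without, p_with);
        return 0;
    }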

The actual absolute numbers are not that important, but our simplified calculation shows that by forcing the CPU very quickly into C6, the PCU allows the "Lynnfield" Xeon to morph from a rather mediocre low power CPU into a "real" low power CPU. About 14W for four complex out-of-order cores is very impressive: less than 4W per core! Intel's claims are justified: the PCU lets the "Nehalem" based cores enter the deep sleep C6 state even while other cores are hard at work. To end with an interesting note: even with four threads active on the Xeon L3426, we found that the cores still spent 11% of their time in C6.

Comments

  • JohanAnandtech - Tuesday, January 19, 2010 - link

    Well, Oracle has a few downsides when it comes to this kind of testing. It is not very popular in the smaller and medium business AFAIK (our main target), and we still haven't worked out why it performs much worse on Linux than on Windows. So choosing Oracle is a sure way to make the project time explode... IMHO.
  • ChristopherRice - Thursday, January 21, 2010 - link

    Works worse on Linux than Windows? You likely have a setup issue with the kernel parameters or within Oracle itself. I actually don't know of any enterprise location that uses Oracle on Windows anymore; generally it's all RHEL4/RHEL5/Sun.
  • TeXWiller - Monday, January 18, 2010 - link

    The 34xx series supports four quad-rank modules, giving a current maximum of 32GB per CPU (and board). The 24GB limit is that of the three-channel controller with unbuffered memory modules.
  • pablo906 - Monday, January 18, 2010 - link

    I love Johan's articles. I think this has some implications for which virtualization solutions may be the most cost effective. When you're running at 75% capacity on every server, I think the AMD solution could become more attractive. I think I'm going to have to do some independent testing in my datacenter with this.

    I'd like to mention that focusing on VMware is a disservice to VT technology as a whole. It would be like not having benchmarked the K6-3+ just because P2s and Celerons were the mainstream and SS7 boards weren't quite up to par. There are situations, primarily virtualizing Linux, where Citrix XenServer is a better solution. Also, many people who are buying Server '08 licenses are getting Hyper-V licenses bundled in for "free."

    I've known several IT Directors in very large Health Care organizations who are deploying a mixed Hyper-V/XenServer environment because of the "integration" between the two. Many of the people I've talked to at events around the country are using this model for at least part of their virtualization deployments. I believe it would be important to publish to the industry what kind of performance you can expect from such deployments.

    You can do some really interesting homebrew SAN deployments with OpenFiler or OpeniSCSI that can compete with the performance of EMC, Clarion, NetApp, LeftHand, etc. I've found NFS deployments can bring you better performance and manageability. I would love to see some articles about the strengths and weaknesses of the storage subsystem used and how it affects each type of deployment. I would absolutely be willing to devote some datacenter time and experience to helping put something like this together.

    I think this article ties in really well with the virtualization talks, and I would love to see more comments on what you think this means for someone with a small, medium, or large datacenter.
  • maveric7911 - Tuesday, January 19, 2010 - link

    I'd personally prefer to see KVM over XenServer. Even Red Hat is ditching Xen for KVM. In the environments I work in, Xen is actually being decommissioned in favor of VMware.
  • JohanAnandtech - Tuesday, January 19, 2010 - link

    I can see the theoretical reasons why some people are excited about KVM, but I still don't see the practical ones. Who is using this in production? Getting Xen, VMware or Hyper-V to do their job is pretty easy; KVM does not even seem to be close to beta. It is hard to get working, and it is nowhere near Xen when it comes to reliability. Admittedly, those are our first impressions, but we are no virtualization rookies.

    Why do you prefer KVM?
  • VJ - Wednesday, January 20, 2010 - link

    "It is hard to get working, and it nowhere near to Xen when it comes to reliabilty. "

    I found Xen (separate kernel boot at the time) more difficult to work with than KVM (kernel module), so I'm thinking that the particular (host) platform you're using (Windows?) may be geared towards one platform.

    If you had to set it up yourself then that may explain reliability issues you've had?

    On Fedora linux, it shouldn't be more difficult than Xen.
  • Toadster - Monday, January 18, 2010 - link

    One of the new technologies released with Xeon 5500 (Nehalem) is Intel Intelligent Power Node Manager which controls P/T states within the server CPU. This is a good article on existing P/C states, but will you guys be doing a review of newer control technologies as well?

    http://communities.intel.com/community/openportit/...
  • JohanAnandtech - Tuesday, January 19, 2010 - link

    I don't think it is "newer". Going to C6 for idle cores is less than a year old, remember :-).

    It seems to be a sort of manager which monitors the electrical input (PDU based?) and then lowers the P-states to keep the power at a certain level. Did I miss something? (I only glanced at it quickly.)

    Personally, I think HP is onto something more by capping the power inside their server management software. But I still have to evaluate both. We will look into that.
  • n0nsense - Monday, January 18, 2010 - link

    Maybe I missed something in the article, but from what I see at home, the C2Q (and C2D) can manage frequencies per core.
    I'm not sure it is possible under Windows, but in Linux it just works this way. You can actually see each core running at its own frequency.
    Moreover, you can select for each core which frequency it should run at.
