vApus Mark I: Performance-Critical Applications Virtualized

If you virtualized your datacenter a while ago, chances are that the light loads are already virtualized. What is next? If you have been following the virtualization scene, you’ll know that the virtualization vendors are actively promoting the idea that you should virtualize your performance-critical applications too. vSphere 4 allows you to use up to 8 vCPUs and up to 255 GB of RAM; XenServer allows 8 vCPUs and 32 GB of RAM. Hyper-V is still lagging with only 4 vCPUs and a maximum of 16 CPUs (24 with the “Dunnington” hotfix) per host, but that will change in Hyper-V R2. The bottom line is that it is getting attractive to virtualize “heavy duty” applications too, if only to be able to migrate them (“VMotion”, “XenMotion”, “Live Migration”) or manage them more easily.

That is where vApus Mark I comes in: one OLAP database, one OLTP database and two heavy websites are combined in one tile. These are the kind of demanding applications that still got their own dedicated, natively running machine a year ago. vApus Mark I shows what happens if you virtualize them. If you want to fully understand our benchmark methodology: vApus Mark I has been described in great detail here. We have changed only one thing compared to our previous benchmarking: we used large pages, as this is generally considered a best practice (with RVI and EPT). This increases performance by 4 to 5%.
Our other choices remain the same:
  • RVI and EPT are enabled on all VMs if possible
  • HT-Assist is off, unless indicated otherwise

vApus Mark I uses four VMs with four server applications:

- A SQL Server 2008 x64 OLAP database running on Windows 2008 64-bit, stress tested by our in-house developed vApus software.
- Two heavy-duty MCS eFMS portals running PHP on IIS on Windows 2003 R2, stress tested by our in-house developed vApus software.
- One OLTP database, based on the Oracle 10g Calling Circle benchmark of Dominic Giles.

The beauty is that vApus (stress testing software developed by the Sizing Servers Lab) uses actions made by real people (as can be seen in logs) to stress test the VMs, not some benchmarking algorithm. First we look at the results on ESX 3.5 Update 4, at the moment the most popular hypervisor.

Sizing Servers vAPUS Mark I  - ESX 3.5

If you just plug Istanbul into your virtualized server, you can't tell whether you're running on a six-core or a quad-core. You might remember from our previous article that a 2.9 GHz Opteron 2389 scored 203. It is pretty disappointing that six cores at 2.6 GHz only equal four cores at 2.9 GHz. What went wrong? By default, the VMware ESX 3.5 scheduler logically partitions the available cores into groups of four, called “cells”. The objective is to always schedule a VM on the same cell, thereby making sure that the VM stays on the same node and socket. This should ensure that the VM always uses local memory (instead of needing remote memory of another node) and, more importantly, that the caches stay “warm”. With the default cell size of 4 cores on a six-core CPU, however, one or more VMs will be split among two sockets, with lots of traffic going back and forth. Once we increase the cell size from 4 to 6 (see VMware’s knowledge base), the ugly duckling becomes a swan. The six-core Opteron keeps up with the best Xeons available!
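To make the cell problem concrete, here is a minimal sketch of the partitioning described above. It is a simplified model, not VMware's actual scheduler: it assumes cores are numbered consecutively socket by socket and that cells are simply consecutive groups of cores. The function names `build_cells` and `cells_spanning_sockets` are made up for this illustration.

```python
def build_cells(sockets, cores_per_socket, cell_size):
    """Split consecutively numbered cores into fixed-size cells.

    Assumption: cores are enumerated socket by socket, and a cell is just
    the next `cell_size` cores in that order. Each core is represented as
    a (socket, core) tuple.
    """
    cores = [(s, c) for s in range(sockets) for c in range(cores_per_socket)]
    return [cores[i:i + cell_size] for i in range(0, len(cores), cell_size)]


def cells_spanning_sockets(cells):
    """Return the cells whose cores live on more than one socket."""
    return [cell for cell in cells if len({socket for socket, _ in cell}) > 1]


if __name__ == "__main__":
    # Dual six-core "Istanbul": compare the default cell size with the tuned one.
    for cell_size in (4, 6):
        cells = build_cells(sockets=2, cores_per_socket=6, cell_size=cell_size)
        split = cells_spanning_sockets(cells)
        print(f"cell size {cell_size}: {len(cells)} cells, "
              f"{len(split)} spanning two sockets")
```

Under these assumptions, a cell size of 4 on a dual six-core machine leaves one cell straddling both sockets, while a cell size of 6 maps each cell onto exactly one socket. On a dual quad-core machine the default of 4 already lines up with the sockets, which is why the issue only shows up with the six-core part.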

The Xeon X55xx is however somewhat crippled in this case, as ESX 3.5 Update 4 does not support EPT and does not make optimal use of HyperThreading. You can see from our measurements above that HyperThreading improves the score by about 17%. According to our OEM sources, VMmark improves by up to 30% on ESX 4.0. This suggests that ESX 4.0 makes better use of HyperThreading. So let us see some ESX 4.0 numbers!

Reference 175.3




Server System Based On        OLAP VM   Webportal VM2   Webportal VM3   OLTP VM
Dual Xeon X5570 2.93 GHz      103%      50%             51%             95%
Dual Opteron 2435 2.6 GHz     91%       43%             43%             90%
Dual Opteron 2377 2.3 GHz     82%       36%             35%             53%

Sizing Servers vAPUS Mark I  - ESX 4.0

The Nehalem-based Xeon moves forward, but does not make a huge jump. Performance of the six-core Opteron decreased by 2%, which is within the error margin of this benchmark. It is still an excellent result for the latest Opteron: it means it will have no trouble competing with the 2.66 GHz Xeon X5550. VMmark tells us that the latest Xeon “Nehalem” starts to shine when you dump huge numbers of VMs on top of the server, so we decided to test with 8 VMs. It is very unlikely that you will consolidate more than 10 performance-critical applications on top of one physical server, so we feel that 8 VMs should tell the whole story. We changed only one thing: we decreased the amount of memory given to the webportals from 4 to 2 GB, to make sure that the benchmark fits within the maximum of 24 GB that we had on the Xeon X5570. To keep things readable, we have averaged each pair of identical VMs (so OLAP VM = (OLAP VM1 + OLAP VM5)/2).

Reference 175.3




Server System Based On        OLAP VM   Webportal VM2   Webportal VM3   OLTP VM
Dual Xeon X5570 2.93 GHz      79%       34%             32%             47%
Dual Opteron 2435 2.6 GHz     71%       23%             23%             38%
Dual Opteron 2377 2.3 GHz     76%       19%             19%             28%

vAPUS Mark I 2 tile test  - ESX 4.0

Notice that HT-assist is a performance killer in 2P configurations: you give up two times 1 MB of L3 cache, which is a bad idea with 8 VMs hitting your two CPUs. It is interesting to see that the Xeon X5570 starts to break away as we increase the number of VMs: the dual Xeon X5570 is about 30% faster than the dual Opteron 2435. It gives us a clue why the VMmark scores are so extreme: the huge number of VMs might overemphasize world switch times, for example. But even with light loads, it is very rare to find more than 20 VMs on top of a DP server.
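The text does not spell out here how the per-VM percentages are combined into a single comparison. As a rough sanity check, the sketch below assumes a geometric mean over the four per-VM percentages of the 2-tile table (an assumption for illustration, not necessarily the exact vApus Mark I formula) and compares the two fastest systems. The helper names are hypothetical.

```python
from math import prod

# Per-VM percentages from the 2-tile ESX 4.0 table, in the order
# OLAP VM, Webportal VM2, Webportal VM3, OLTP VM.
TWO_TILE = {
    "Dual Xeon X5570 2.93": [79, 34, 32, 47],
    "Dual Opteron 2435 2.6": [71, 23, 23, 38],
}


def geometric_mean(values):
    """Geometric mean of a list of positive numbers."""
    return prod(values) ** (1.0 / len(values))


def relative_advantage(a, b):
    """How much higher the geometric mean of a is than that of b (0.25 = 25%)."""
    return geometric_mean(a) / geometric_mean(b) - 1.0


if __name__ == "__main__":
    advantage = relative_advantage(
        TWO_TILE["Dual Xeon X5570 2.93"],
        TWO_TILE["Dual Opteron 2435 2.6"],
    )
    print(f"Xeon X5570 advantage over Opteron 2435: {advantage:.1%}")
```

With the geometric mean assumption, the result lands close to the "about 30%" quoted above. A geometric mean is used in this sketch because it keeps one VM with a large percentage from dominating the summary.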

There is more. In the 2-tile test on the Xeon, the ESX scheduler has to divide 16 logical CPUs among 32 vCPUs. That is a lot easier than dividing 12 physical CPUs among 32 vCPUs, which might create coscheduling issues on the six-core Opteron.

So our 2-tile test was somewhat “biased” towards the Xeon X5570.
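The arithmetic behind this bias can be sketched as a simple vCPU-to-CPU ratio. The host descriptions below follow the text (16 logical CPUs on the dual Xeon with HyperThreading, 12 physical cores on the dual Opteron); the function name `vcpus_per_cpu` is made up for this illustration, and the ratio is only a crude proxy for scheduling pressure.

```python
def vcpus_per_cpu(total_vcpus, schedulable_cpus):
    """Average number of vCPUs competing for each CPU the scheduler sees."""
    return total_vcpus / schedulable_cpus


# CPUs visible to the scheduler, as described in the text.
HOSTS = {
    "Dual Xeon X5570 (2 x 4 cores, HyperThreading)": 2 * 4 * 2,  # 16 logical CPUs
    "Dual Opteron 2435 (2 x 6 cores)": 2 * 6,                    # 12 physical CPUs
}

if __name__ == "__main__":
    total_vcpus = 32  # 8 VMs with 4 vCPUs each in the 2-tile test
    for host, cpus in HOSTS.items():
        ratio = vcpus_per_cpu(total_vcpus, cpus)
        print(f"{host}: {ratio:.2f} vCPUs per CPU")
```

On the Xeon, 32 vCPUs map evenly onto 16 logical CPUs, so two 4-vCPU VMs fit side by side on each socket's 8 logical CPUs. On the Opteron the ratio is not a whole number, and a 4-vCPU VM does not divide evenly into a six-core socket.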

We reduced the number of vCPUs on the webportal VMs from 4 to 2. That means that we have:

- Two times 4 vCPUs for the OLAP test
- Two times 4 vCPUs for the OLTP test
- Four times 2 vCPUs for the webportal test

Or a total of 24 vCPUs. This test is thus biased towards the “Istanbul” processor. Remember that our reference score was based on a 4 CPU “native” score, so we adjusted the reference score of the webportals to one that was obtained with 2 native CPUs. The reference scores for the OLTP and OLAP tests remained unchanged. The results below are not comparable with the ones you have seen so far; this is an experiment to understand our scores better. To keep things readable, we have averaged each pair of identical VMs (so OLAP VM = (OLAP VM1 + OLAP VM5)/2).
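The vCPU totals can be tallied from the tile definition. The sketch below assumes the tile layout described earlier (one OLAP VM, two webportal VMs and one OLTP VM per tile, two tiles in total); the names `ORIGINAL_TILE`, `REDUCED_TILE` and `total_vcpus` are made up for this illustration.

```python
# vCPUs per VM in one tile: one OLAP VM, two webportal VMs, one OLTP VM.
ORIGINAL_TILE = [("OLAP", 4), ("Webportal", 4), ("Webportal", 4), ("OLTP", 4)]

# Same tile with the webportal VMs cut from 4 to 2 vCPUs.
REDUCED_TILE = [("OLAP", 4), ("Webportal", 2), ("Webportal", 2), ("OLTP", 4)]


def total_vcpus(tile, tiles=2):
    """Total vCPUs when the given tile is instantiated `tiles` times."""
    return tiles * sum(vcpus for _, vcpus in tile)


if __name__ == "__main__":
    print("original 2-tile test:", total_vcpus(ORIGINAL_TILE), "vCPUs")
    print("reduced 2-tile test: ", total_vcpus(REDUCED_TILE), "vCPUs")
```

The reduced configuration works out to 24 vCPUs, which is exactly two vCPUs per physical core on the dual six-core Opteron. That is why this variant leans towards “Istanbul” rather than towards the Xeon.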

Reference 175.3




Server System Based On        OLAP VM   Webportal VM2   Webportal VM3   OLTP VM
Dual Xeon X5570 2.93 GHz      82%       53%             53%             43%
Dual Opteron 2435 2.6 GHz     81%       38%             38%             44%


vAPUS Mark I 2 tile test - 24 vCPUs - ESX 4.0

The result is that the Xeon Nehalem is once again only 11% faster. So remember that the relation between the number of vCPUs and the cell size matters a great deal when you are dealing with MP virtual machines. We expect that the number of VMs with more than one vCPU will increase as time goes by.
