AMD's Quad-Core Barcelona: Defending New Territory
Date: September 10th, 2007
Topic: IT Computing
Manufacturer: AMD
Author: Johan De Gelas

64-bit Linux HPC Performance: LINPACK

There is one kind of code where Core really ate the AMD CPUs for breakfast, and it was close to embarrassing: floating point intensive code that makes heavy use of vector SIMD, also called packed SSE (and SSE2/SSE3), runs up to twice as fast on a Xeon 5160 (3GHz) as on an Opteron 2222 (3GHz). This is also one of the reasons (though probably not the main one) why AMD was falling a bit behind in the gaming area.

AMD has gone to great lengths to improve the performance of 128-bit packed SSE instructions:
  • Instruction fetch has been doubled to 32 bytes
  • 128-bit SSE computations now decode into a single micro-op (two in K8)
  • The load unit can load two 128-bit numbers from the L1 cache each cycle
  • The FP reservation stations still have 36 entries, but they are now 128 bits wide instead of 64 bits
  • All three FPU execution units were widened to 128 bits (64 bits before)
  • The L2 cache has double the bandwidth to cope with this
Together with the excellent memory subsystem, Barcelona should be ready to take on the Intel Core architecture when it comes to pure SIMD/SSE power.
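
To make the micro-op change in the list above concrete, here is a minimal Python sketch of the bookkeeping. It is a deliberately simplified model, not a simulation of either core: it only counts how many packed instructions are needed to touch an array once, and multiplies that by the micro-ops per 128-bit instruction quoted above (two for K8, one for Barcelona). The function names and the array length of 1000 are illustrative assumptions of ours.

```python
import math

REGISTER_BITS = 128  # width of one SSE register


def packed_instructions(n_elements: int, element_bits: int) -> int:
    """Packed SSE instructions needed to process n_elements once."""
    lanes = REGISTER_BITS // element_bits  # 2 doubles or 4 floats per register
    return math.ceil(n_elements / lanes)


def micro_ops(n_elements: int, element_bits: int, uops_per_instruction: int) -> int:
    """Micro-ops issued, given how many micro-ops one 128-bit instruction decodes into."""
    return packed_instructions(n_elements, element_bits) * uops_per_instruction


if __name__ == "__main__":
    n = 1000  # hypothetical array of 64-bit doubles
    for name, uops in (("K8", 2), ("Barcelona", 1)):
        print(name, micro_ops(n, 64, uops))
```

Under this model the same packed loop issues half as many micro-ops on Barcelona as on K8. Loads, stores and scheduling are ignored on purpose, so this says nothing about the actual speedup.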

Meet LINPACK, a benchmark application based on the LINPACK TPP code, which has become the industry standard benchmark for HPC. It solves large systems of linear equations by using a high performance matrix kernel. We used Intel's version of LINPACK, which uses the highly optimized Intel Math Kernel Library. The Intel MKL is quite popular and in an Intel dominated world, AMD's CPUs have to be able to run Intel optimized code well.

We used a workload of square matrices of sizes 5000 to 30000 in steps of 5000, and we ran four threads (dual dual-core) or eight threads (dual quad-core). As the system was equipped with 8GB of RAM, even the largest matrices fit in memory. LINPACK results are expressed in GFLOPS (billions of floating point operations per second). We'll start with the quad-core scores (one quad-core or two dual-core CPUs).
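
As a rough sketch of how such a score is derived and why the largest matrix still fits in 8GB, the Python snippet below uses the conventional LINPACK operation count of about 2/3·n³ + 2·n² for solving a dense n×n system. The exact lower-order term differs between implementations, so treat the formula as an assumption rather than as what Intel's binary reports; the function names are ours, and the run time must come from an actual measurement.

```python
def linpack_flops(n: int) -> float:
    """Approximate floating point operations to solve a dense n x n system."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def gflops(n: int, seconds: float) -> float:
    """GFLOPS for a run of size n that took `seconds` (a measured value)."""
    return linpack_flops(n) / seconds / 1e9


def matrix_gigabytes(n: int, bytes_per_element: int = 8) -> float:
    """Memory for one n x n matrix of double precision values, in GB (10^9 bytes)."""
    return n * n * bytes_per_element / 1e9


if __name__ == "__main__":
    # Matrix sizes used in the workload: 5000 to 30000 in steps of 5000.
    for n in range(5000, 30001, 5000):
        print(n, matrix_gigabytes(n))
```

The largest case, a 30000×30000 matrix of doubles, needs 7.2GB for the matrix itself, which is why it only just fits in the 8GB system.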


Yes, this code is very Intel friendly but it does exist in the real world, and it is remarkably interesting. Look at what Barcelona is doing: it is outperforming a 60% higher clocked Opteron 2224 SE. That means that clock for clock, the third generation Opteron is no less than 142% faster. That is a massive improvement!

Thanks to meticulous tuning for Intel's cores, the Xeon is still winning the benchmark. A 17% higher clocked Xeon 5345 is about 25-26% faster than Barcelona, but the days where this kind of code resulted in embarrassing defeats for AMD are over. We are very curious how a LINPACK compiled with AMD's math kernel libraries and other compilers would do, but the late arrival didn't allow us to do much recompiling.
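
The clock-for-clock comparisons above are simple ratio arithmetic, sketched here in Python. Since the graph scores are not reproduced in the text, the sketch uses only the percentages quoted (60% and 142% for the Opteron 2224 SE comparison, 17% and 25-26% for the Xeon 5345); the function names are our own, and the derived numbers are implied by those percentages rather than measured separately.

```python
def per_clock_ratio(raw_speedup: float, clock_ratio: float) -> float:
    """Per-clock speed of A relative to B.

    raw_speedup: score_A / score_B
    clock_ratio: clock_A / clock_B
    """
    return raw_speedup / clock_ratio


def raw_speedup(per_clock: float, clock_ratio: float) -> float:
    """Inverse of per_clock_ratio: recover score_A / score_B."""
    return per_clock * clock_ratio


if __name__ == "__main__":
    # Barcelona vs. Opteron 2224 SE: 142% faster per clock, while the
    # 2224 SE is clocked 60% higher (so Barcelona's clock ratio is 1/1.6).
    print(raw_speedup(2.42, 1 / 1.6))

    # Xeon 5345 vs. Barcelona: 25-26% faster at a 17% higher clock.
    for raw in (1.25, 1.26):
        print(per_clock_ratio(raw, 1.17))
```

Read this way, the quoted figures imply that Barcelona leads the Opteron 2224 SE by roughly 50% in raw score, and that the Xeon's clock-for-clock lead over Barcelona shrinks to roughly 7-8%.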

Now let's take a look at the eight thread results. We kept the Xeon 5160 (four threads) in this graph, so you can easily compare the results with the previous graph.


Normally you would expect that this kind of code with huge matrices has to access the memory a lot, but masterly optimization together with hardware prefetching ensures most of the data is already in the cache. The quad-core Xeon wins again, but the victory is a bit smaller: the advantage is 20%-23%. Let us see if Intel can still keep the lead when we look at a benchmark which is very SSE intensive and which is optimized for Intel CPUs, but this time it's developed by a third party.


46 Comments - Last by tshen83, 781 days ago
Finally it's here.... by MDme, 803 days ago
Let the games begin!

RE: Finally it's here.... by MDme, 803 days ago
I think Barcelona will be a success in the server world. Its performance is around 20% faster than equivalently clocked Xeons, with the exception of certain programs like Fritz and the LINPACK Intel library, where it is around 5-10% slower. But since it scales better than the Xeon chips, it should negate that and increase its lead on others as cores/sockets increase. Add to that its power efficiency tweaks and aggressive pricing, and AMD will be able to hold off Intel in the server world.....maybe.

With 2.5Ghz Barceys coming up that would be equivalent to around 3-3+ Ghz xeons. So AMD was right that they need to get to 2.6 Ghz....AMD needs to ramp up clock to get the highest-end performance crown, but for now, their offering offers a nice balance of performance and power efficiency for the price.

Now time for the Phenom to get its act together.

RE: Finally it's here.... by wegra, 803 days ago
You should not forget the Penryn. The 2.5GHz Barcelona will face a 3.1+GHz Penryn. According to the results from this article, I expect the performance of a 2.5GHz Barcelona to land between that of a 2.8GHz and a 2.9GHz Penryn. So wait till (hopefully) next year to see AMD become the performance king. BTW, talking about multi-processor servers, I expect AMD will lead without much difficulty, thanks to the scalable architecture.

RE: Finally it's here.... by IntelUser2000, 803 days ago
AMD won't compete against Intel's Tulsa chips anymore. They will have to compete against Tigerton Xeon MP and the newly introduced Clarksboro chipset.

On the DP server platform, Intel will introduce Harpertown and Seaburg chipset. Seaburg chipset features 1600MHz bus with significantly improved memory controller performance. We'll see how it all turns out but as of now, Barcelona is a bit late to be competitive.

RE: Finally it's here.... by JackPack, 802 days ago
The problem is, 45nm Harpertown and 1600 MHz FSB will be rolling in soon.

Barcelona would have looked great 6 or 9 months ago. But today, it's a little weak unless they can raise the frequency fast.

RE: Finally it's here.... by Viditor, 802 days ago
quote:

45nm Harpertown and 1600 MHz FSB will be rolling in soon


True, but so will HT 3.0 and the newer mem controller for the Barcelonas...

RE: Finally it's here.... by jones377, 802 days ago
You got your work cut out for you now :)

RE: Finally it's here.... by TA152H, 802 days ago
The article should have mentioned the performance penalty Intel chips suffer from with regard to FB-DIMMs. While it's true they should be benchmarked in servers with this memory, it's also widely rumored that they are going to be offering choices in the near future. This memory has a really big impact on a lot of benchmarks, so when looking towards the future, or the desktop, it's important to keep in mind the importance of Intel using different memory. I don't think even Intel is stubborn enough to stick with this seriously slow and power hungry memory. Maybe as a choice it's fine, but it must be clear to them that offering something else as well as FB-DIMMs is very desirable in the server space. Then again, look at how long they stuck with Rambus.

RE: Finally it's here.... by JohanAnandtech, 802 days ago
well said. I don't think AMD will have that advantage for a long time in 2P space :-)

RE: Finally it's here.... by Viditor, 799 days ago
Are you going to be re-doing the review with the shipping version (stepping BA) anytime soon?
I'm most curious to see if the claims of a 5%+ improvement are true...

Copyright © 1997-2009 AnandTech, Inc. All rights reserved. Terms, Conditions and Privacy Information.