Is Sandy Bridge-EP an Upgrade Path?

At the beginning of this review, I referred back to Johan’s article on the behind-the-scenes benefits that Sandy Bridge-EP offers over Westmere-EP, and condensed them into a list of what a non-CS student in a scientific field might have to consider:

- The improved core and µop cache on Sandy Bridge-EP should boost IPC considerably for calculations that can take advantage of them, especially advanced trigonometric functions.
- The increase in L3 cache should reduce the number of trips out to main memory for values, although the improved memory bandwidth would also help in this regard.
- More cores are always welcome – Turbo 2.0 would help with pre-release code testing, which often occurs in debug / single thread mode.
- An increase in memory limits would help various simulation scenarios, as well as make it easier to run VMs of different environments.
- The move up to PCIe 3.0 helps any GPGPU simulation that requires lots of memory transfers back and forth across the bus (matrix solving), as long as the GPU supports PCIe 3.0 (K10, K20X, FirePro, not Xeon Phi which uses PCIe 2.0).
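
To put the PCIe point in rough numbers, here is a minimal sketch that compares the theoretical one-way bandwidth of an x16 link for PCIe 2.0 and 3.0 and estimates how long a bulk transfer would take. The signalling rates and encodings come from the PCIe specifications; the 4 GB matrix size and the function names are assumptions for illustration, and real transfers will fall short of these peak figures.

```python
# Theoretical per-direction PCIe bandwidth; measured throughput is lower.
# Each entry: (transfer rate in GT/s, payload bits, total bits after encoding)
PCIE_GENERATIONS = {
    "2.0": (5.0, 8, 10),     # 8b/10b encoding
    "3.0": (8.0, 128, 130),  # 128b/130b encoding
}


def link_bandwidth_gb_s(generation, lanes=16):
    """Peak one-way bandwidth in GB/s (1 GB = 1e9 bytes)."""
    rate_gt_s, payload_bits, total_bits = PCIE_GENERATIONS[generation]
    per_lane = rate_gt_s * payload_bits / total_bits / 8  # bits -> bytes
    return per_lane * lanes


def transfer_time_s(size_gb, generation, lanes=16):
    """Best-case time in seconds to move size_gb across the link once."""
    return size_gb / link_bandwidth_gb_s(generation, lanes)


if __name__ == "__main__":
    matrix_gb = 4.0  # hypothetical matrix size, not taken from the review
    for gen in ("2.0", "3.0"):
        print(f"PCIe {gen} x16: {link_bandwidth_gb_s(gen):.2f} GB/s, "
              f"{transfer_time_s(matrix_gb, gen):.3f} s for {matrix_gb} GB")
```

If the simulation moves data across the bus every iteration, this best-case gap of roughly 2x is the most PCIe 3.0 can buy; if transfers are rare, the benefit shrinks accordingly.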

Every scenario that an individual faces, either in the office, the laboratory, or generic work place #147 is going to be different – perhaps only slightly, but different nonetheless.  We have to weigh up the pros and cons of the specific workload and make relative suggestions. 

For the most part, any simulation which has large parts that can be computed in parallel should be looking at GPUs, unless the threads are ‘dense’ (require lots of memory registers for the serial calculation) or are already optimized for SSE4/AVX.  Double precision can also be a hurdle to GPU computing, but the NVIDIA GTX Titan makes the cost a lot more palatable on research grants.  Lots of researchers will be dealing with Fortran code tens of thousands of lines long and 20 years old, meaning that porting to GPUs is not a realistic option (unless you encourage the research supervisor to apply for a 3 year grant to convert the code).  In these cases, make a note of how much memory the simulation needs – if it is sub 2.5 MB, then load up on as many cores as you can get, as you will still be in L3 cache on the 20MB L3 processors.  For more than that, you will be dealing with memory accesses out to main memory, and unless you are comfortable dealing with NUMA-based code and tools (which your Fortran probably is not geared for), then a single fast processor is probably the best bet.  MPI-based Fortran is where dual processor systems would be best, as they would be for simulations that require more memory than a single processor can be equipped with.
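
The rule of thumb for legacy CPU-bound code can be written down as a small decision helper. This is only a sketch of the reasoning in the paragraph above: the 2.5 MB threshold is taken from the text, while the function name, parameters and return strings are hypothetical and chosen for illustration.

```python
# Coarse hardware suggestion for legacy (e.g. Fortran) code that will not be
# ported to a GPU. The labels are illustrative, not measured cut-offs.
IN_CACHE_LIMIT_MB = 2.5  # threshold quoted in the text for the 20 MB L3 parts


def suggest_platform(working_set_mb, uses_mpi=False, fits_single_socket_memory=True):
    """Map a simulation's memory footprint and structure to a platform type."""
    if uses_mpi or not fits_single_socket_memory:
        # MPI codes, or jobs too large for one socket's memory, suit dual CPUs.
        return "dual processor system"
    if working_set_mb < IN_CACHE_LIMIT_MB:
        # Data stays in L3, so extra cores are not starved by main memory.
        return "as many cores as possible"
    # Main-memory bound and not NUMA-aware: avoid crossing sockets.
    return "single fast processor"


if __name__ == "__main__":
    print(suggest_platform(1.0))
    print(suggest_platform(500.0))
    print(suggest_platform(500.0, uses_mpi=True))
```

The order of the checks matters: MPI use or a memory requirement beyond one socket decides the question before the cache-size test is consulted.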

In terms of Westmere-EP vs. Sandy Bridge-EP for our benchmark suite, the relative numbers are:

Dual E5-2690 vs. Dual X5690
Price: +25% (before tax and additional seller markup)

| Benchmark | HT On | HT Off | Recommended Setup |
|---|---|---|---|
| 2D Explicit FD | +12.7% | +7.3% | GPU or Single Multicore CPU w/High Speed Memory |
| 3D Explicit FD | +7.7% | -10.3% | GPU or Single Multicore CPU w/High Speed Memory |
| 2D Implicit | +25.6% | +9.9% | Single CPU, High Mem Bandwidth |
| Brownian Motion, Single Thread | +2.4% | +2.8% | High Single CPU Speed |
| Brownian Motion, Multi Thread | +31.8% | +23.4% | GPU |
| n-Body | +29.0% | +47.7% | GPU |
| WinRar | +27.4% | +3.4% | High Mem Bandwidth |
| FastStone | +6.5% | +3.2% | High Single CPU Speed |
| Xilisoft Video | +14.3% | +24.4% | GPU or Multi-CPU |
| x264 Pass 1 | -9.0% | +3.4% | Single CPU |
| x264 Pass 2 | +27% | +24.3% | Multi-CPU |

While we do not get a price-equivalent speedup across the board, certain scenarios (Xilisoft, x264 Pass 2) benefit greatly from a dual processor Sandy Bridge-EP system over either Westmere-EP or a GPU.  When a GPU is not available, the multithreaded Brownian Motion benchmark scales very well with more cores.  A limiting factor in many of these benchmarks is memory speed – if you do not need a Xeon, then the latest Intel/AMD processors can handle 2133+ MHz memory, which provides a tangible boost in finite difference simulation and WinRar.
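
One way to read "price equivalent" is to ask which benchmarks gain at least as much performance as the +25% the dual E5-2690 setup costs over the dual X5690. The sketch below does that with the percentages from the results table; the variable and function names are hypothetical, and the simple threshold ignores the whole-system cost.

```python
PRICE_INCREASE_PCT = 25.0  # dual E5-2690 over dual X5690, from the table

# (HT on, HT off) percentage gains, copied from the results table
GAINS = {
    "2D Explicit FD": (12.7, 7.3),
    "3D Explicit FD": (7.7, -10.3),
    "2D Implicit": (25.6, 9.9),
    "Brownian Motion (single thread)": (2.4, 2.8),
    "Brownian Motion (multi thread)": (31.8, 23.4),
    "n-Body": (29.0, 47.7),
    "WinRar": (27.4, 3.4),
    "FastStone": (6.5, 3.2),
    "Xilisoft Video": (14.3, 24.4),
    "x264 Pass 1": (-9.0, 3.4),
    "x264 Pass 2": (27.0, 24.3),
}


def meets_price(gain_pct, price_pct=PRICE_INCREASE_PCT):
    """True if the speedup is at least as large as the price increase."""
    return gain_pct >= price_pct


def perf_per_price(gain_pct, price_pct=PRICE_INCREASE_PCT):
    """Relative performance per unit cost; 1.0 means break-even."""
    return (1 + gain_pct / 100) / (1 + price_pct / 100)


ht_on_winners = [name for name, (on, _) in GAINS.items() if meets_price(on)]
ht_off_winners = [name for name, (_, off) in GAINS.items() if meets_price(off)]

if __name__ == "__main__":
    print("HT on :", ht_on_winners)
    print("HT off:", ht_off_winners)
```

By this strict measure several results that look strong, such as Xilisoft with HT off at +24.4%, land just under break-even, which is why the surrounding cost/benefit discussion matters more than any single threshold.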

If we come back to the original question, ‘Is moving from Westmere-EP to Sandy Bridge-EP a reasonable upgrade?’, in the majority of our scenarios it probably is not – either other alternatives exist that perform better (single CPU, GPU, memory bandwidth) or the price difference is not worth the jump.  Remember that most scenarios will have to absorb the whole cost of a new system, rather than just the cost of an upgrade, and factoring that into the cost/benefit analysis is a major part of the equation.  Bear in mind, however, that none of our scenarios need more than 96 GB of memory, PCIe 3.0, or VMs for different environments, nor do they use advanced processor instruction sets, any of which could be vital to your work.

Ivy Bridge-EP is slated for the end of the year, meaning that those on Westmere-EP may want to wait and see what comes out from Intel.  If you need a DP system now, then Sandy Bridge-EP is an obvious choice if you want to go down the Intel route, though NUMA-related code may benefit more from a quad-processor AMD system.  If we get one in for another comparison point, we will let you know.

As a final note, thanks go to the Gigabyte server team for loaning us the CPUs and motherboard that made this testing possible.


44 Comments

  • Kevin G - Monday, March 4, 2013

    Ivy Bridge-E is a drop in replacement so that investment into RAM, storage, motherboard, chassis would be identical to today. The transition between Sandy Bridge-E and Ivy Bridge-E will mirror the transition between Nehalem and Westmere: socket compatible drop-in replacements in most cases.
  • colonelpepper - Monday, March 4, 2013

    yeah, what i was thinking might be a decent route to take is to build out a workstation with 2 of the lower end more moderately priced Xeon 2600's... save the big $$ for the new chips.
  • Shadowmage - Monday, March 4, 2013

    Your current suite of benchmarks is extremely limited for you to be able to call this a review for "scientists". For example, I'm interested in how these processors perform in Xilinx XST/MAP/PAR and simulation (e.g. Gem5) benchmarks.
  • IanCutress - Tuesday, March 5, 2013

    Of course - any review aimed at scientists is going to be extremely limited. Forgive me when I can only represent where I have come from - I haven't done research in every field.

    Ian
  • Simen1 - Tuesday, March 5, 2013

    Wouldnt it be fair to compare the Dual Xeon systems to a similar priced dual Opteron system?
  • Simen1 - Tuesday, March 5, 2013

    And the mentioning of the 3 year old Opteron 6100 and 1,5 year old 6100 on the first page is irellevant now in 2013. Todays models are in the 6300 series.
  • IanCutress - Thursday, March 7, 2013

    If we get a dual Opteron 6300 system in, we will compare.
  • plext0r - Tuesday, March 5, 2013

    Would have been nice to throw in some bigadv work units from the Folding@Home project to see how the systems compare.
  • Michael REMY - Wednesday, March 6, 2013

    hi !

    i really thought it is unfair and un-objectif to not include one of the E3-1290V2 or xeon E5-1620 in your test. Why (the hell) the i7-3770 do in you "profesional server" comparaison test ?

    E3-1290V2 and E5-1620 are the higher clock and newer xeon ! you should put them in the race !

    best regard
  • IanCutress - Thursday, March 7, 2013

    It's all about the equipment we have to hand. We don't have every CPU ever created. Plus, putting in consumer CPUs lets everyone know the playing field.

    Ian
