Software Rendering: zVisuel (32-bit Windows)

This benchmark is the zVisuel Kribi 3D test, which is exclusive to AnandTech.com and simulates the assembly of a mechanical watch. The complete model is very detailed, with around 300,000 polygons and many texture, bump, and reflection maps. We render more than 1000 frames and report the average FPS (frames per second). All of this is rendered by "Kribi 3D", an ultra-powerful real-time software rendering 3D engine. It runs at reasonable speeds because the newest AMD and Intel architectures have four cores, each of which can perform up to eight 32-bit FP operations per clock cycle. The zVisuel developers told us that, in practice, the current Core architecture can sustain six FP operations per clock cycle in well-optimized loops. Profiling on the Barcelona architecture is not yet complete, so we did our best with CodeAnalyst 2.74 for Windows. So far we have profiled only the non-AA benchmark.
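
To put those per-clock figures in perspective, here is a minimal sketch of the arithmetic in Python. The 3.0GHz clock is an assumption picked to match the 3.0GHz Xeons tested on this page, and the function name is ours; the result is a theoretical ceiling, not a measurement.

```python
def peak_gflops(cores: int, flops_per_clock: int, clock_ghz: float) -> float:
    """Theoretical 32-bit FP throughput of one CPU, in GFLOPS."""
    return cores * flops_per_clock * clock_ghz

CLOCK_GHZ = 3.0  # assumption: a 3.0GHz quad-core part

peak = peak_gflops(4, 8, CLOCK_GHZ)       # eight FP operations per clock per core
sustained = peak_gflops(4, 6, CLOCK_GHZ)  # six per clock, as quoted by zVisuel for Core

print(f"peak: {peak:.0f} GFLOPS, sustained: {sustained:.0f} GFLOPS "
      f"({sustained / peak:.0%} of peak)")
```

Under these assumptions a single quad-core CPU tops out at 96 GFLOPS, and six sustained operations per clock correspond to 75% of that peak.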

zVisuel Kribi 3D Profiling

Profile                                   Total
Average IPC (on Opteron 2350)             1
Instruction mix
Floating Point                            31%
SSE                                       35%
Branches                                  6%
L1 data cache ratio                       0.63
L1 instruction ratio                      0.22
Performance indicators on Opteron 2350
Branch misprediction                      8%
L1 data cache miss                        1%
L1 instruction cache miss                 1%
L2 cache miss                             0%

This is a very different engine from the scanline renderer of 3ds Max. SSE instructions play a dominant role, so the zVisuel Kribi 3D benchmark shows how the different CPUs perform on well-optimized SSE applications. The application appears to run almost entirely from the L2 cache (the profile shows an L2 cache miss rate of 0%), which seems to be the result of well-tuned, predictable memory access. We noticed that hardware prefetching and the new Seaburg chipset help this benchmark a lot:

zVisuel Intel Platform Performance Comparison

CPU                                HW Prefetch on (FPS)   HW Prefetch disabled (FPS)   Difference
Dual Xeon E5365 3.0 (Blackford)    99.9                   87.7                         14%
Dual Xeon E5365 3.0 (Seaburg)      110                    104.2                        6%
Dual Xeon E5472 3.0 (Seaburg)      124.8                  110                          13%
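
The Difference column in the comparison above can be reproduced with a few lines of Python. This is a sketch that assumes the percentage is the gain of the prefetch-on result over the prefetch-disabled result; the helper name `gain_percent` is ours. The same helper also gives the chipset effect for the E5365 with prefetching on.

```python
# (CPU, chipset) -> (FPS with HW prefetch on, FPS with HW prefetch disabled)
results = {
    ("Dual Xeon E5365 3.0", "Blackford"): (99.9, 87.7),
    ("Dual Xeon E5365 3.0", "Seaburg"): (110.0, 104.2),
    ("Dual Xeon E5472 3.0", "Seaburg"): (124.8, 110.0),
}

def gain_percent(new: float, old: float) -> float:
    """Relative gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0

for (cpu, chipset), (prefetch_on, prefetch_off) in results.items():
    gain = gain_percent(prefetch_on, prefetch_off)
    print(f"{cpu} ({chipset}): prefetch gain {gain:.0f}%")

# Chipset effect: same CPU, prefetch enabled on both platforms.
blackford_fps = results[("Dual Xeon E5365 3.0", "Blackford")][0]
seaburg_fps = results[("Dual Xeon E5365 3.0", "Seaburg")][0]
print(f"Seaburg over Blackford: {gain_percent(seaburg_fps, blackford_fps):.0f}%")
```

With prefetching enabled, moving the same Xeon E5365 from Blackford to Seaburg is worth about 10%, while disabling prefetching costs the Blackford platform noticeably more than it costs Seaburg with the same CPU.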

Let us see all the results.


[Chart: zVisuel, Watch Assembly (no AA)]

[Chart: zVisuel, Watch Assembly (high quality AA)]

Although we haven't done a detailed analysis, we can assume that the "Super Shuffle Engine" and "Radix-16" divider that Intel has implemented in the Xeon E5472 are paying off here. The AMD Opteron 2360 SE at 2.5GHz can overtake the best 65nm Xeon, but the new Xeon has a tangible lead. A silver lining to the cloud hanging over AMD is that the Opteron 23xx series scales perfectly with clock speed: compare the 2GHz and 2.5GHz results. Still, Intel has the advantage when it comes to SSE processing.
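
One way to quantify "scales perfectly with clock speed" is to compare the relative FPS gain with the relative clock gain. The sketch below is only an illustration: the function name is ours, and the FPS inputs are placeholder values, not results from this review. Perfect scaling from 2GHz to 2.5GHz would mean 25% more FPS, which gives an efficiency of 1.0.

```python
def scaling_efficiency(fps_low: float, fps_high: float,
                       clock_low_ghz: float, clock_high_ghz: float) -> float:
    """Share of the clock-speed increase that shows up as extra FPS (1.0 = perfect)."""
    clock_gain = clock_high_ghz / clock_low_ghz - 1.0
    fps_gain = fps_high / fps_low - 1.0
    return fps_gain / clock_gain

# Placeholder FPS values, NOT measurements from this review.
perfect = scaling_efficiency(fps_low=80.0, fps_high=100.0,
                             clock_low_ghz=2.0, clock_high_ghz=2.5)
partial = scaling_efficiency(fps_low=80.0, fps_high=90.0,
                             clock_low_ghz=2.0, clock_high_ghz=2.5)
print(f"perfect case: {perfect:.2f}, bottlenecked case: {partial:.2f}")
```

A value well below 1.0 would suggest that something other than core clock, such as the memory subsystem, is holding the benchmark back.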

The results with AA show that the memory subsystem of the Xeon 53xx is a major bottleneck, although the new Seaburg chipset has made this bottleneck a bit smaller. The result is a crushing victory for the latest Intel architecture. Enough FP testing; let us see what Barcelona can do when running typical integer server workloads.


43 Comments


  • tshen83 - Tuesday, November 27, 2007

    Seriously, can you buy the 2360SE? Newegg doesn't even stock the 1.7GHz 2344HEs.

    The same situation exists on the Phenom line of CPUs. I don't see the value of reviewing Phenom 9700s and 9900s when AMD cannot deliver them. I am having trouble locating Phenom 9500s.
  • alantay - Tuesday, November 27, 2007

    The MySQL scalability problem is not so much in MySQL as in the Linux kernel and Glibc used.

    To have it scale correctly to 8 CPUs you need kernel 2.6.22.x (alternatively you could try a 2.6.24-RC, which should be a bit faster, but not 2.6.23.x) and Glibc 2.6 or higher.

    A default Ubuntu 7.10 for example should scale well with MySQL (OpenSUSE 10.3 *might* work, but they have backported the 2.6.23 scheduler which has a scalability problem).

    Thanks for the article!
  • JohanAnandtech - Tuesday, November 27, 2007

    Excellent feedback.

    It is a bit frustrating that once again you need some ultra new kernel and libraries to get good scalability. That is unrealistic for people who use SLES and who rely on their support contract to get updates.
  • MGSsancho - Wednesday, November 28, 2007

    how about opensolaris? i dont know how much different it is from solaris 10, but it should be able to scale to dozens of cores nicely. I was about to ask about oracle and DB2 benchmarks but you answered that in your article; expensive, and the oems usually publish that info.

    anyways awesome article
  • Roy2001 - Tuesday, November 27, 2007

    I cannot find a SINGLE one, nowhere.
  • drebo - Tuesday, November 27, 2007

    Newegg has the Phenom 9500 in stock. At least, they did yesterday. I've also got a vendor I use that has them in stock.
  • JarredWalton - Tuesday, November 27, 2007

    But Phenom isn't Opteron 23xx. Different socket, different market, and it has L3. (Does Phenom X4 have an L3 cache? Maybe I should go check....)
  • drebo - Wednesday, November 28, 2007

    Yes, Phenom 9500 has an L3. But if you look at his question (in the subject line), he is asking about barcelona as a whole and phenom specifically. The answer is Yes, they are available.
  • Slaimus - Tuesday, November 27, 2007

    They may be gobbled up by Cray for that Budapest supercomputer.
  • Regs - Tuesday, November 27, 2007

    I would not expect any from vendors and wholesalers until early next year.

    Matter of fact I wouldn't want one until then anyhow. I would at least wait until B3 stepping.
