Last Friday, AMD gave a good answer to the approaching Intel Xeon Nehalem EP thunderstorm. AMD demonstrated an up-and-running dual- and quad-socket six-core Istanbul system to a handful of journalists (Charley and Scott). Istanbul, which should be ready in the autumn of this year, is basically a six-core version of the current AMD Opteron "Shanghai". While we could not attend the Istanbul demo, we had a long phone conversation with the AMD people. A few interesting points came up during that conversation, and we would love to share them with you.

AMD seems to recognize that the best Nehalem EP will be between 40% and 100% faster than its flagship CPU, but claims that far more benchmarks will land near the 40% mark than near the 100% mark. AMD believes, however, that Intel will only be able to steal back the "performance is everything" HPC market, as AMD will counter Nehalem by launching an energy-efficient version of the current Shanghai CPU. AMD firmly believes that the 95W Nehalem EPs (2.66 to 2.93 GHz) will not be very attractive to many datacenters, and points out that even the low-power versions of Nehalem (up to 2.26 GHz) need 60W. We will see whether AMD can offer higher clock speeds with lower energy consumption.

It is interesting to hear that AMD is firmly targeting the low-power market. According to AMD, many customers are already putting "power caps" (a BIOS feature) on their CPUs to prevent the server from exceeding a certain power consumption level. This means that the CPU stays in the lower P-states and is never able to run at its full clock speed. Notably, many of the customers that use this feature do not buy low-power CPUs.
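For readers who want to see what such a cap boils down to in practice, here is a minimal sketch, not AMD's BIOS mechanism but a rough software analogue: it restricts a core to a lower P-state through the Linux cpufreq sysfs interface. It assumes a cpufreq driver that exposes scaling_available_frequencies (e.g. acpi-cpufreq or powernow-k8) and root privileges; the actual frequency steps on a real Opteron or Xeon will of course differ.

# Minimal sketch: cap cpu0 to a low P-state via the Linux cpufreq sysfs
# interface, roughly mimicking what a BIOS-level power cap enforces.
# Assumes a driver that lists scaling_available_frequencies and root rights.

CPUFREQ = "/sys/devices/system/cpu/cpu0/cpufreq"

def read(name):
    with open("%s/%s" % (CPUFREQ, name)) as f:
        return f.read().strip()

def cap_to_second_lowest():
    # Frequencies are listed in kHz; sort them so index 1 is the
    # second-lowest P-state (fall back to the lowest if only one exists).
    freqs = sorted(int(f) for f in read("scaling_available_frequencies").split())
    cap = freqs[1] if len(freqs) > 1 else freqs[0]
    with open("%s/scaling_max_freq" % CPUFREQ, "w") as f:
        f.write(str(cap))
    print("Capped cpu0 to %d kHz (hardware maximum is %s kHz)"
          % (cap, read("cpuinfo_max_freq")))

if __name__ == "__main__":
    print("Governor:", read("scaling_governor"))
    cap_to_second_lowest()

The effect is the same as AMD describes: the core never reaches its top clock, so peak power stays below the cap regardless of which CPU SKU was bought.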
 
Secondly, AMD believes that servers based on Nehalem EP will probably amount to only a small percentage of total server shipments in Q2, because buyers will balk at the high price of DDR3. We are rather skeptical:
 
So the price difference is small to non-existent on a $3000-$4000 dual-socket server.
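To make the back-of-the-envelope arithmetic behind that claim explicit, here is a tiny sketch. Every price in it is a hypothetical placeholder, not an actual early-2009 quote; only the percentage reasoning is the point.

# Back-of-the-envelope sketch: how much does a DDR3 price premium matter
# relative to the price of a complete dual-socket server?
# All prices below are hypothetical placeholders, not real quotes.

server_price = 3500.0        # assumed mid-range dual-socket server (USD)
dimms = 8                    # assumed number of DIMMs in the configuration
ddr2_per_dimm = 45.0         # hypothetical price of a 2GB DDR2 DIMM
ddr3_per_dimm = 60.0         # hypothetical price of a 2GB DDR3 DIMM

premium = dimms * (ddr3_per_dimm - ddr2_per_dimm)
print("Memory premium: $%.0f, or %.1f%% of a $%.0f server"
      % (premium, 100.0 * premium / server_price, server_price))
# -> typically a few percent of the system price, which is the crux of the
#    pricing debate: small per box, but it does add up at volume.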
 
Still, Nehalem is a completely new platform, and it will take some effort from the system administrator to verify whether the currently running applications behave well with Hyper-Threading and Turbo Mode. Also, AMD's RVI is already well supported in ESX 3.5, while we will have to wait for VMware's vSphere ("ESX 4.0") before EPT is supported. That means the real-world performance of Nehalem running ESX will probably be lower in 2009 than the published benchmarks suggest. Yes, we are at VMworld 2009, remember!
 
The Shanghai platform is basically the same as the Barcelona one, so that earns AMD a few points in the "easier to integrate and upgrade to" department. AMD is thus hoping that by the time Nehalem EP really takes off (Q3?), Istanbul will be ready to answer the threat. And there is something interesting about Istanbul... but we'll discuss that in a later post.
 
 
 
 
Comments

  • befair - Thursday, February 26, 2009 - link

    "So the price difference is small to non-existing on a $3000-$4000 dual socket server. "

    The difference is not negligible. A quick check online might give you the difference, but when buying from Tier 1 vendors the costs are much higher. You should also consider that businesses don't buy one or two systems; they buy a sizeable number, and at volume even $1 becomes significant, doesn't it? As usual, the overall article has a tone of belittling AMD's products.

    Well, anyway, expecting you to write a well thought-out article is beyond anyone's wildest dreams. Intel would not like it if it were balanced and fair.
  • MossySF - Wednesday, February 25, 2009 - link

    My company has been using both AMD and Intel servers for 10 years now, and quite frankly, we're sick of the mega-box server paradigm. They're heavy to transport, a pain to maintain, a major point of failure, and so on. Even 1U servers with multiple drives take two people to carry and set up -- heaven forbid you have to lift a 4U with 8+ drives in it.

    So we're experimenting with the distributed model -- instead of a few servers with a gazillion cores each, use a whole bunch of cheap, lightweight single-socket machines. Just this week, we threw out the oldest 1U/2Us in our data center and replaced them with Shuttle cubes sporting Phenom II X4 940s. For our web/db app, they're turning out to be 35% faster than Xeon X3370s at a cost of $600 per box.

    Basically, if you can develop your server apps to run in a distributed fashion, forget about Nehalem and Xeons. Forget about Shanghai and Opterons. Pick the best option available from Shuttle and run with it. You can transport 2 cubes in each hand. You can put 4 cubes onto a single tray and have near the same density as 1Us. And your server cost will be about 1/3rd that of the traditional megabox route.

    Now will Shuttle Core i7s be better for us than Phenom2s? Don't know yet -- will have to see when they come out.
  • JohanAnandtech - Wednesday, February 25, 2009 - link

    Why is the weight of your server so important? It seems that you are transporting your servers a lot?

    Did you virtualize? I mean, a good 2U box can contain lots of virtual "Shuttle boxes", and it is not hard to buy a second machine and make sure that the virtual machines fail over to the second box if necessary. Just curious why you feel that the Shuttle boxes are easier. It looks like a management nightmare :-)
  • JarredWalton - Wednesday, February 25, 2009 - link

    Holy cow! Shuttle? For mission-critical servers!? Please say you're joking. If you're serious, I hope these are "servers" that can be replaced on a whim and aren't doing anything critical. You'll notice our SFF section has disappeared, and SFFs in general are a dying breed. All I can say is that of all the SFFs I tested over the years, exactly one is still in use today... one from Biostar, ironically enough, as all of the Shuttle boxes died before they reached two years old. Quality control at Shuttle is something I'm very skeptical about, to say the least.

    As for moving 1U (or 4U) servers, I worked at a datacenter for 3.5 years. We had to access the hardware on one of the three dozen servers maybe three times total during that time. Installing a 70-100 pound rack-mount server is no walk in the park, but once it's done you usually won't have to touch the internals for a few years. Yes, even the 1U boxes are heavy, but the only parts to ever fail on us were the fans, which were easily replaced. That's why companies buy enterprise equipment (not to mention the remote management and diagnostic utilities).
  • MossySF - Wednesday, February 25, 2009 - link

    We run a multi-million dollar business on these servers. With a distributed architecture, so what if a machine or two dies? Hell, we have some inoperable Opteron servers right now in a data center across the country. We just leave them powered off, and instead of X machines ganging up to do the work, it's X-N machines. This model is *FAR* better for mission-critical work than having a few beefy machines where a single part can die and you are totally down. You can try your best to put redundant components into a server, but there are things outside of your control -- for example, we once had a transistor blow right off a DIMM due to heat/power surge/whatever. Hell, if a machine dies in 2 years -- perfect! Just in time for an upgrade to the next model. Since these machines cost 1/3rd the price, that more than pays for 3 replacements over 6 years. Anything beyond that and the machine is too old anyway.

    Seriously, we have to send our people out once a year to all our data centers just to freaking service the fans and regrease the CPU heatsinks. That means every machine must come out. And on every trip to our remote data center (thousands of miles away to provide geographic redundancy), we run out of time.

    This is the very model Google uses for their servers -- a crapload of really cheap 1Us where, if they die, they don't even care. They just leave the broken servers in place until they replace the entire cage/cabinet/room/etc., because it costs more to have somebody track down which machine is broken and then put a replacement in.
  • tshen83 - Tuesday, February 24, 2009 - link

    Just wanted to point out some things that aren't true:

    1. 6 core Opteron will be competitive with Nehalem-EP
    Not true. First of all, the 6 core Opteron will be stuck with the same HT1.0 and dual channel DDR2 memory bus on the Nvidia 3600 chipset. The additional cores will give bigger memory lock contention problems. Since memory bandwidth didn't change, I don't expect much performance boost. Also, adding another 2 cores will boost the 6 core Opterons into the 110-130W TDP range, hardly performance/watt competitive compared to 80-95W Nehalem-EPs let alone 60W L series. I would love to see AMD fudge their ACPs once again to make their processors look less power hungry to people who don't know crap about processors.

    2. Fall release date.
    I don't know about you, but isn't that at least half a year behind? By that time, the real competition would be Quad Nehalem-EX for the 8000 series Opterons, and Dual 32nm Westmere-EPs by 2010. Istanbul is dead on arrival.

    3. Adding a few cores isn't anything mind-boggling. If you look at AMD engineering, their innovation stopped at the introduction of the DDR1 Opteron. That's because of the integrated memory controller and the HyperTransport bus, while Intel was busy trying to crack the 4GHz barrier with the Pentium 4s. Even Intel admits that once the memory controller is integrated, where else can you boost more performance?

  • loquehayqueoir - Thursday, February 26, 2009 - link

    "The additional cores will give bigger memory lock contention problems. Since memory bandwidth didn't change, I don't expect much performance boost."

    That's funny. Memory lock problems don't depend on memory bandwidth, and in fact neither do many real-world applications. In a server environment, threads work mainly on thread-local data, so there are no big memory lock issues.
    Most server machines that act as application servers or databases, nowadays often virtualized, are I/O bound on network and storage (and no, VMs don't solve that problem, they only mitigate it). Of course, HPC applications aren't the biggest market out there.
    Sad that you talk about people who "don't know crap about processors".
  • JohanAnandtech - Wednesday, February 25, 2009 - link

    "the 6 core Opteron will be stuck with the same HT1.0 and dual channel DDR2 memory bus on the Nvidia 3600 chipset. The additional cores will give bigger memory lock contention problems. Since memory bandwidth didn't change, I don't expect much performance boost."

    No, by that time the AMD platform will have moved to HT 3.0. And HT Assist also lowers the amount of memory bandwidth that goes to waste. I have some info that has not yet been disclosed; I'll put that up today or tomorrow.

  • Natfly - Tuesday, February 24, 2009 - link

    Well, it's hard to say whether or not they will be competitive without actually having some numbers. One thing AMD added besides the two additional cores is something they call a "Probe Filter", which heavily increases effective bandwidth by reducing latency in multi-socket machines. I suspect that's what was being referred to here: "And there is something interesting about Istanbul... but we'll discuss that in a later post." (More info here: http://www.theinquirer.net/inquirer/news/107/10511... )

    By the time these are released, AMD's 45nm process will be much more mature than it is now, so predicting TDPs is nothing more than a guess at best.

    "Even Intel admits that once the memory controller is integrated, where else can you boost more performance?"

    It is the same situation for hyperthreading/SMT, something AMD needs to pick up.

    All in all, I don't think Istanbul will be able to best Nehalem-EP, but who knows until more numbers appear.
  • JarredWalton - Tuesday, February 24, 2009 - link

    My bet is that there are certain setups where Istanbul will beat (or at least be competitive with) Nehalem. The overall crown is almost certainly going to go to Intel, but in 2S and 4S servers there are a lot of specific applications where raw performance isn't always the final determining factor. Virtualization is one, and without testing a variety of apps I wouldn't venture to say whether Intel or AMD will be ahead. If I were to guess, that would be the prime area where AMD can keep up with Intel - depending on vendor-specific optimizations, naturally.
