As most of you probably know, Intel and AMD have made fundamentally different choices when it comes to the platform, chipset, and memory subsystems. Intel chose to build a massive chipset with four independent front side buses.
Intel Xeon MP Caneland platform.
In theory, this platform should deliver up to 8.5GB/s to each CPU, but the real bottleneck is the connection to memory. Four FB-DIMM channels can deliver 21GB/s at most, or about 5GB/s per CPU when all four sockets are pulling data at once.
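The arithmetic behind those figures can be sketched in a few lines of Python. The transfer rates here are assumptions that the text above does not state: a 1066MT/s, 64-bit front side bus and DDR2-667 FB-DIMMs with 64-bit channels. The helper name `peak_bandwidth_gbs` is purely illustrative.

```python
def peak_bandwidth_gbs(mega_transfers_per_sec, width_bytes, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_sec * 1e6 * width_bytes * channels / 1e9

# Assumption: each front side bus runs at 1066 MT/s and is 8 bytes wide.
FSB_MTS = 1066
# Assumption: FB-DIMMs are DDR2-667, i.e. 667 MT/s per 8-byte channel.
FBDIMM_MTS = 667
SOCKETS = 4

fsb_per_cpu = peak_bandwidth_gbs(FSB_MTS, 8)
memory_total = peak_bandwidth_gbs(FBDIMM_MTS, 8, channels=4)
memory_per_cpu = memory_total / SOCKETS

print(f"FSB per CPU:       {fsb_per_cpu:.1f} GB/s")
print(f"Memory, total:     {memory_total:.1f} GB/s")
print(f"Memory, per CPU:   {memory_per_cpu:.1f} GB/s")
```

Under these assumptions, each front side bus offers noticeably more bandwidth than its share of the memory channels, which is why the memory side is the limiting factor once all four CPUs are busy.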
AMD promised us very low latency HyperTransport 3 connections in the quad-socket space…
Single-hop HT3 connections were promised for the quad-core Opteron.
…but decided to give the current platform a longer life:
The current platform for the brand-new quad-core Opteron uses the old 1GHz HyperTransport connections.
So at the moment, AMD's platform still uses 16-bit HyperTransport links clocked at 1GHz (DDR) between the CPUs. The resulting 4GB/s of bandwidth in each direction seems a little low when you consider that the dual-channel DDR2 DIMMs can deliver about 10.6GB/s in theory and 5.2GB/s in practice. This means that whenever a CPU has to fetch data from a remote node, bandwidth is limited by the HT link rather than by the memory itself. Latency also increases in the cases where the remote data has to travel over two hops.
Although this is not a showstopper in most applications, AMD will gain some headroom when it introduces the new 45nm quad-core Opteron with HT3. Each HT3 link can run at up to 2.6GHz, which delivers up to 10.4GB/s in each direction. Intel will deliver a similar NUMA platform for its Nehalem CPU, with QPI (CSI) links offering 12.8GB/s in each direction, at the end of this year.
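To make the link figures concrete, here is a minimal sketch of how they fall out of clock speed and link width. It assumes double data rate signaling and 16 data bits per direction for every link; the 3.2GHz clock used for QPI is an assumption chosen to match the 12.8GB/s figure, not something stated in this article. The "remote bandwidth" model is a deliberate simplification that just takes the slower of the link and the measured 5.2GB/s memory bandwidth, ignoring protocol overhead and multi-hop routing.

```python
def link_bandwidth_gbs(clock_ghz, width_bits=16, ddr=True):
    """Peak bandwidth of a point-to-point link in GB/s, per direction."""
    transfers_per_ns = clock_ghz * (2 if ddr else 1)  # GT/s
    return transfers_per_ns * width_bits / 8

def remote_bandwidth_gbs(link_gbs, memory_gbs):
    """Simplified model: a remote read is capped by the slower component."""
    return min(link_gbs, memory_gbs)

MEASURED_MEMORY_GBS = 5.2  # real-world dual-channel DDR2 figure from the text

ht1 = link_bandwidth_gbs(1.0)   # current 1GHz HyperTransport
ht3 = link_bandwidth_gbs(2.6)   # HyperTransport 3 at 2.6GHz
qpi = link_bandwidth_gbs(3.2)   # assumption: QPI at 6.4 GT/s, 16 data bits

remote_ht1 = remote_bandwidth_gbs(ht1, MEASURED_MEMORY_GBS)
remote_ht3 = remote_bandwidth_gbs(ht3, MEASURED_MEMORY_GBS)

print(f"HT 1GHz: {ht1:.1f} GB/s, remote read capped at {remote_ht1:.1f} GB/s")
print(f"HT3:     {ht3:.1f} GB/s, remote read capped at {remote_ht3:.1f} GB/s")
print(f"QPI:     {qpi:.1f} GB/s")
```

In this simplified model, the current 1GHz link is the limit for remote reads, while with HT3 the remote node's memory becomes the limit instead.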