Dual-Core Servers: AMD or Intel?
8:30 AM EST Mon. Jul. 09, 2007
Ever since AMD's entry to the server market with its Opteron processor, system builders have faced the question of which chips to use in their servers: AMD or Intel?
Is there really any difference? The short answer is yes. But what does a system builder really need to know before fielding systems based on these rival architectures? In this Recipe, we will go beyond the marketing hype to get to the heart of the matter -- creating server specs for your customers that will achieve the best performance for their application at the lowest overall cost.
For a price comparison, I will use a Bill of Materials (BOM) that includes motherboard, processors and memory for both the Intel and AMD server platforms. To keep the comparison as objective as possible, all other subsystems of the server -- such as I/O and storage -- will be assumed identical, and therefore irrelevant to our discussion. Motherboards by Tyan and memory by Kingston are used as examples here; comparable parts from other manufacturers can be substituted at similar price points. My pricing data is from Tech Data Corp. and is current as of June 15, 2007.
| ITEM (QUANTITY) | AMD PART | AMD COST | INTEL PART | INTEL COST |
|---|---|---|---|---|
| Motherboard (1) | S3992G3NR-RS | $373 each | S5372G3NR-RS | $330 each |
| Processors (2) | OSP2212GAA6CQ | $230 each | BX805565130A | $348 each |
| Memory (4) | KVR667D2D8P5/1G | $60 each | KVR533D2D8F4/1GI | $77 each |
The first useful piece of information to glean from this table is that the initial build for an Intel Xeon-based server will be about 25 percent more expensive than its AMD Opteron counterpart. Also, this price gap will widen as more memory modules and processors are added, because those two components account for the entire difference: the Intel motherboard is actually the cheaper of the two.
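The totals behind that figure can be reproduced from the table above. The following is a minimal sketch in Python; the dictionary layout and the function names `platform_cost` and `price_gap` are illustrative choices, not part of any vendor tool, and the prices are the June 15, 2007 figures quoted in the table.

```python
# Bill of Materials from the table above: (quantity, unit price in USD).
BOM = {
    "AMD": {"motherboard": (1, 373), "processor": (2, 230), "memory": (4, 60)},
    "Intel": {"motherboard": (1, 330), "processor": (2, 348), "memory": (4, 77)},
}


def platform_cost(parts):
    """Sum quantity * unit price over every line item of one platform."""
    return sum(qty * price for qty, price in parts.values())


def price_gap(memory_modules):
    """Intel minus AMD cost when only the number of memory modules changes."""
    totals = {}
    for name, parts in BOM.items():
        adjusted = dict(parts)
        # Keep the unit price, replace the quantity of memory modules.
        adjusted["memory"] = (memory_modules, parts["memory"][1])
        totals[name] = platform_cost(adjusted)
    return totals["Intel"] - totals["AMD"]


amd_total = platform_cost(BOM["AMD"])      # 1073
intel_total = platform_cost(BOM["Intel"])  # 1334
premium_pct = (intel_total - amd_total) / amd_total * 100

print(f"AMD ${amd_total}, Intel ${intel_total}, premium {premium_pct:.1f}%")
print(f"Gap with 4 modules: ${price_gap(4)}, with 8 modules: ${price_gap(8)}")
```

With the base configuration the premium works out to a little over 24 percent, which is the "about 25 percent" cited above. Doubling the memory to eight modules raises the absolute gap, since each Intel FB-DIMM costs $17 more than its AMD counterpart.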
So, assuming for a minute that we decide to go with the Intel Xeon-based system, what are we getting for that 25 percent additional cost? Does the Intel processor perform 25 percent better than the AMD Opteron? That depends on your application. Let's take a look at the differences between the processor architectures, memory controllers and motherboard chipsets to get the low-down on this subject.
| SPECIFICATION | Intel Xeon 5130 | AMD Second-Generation Opteron 2212 HE |
|---|---|---|
| Clock Speed | 2.0 GHz | 2.0 GHz |
| Bus Bandwidth | 10.66 GB/sec. | 16.0 GB/sec. (Direct Connect Architecture) |
| Memory Bandwidth | 10.66 GB/sec. (Front-Side Bus) | 10.7 GB/sec. (integrated onto chip) |
| L2 Cache | 2 x 2 MB | 2 x 1 MB |
| Lithography | 65 nm | 90 nm |
| Power | 65 W | 68 W |
| Voltage | Varies (1.55 V max.) | Varies (1.2 V / 1.25 V max.) AMD PowerNow! Tech. |
At first glance, the larger L2 cache available to the Intel Xeon might lead one to believe that it would be a better performer, hands down.
But it's not that simple. AMD's Direct Connect Architecture eliminates the bottleneck associated with the Front-Side Bus (FSB) architecture that Intel uses, giving the AMD chip 50 percent more bandwidth to off-chip resources (16.0 GB/sec. versus 10.66 GB/sec.). Traffic to those off-chip resources accounts for the majority of data transactions.
Also, the Opteron gains additional efficiency by integrating its memory controller directly onto the chip itself, rather than routing memory traffic through a bus, as the Intel Xeon does. Under maximum load, the Opteron will use slightly more power -- about 5 percent more wattage. But under partial loads, the Opteron is more energy-efficient than the Intel Xeon.
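Both percentages quoted here follow directly from the specification table. A minimal sketch of the arithmetic, using only the table's figures (the helper name `pct_more` is made up for illustration):

```python
# Figures taken from the specification table above.
xeon_5130 = {"bus_gb_s": 10.66, "power_w": 65}
opteron_2212 = {"bus_gb_s": 16.0, "power_w": 68}


def pct_more(value, baseline):
    """How much larger `value` is than `baseline`, in percent."""
    return (value - baseline) / baseline * 100


bus_advantage = pct_more(opteron_2212["bus_gb_s"], xeon_5130["bus_gb_s"])
power_penalty = pct_more(opteron_2212["power_w"], xeon_5130["power_w"])

print(f"Opteron bus bandwidth advantage: {bus_advantage:.0f}%")
print(f"Opteron maximum power penalty:   {power_penalty:.1f}%")
```

These are rated maximums only. The table contains no partial-load figures, so the partial-load efficiency claim cannot be checked the same way.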
HARDWARE TECHNICAL COMPARISON: MEMORY
Main memory is another area where Intel departs from AMD in terms of architectural design. While both chipmakers use DDR2 RAM chips, this is where the similarity in their memory subsystems ends.
AMD integrates the memory controller directly onto the processor chip. The chip, in turn, directly connects its 240 or so parallel circuit traces to the memory array on the motherboard. The maximum memory allowed by the system increases in linear fashion with the number of processor chips in the system.
Also, the AMD processor makes use of industry-standard registered ECC DDR2 RAM modules. This helps limit system costs, even when large amounts of memory are used.
Intel does not integrate the Xeon's memory controller directly onto the chip -- memory transactions are routed through the FSB. So Intel had to come up with a new way of doing things, to keep pace with system demands on memory bandwidth. Intel's solution is called Fully Buffered DIMM (FB-DIMM), which also uses DDR2 RAM chips on the modules.
Under Intel's arrangement, part of the memory controller is integrated onto the DIMM itself; it is called the Advanced Memory Buffer (AMB). Each FB-DIMM has an AMB chip, in addition to the actual DDR2 RAM chips. Only the common clock signal source and a serial link to the AMB chips are located on Intel's Northbridge chip (part of the motherboard chipset, which I discuss below). This point-to-point serial connection requires only 69 circuit traces per memory channel, compared with the 240 circuit traces of AMD's parallel interface.
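To put the trace counts in perspective, here is a rough back-of-the-envelope sketch. It assumes the 240-trace figure describes a single parallel memory channel, which the text implies but does not state outright, and it ignores power, ground and routing constraints.

```python
# Trace counts quoted in the paragraph above.
FBDIMM_TRACES_PER_CHANNEL = 69   # Intel serial FB-DIMM link
PARALLEL_TRACES = 240            # AMD parallel DDR2 interface (assumed per channel)

# Share of traces saved by the serial link, in percent.
reduction_pct = (1 - FBDIMM_TRACES_PER_CHANNEL / PARALLEL_TRACES) * 100

# Whole FB-DIMM channels that fit in the trace budget of one parallel channel.
channels_in_same_budget = PARALLEL_TRACES // FBDIMM_TRACES_PER_CHANNEL

print(f"Trace reduction per channel: {reduction_pct:.2f}%")
print(f"FB-DIMM channels per 240-trace budget: {channels_in_same_budget}")
```

The lower trace count is what lets a board designer route more memory channels in the same space, which is the basis for the scaling claim made for FB-DIMM.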
Each of Intel's FB-DIMMs uses more power than do the standard registered modules used in AMD systems, mainly due to their AMB chip. Consequently, the Intel units also generate more heat.
One advantage of Intel's split controller design: It introduces Error Checking and Correction (ECC) schemes to the address and control signals, in addition to the data signals. This increases system stability at high clock speeds.
Also, Intel's design scales well, in terms of available bandwidth and maximum memory capacity. But the full potential of this new design has not yet been realized with the components now available to system builders.
HARDWARE TECHNICAL COMPARISON: MOTHERBOARD CHIPSET
AMD and Intel use different design methods for the motherboard chipset, and this further separates the two products in terms of performance. The Tyan/AMD motherboard shown in the table above uses a combination of Broadcom's HT-2000 and HT-1000 chips. Meanwhile, the Tyan/Intel motherboard uses Intel's 5000V Memory Controller Hub (MCH) and 6321ESB I/O Controller Hub (ICH).
Why are these chips important? Let's take a closer look at how they affect the performance of our server.
First, the AMD platform. The HT-2000 chip is a highly integrated device, bringing onto one chip all of our system interfaces: Gigabit Ethernet, PCI-Express, PCI-X, and HyperTransport.
This integrated HyperTransport interface is an essential part of AMD's Direct Connect Architecture, which permits communications between the processor and the HT-2000 at full CPU speed. In other words, as CPU clock speed increases, so does available bandwidth. This means faster performance without bottlenecks, up to 24 GB per second at a 3.0 GHz CPU clock speed. The HT-2000 chip can be thought of as the system's Northbridge chip, since it performs the same type of duties.
Broadcom's HT-1000 chip functions as the Southbridge chip on the AMD platform, handling I/O such as SATA, USB, floppy-drive controller, parallel ATA, etc. The HT-1000 is connected to the HT-2000 via HyperTransport, for high performance without bottlenecks in bandwidth.
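The scaling described above can be sketched as a simple linear model. This is an assumption fitted to the article's own two data points (16.0 GB/sec. at 2.0 GHz in the specification table, and 24 GB/sec. at 3.0 GHz), not a statement of the HyperTransport specification; the function name `direct_connect_bandwidth` is hypothetical.

```python
# Slope derived from the specification table: 16.0 GB/sec. at 2.0 GHz.
GB_PER_SEC_PER_GHZ = 16.0 / 2.0

# The Xeon 5130's Front-Side Bus figure from the same table, fixed by the bus.
FSB_GB_PER_SEC = 10.66


def direct_connect_bandwidth(clock_ghz):
    """Assumed linear model: bandwidth grows in step with CPU clock speed."""
    return GB_PER_SEC_PER_GHZ * clock_ghz


for clock in (2.0, 2.4, 3.0):
    bandwidth = direct_connect_bandwidth(clock)
    ratio = bandwidth / FSB_GB_PER_SEC
    print(f"{clock:.1f} GHz: {bandwidth:.1f} GB/sec. ({ratio:.2f}x the FSB figure)")
```

Under this model the 3.0 GHz case reproduces the 24 GB per second ceiling quoted above, while the FSB figure stays where the bus puts it regardless of CPU clock.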
The bottom line is higher efficiency in inter-chip communications, since there are no bottlenecks with the HyperTransport's point-to-point link. Data transactions flow more efficiently from storage I/O to memory to processor and back again, creating high throughput overall.
Next, the Intel chipset. Intel uses a different technique for inter-chip communications on the motherboard. As mentioned above, the Intel system uses a split controller for the memory, with the 5000V MCH handling the clock source and serial interconnect to the AMB chips on each of the FB-DIMMs.
Most I/O transactions occur across the FSB, which is a shared parallel bus. This limits available bandwidth as compared to the direct chip-to-chip HyperTransport links found in the AMD system design. It also creates the potential for resource contention when multiple resources are competing for bandwidth.
Each processor socket has its own bus to the MCH -- what Intel refers to as Dual Independent Bus (DIB) architecture. But overall system-level performance is impacted once I/O leaves the CPU and tries to move data across the motherboard to other chips.
The Intel 6321ESB chip serves as the Southbridge for this arrangement. It is connected to the 5000V MCH chip via a PCI-Express interface and a proprietary ESI interface, which provides adequate bandwidth. The 6321ESB is analogous to the HT-1000 in the AMD system, since it also provides SATA, USB, PCI, PCI-Express, etc.
With Intel's design, performance depends on keeping the processors' large shared L2 caches full. Intel scales performance by raising the clock speed of the CPU chip itself. This is one explanation for Intel's release of Xeon chips with much higher clock speeds than AMD's competing Opterons. Performance falls off once data and instructions leave the CPU core and enter the realm of inter-chip communications across the FSB.
So, in an apples-to-apples comparison, with all the technical complexities aside, here's the bottom line: Intel dual-core servers are more expensive to build than those based on AMD. What's more, the costs rise as more memory and faster CPUs are added.
The Xeon's advantage in higher CPU clock speeds actually raises this cost hurdle, since taking that advantage means buying the faster, more expensive chips. With the benefits of FB-DIMM memory technology not yet fully realized, this drawback is further compounded.
Also, the increased power consumption, cost and heat dissipation of the FB-DIMMs do not bode well for Intel dual-core server system efficiency when compared with AMD Opteron-based machines.
The final nail in the Xeon's coffin, in my opinion, is the fact that inter-chip communication performance encounters bottlenecks instead of continuing to scale upward, as it does on the AMD architecture.
In the absence of software that is highly optimized specifically for the new architecture of Intel Xeon multi-core processors, AMD holds the advantage in both performance per dollar and performance per watt of power expended.
Evidence of this is visible in the market share that AMD Opteron servers have gained since AMD entered the server market.
But this is a game of high-tech leap-frog. Both companies have already introduced quad-core server chips. Only time will tell who will field the ultimate server when the next round of product releases is announced.
DAVID GILBERT is the owner of Appalachian Computer Systems, a West Virginia system builder that specializes in multiprocessor SCSI RAID servers.