These Servers Get Your Blood Flowing

Midwest System Builder's Linux Cluster Clocks IBM To Win University Deal

By Joseph F. Kovar, ChannelWeb

11:00 AM EST Wed. Nov. 22, 2006
From the November 22, 2006 issue of CRN
The nodes also include motherboards from Supermicro featuring the industry-standard Intelligent Platform Management Interface to allow remote monitoring of node temperature, fan operations, system voltage and other environmental conditions.

About 5 Tbytes of SATA hard drives, a combination of 400-Gbyte and 500-Gbyte drives from Western Digital, are attached directly to the super node, Daninger said. They are controlled by a high-performance Areca RAID controller from Brea, Calif.-based Tekram Systems. One feature of the Areca that Daninger likes is its error lights that indicate which drive has failed.

The cluster nodes run on the CentOS Linux operating system, an open source derivative of Red Hat Enterprise Linux. Daninger said this helped keep costs low by avoiding having to pay license fees on all the nodes. "It works for a fairly savvy end-user like the University of Minnesota," he said.

To tie the cluster nodes together, Reason used Warewulf, Lawrence Berkeley National Laboratory's toolkit for managing Linux clusters. "It works great with diskless nodes and works hand-in-glove with CentOS," Daninger said.

Also included was the Sun Grid Engine for allocating processors to various jobs depending on priority. Reason also brought in Ganglia, an open source application that provides a graphical view of how busy the nodes are.

Reason initially built an eight-node test cluster, but found that eight nodes was the maximum when tied together with Ethernet. So Reason brought in Myrinet, a proprietary networking solution from Myricom, Arcadia, Calif.

Latency for each transaction using Myrinet is one-fifth that of Ethernet, Daninger said. "In these parallel environments, code is written so that one node can do its part of an operation and hand it off to another node," he said. "This all takes time."

Before installing the laboratory's application, Reason downloaded a computational fluid flow application from NASA to benchmark the cluster for the laboratory. "Its benchmarks are well known in the fluid dynamic market," Daninger said. "It let us prove the cluster before the university added their software."

Fortunately, the configuration and deployment of the cluster went well, because the contract called for it to be up and running 30 days from the day the purchase order was signed. "We delivered it on day 30," Daninger said. "My understanding is there was a time limit on the grant. If it was not done on time, they could lose the money. So there was some stress."

But there were minor problems, including the occasional bad driver, a few bad memory modules and some driver issues with the Myrinet cards. Daninger recalls a lot of late-night lunches. Ge, at least, had some fun with the memory modules, some of which caused a compute node to crash at random. It took some time to trace the crashes to the memory modules.

"During that time, our IT guys and users would play a game to predict on which day the next crash would occur and on which node," he said. "I won both times."

The pressure was double for Reason, which was moving to a larger facility at the same time it was building the cluster. The move required installing enough power in the facility to test clusters the size of the one that was being built, Daninger said. "I told the electrician we needed power connects of 14 kilowatts for a computer system in the new office. He said, 'What the heck kind of a computer you putting in?' "

Dr. Fotis Sotiropoulos, director of the St. Anthony Falls Laboratory, said his laboratory does a wide range of projects related to fluid dynamics. In fact, the cluster built by Reason is also used to study the water flow in rivers and streams to aid in river restoration projects and to see how the flow of fish is affected by hydroelectric facilities, Sotiropoulos said. "Biology is just one part, but a big part," he said.

For scientific applications, there are never enough computing resources, and more grants could mean adding more. That is likely to keep Reason and its competitors busy for a very long time.

 