Clusters' Last Stand

To be sure, the technology is suffering an image problem: many VARs think of it as it was in its early days. Back then, clustering simply meant mating two servers together. The servers ran the same OS, used the same hardware configuration and helped to handle hardware failures gracefully. But today, as Web and database applications have increased in visibility, clustering is much less monogamous and, as a result, more interesting, intriguing and challenging for solution providers.

Those changes have come about as a result of two strong forces: applications have become more aware and capable of clustering technologies, and applications vendors have become more dissatisfied with the pace of innovation around clustering on Intel/PC hardware and Microsoft Windows OS software. As a result, clustering has branched out beyond just the OS and is now handled directly by many major applications, such as Oracle databases, and by storage vendors, such as Veritas.

"Customers' needs are different, and we need all these different approaches," says Arun Taneja, a senior analyst at analyst firm Enterprise Storage Group, Milford, Mass.

Clustering Primer
At its heart, clustering is a very simple idea, but with lots of complex subtleties that can bedevil the savviest VAR: Tie together two or more servers with a variety of hardware and software tools and tricks, and enable them to keep running if any one component fails. Clustering actually encompasses several different notions. First and foremost is high availability. That means that there is no single point of failure, and often enough built-in redundancy to keep running even if more than one component fails. Part of the concept of high availability includes failover and failback recovery. If a component fails, the rest of the cluster continues to operate, and once the failed component is brought back online, the newly fixed system can be brought back in sync with the rest of the cluster.
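The failover and failback cycle described above can be sketched in a few lines of Python. This is only an illustrative toy model, not any vendor's implementation: the `Node` and `Cluster` classes, their method names and the "copy the log from a surviving node" resynchronization step are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    online: bool = True
    log: list = field(default_factory=list)  # writes this node has applied


class Cluster:
    """Hypothetical toy cluster: every online node applies every write."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def write(self, record):
        """Apply a write on every online node; fail only if none is left."""
        live = [n for n in self.nodes if n.online]
        if not live:
            raise RuntimeError("no nodes available")
        for node in live:
            node.log.append(record)

    def fail(self, name):
        """Failover: mark a node offline; the survivors keep serving."""
        self._get(name).online = False

    def fail_back(self, name):
        """Failback: resynchronize the repaired node, then bring it online."""
        node = self._get(name)
        donor = next((n for n in self.nodes if n.online), None)
        if donor is not None:
            node.log = list(donor.log)  # catch up on the writes it missed
        node.online = True

    def _get(self, name):
        return next(n for n in self.nodes if n.name == name)


cluster = Cluster(["a", "b"])
cluster.write("order-1")
cluster.fail("b")          # node b goes down
cluster.write("order-2")   # the cluster keeps running on node a
cluster.fail_back("b")     # b rejoins and is brought back in sync
```

Real products resynchronize at the disk, file or transaction level rather than by copying an in-memory list, which is where much of the complexity discussed in this article comes from.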


Solution providers today have their choice of clustering methods, but they also have the responsibility to bone up on these new technologies and methods of supporting more reliable applications. For providers to be effective, they might have to change their own loyalties, remixing their products, platforms and programs to leverage the new breeds of clustered applications. To play in this space, providers will need to become more proactive in marrying clustering solutions with their customers.

At its core, clustering involves the duplication of just about every computer component, from disk drives to processors to network adapters. In case one item fails, the working twin (or triplicate, or quadruplicate, depending on how many replicated servers are part of the cluster) soldiers on, without the applications missing a beat.

"Clustering is ideal for a database-driven application because of synchronization issues related to redundancy," says Capel English, CTO of SNAPS, an integration firm specializing in large-scale fax-server systems that include clustering.

That's why it's got so much buy-in from big vendors, including Oracle. Oracle's clustering technology "has taken us more than 10 years to develop and is a breakthrough in the industry," says Robert Shimp, vice president of database marketing for Oracle. "It's creating a revolution in the industry. Lots of vendors are coming out with clustering solutions. Clustering drops the cost of computing dramatically, along with dramatically improving applications' reliability. It also makes Linux a much more viable alternative in enterprise computing."

Back in its monogamous days, clustering was strictly the province of the OS: You needed specialized versions of Unix and Windows to be able to support redundant, tightly coupled file servers and handle hardware failures without losing data or individual transactions.

But as Web and database applications have grown in prominence, so, too, has the need for specialized applications-layer clustered solutions, and those have begun to complicate the need for OS-layer clustered services. The issue, of course, is the data. Keeping it intact, synchronized and available in case of a failed component has become more complex as Web, database and applications servers interact and share information. The OS-layer clusters can't keep track of individual transactions, because they operate on gross-file levels. That makes clustering a harder solution for resellers and integrators to deploy.

Clustering is more than just parallel processing or its latest incarnation, server blades: servers on add-in cards that add multiprocessing horsepower. Such approaches marry several computers together for particular applications and have become more popular as hardware prices drop.

"We have seen that at the low end of the market, where people are moving toward ganging together a bunch of commodity desktops and spreading the workload over them," says Dan Kusnetzky, vice president of system software research at researcher IDC. "Of course, you have to be able to break up your application into separate pieces and then collect the results together. This can lower the cost of hardware and use lower-cost licenses or open source and get the same work done for a lower budget."

A Closer Look
Clustering encompasses many different situations, according to analysts and vendors (see "Let Us Count the Ways," page 32). Storage and applications-related clustering ups the ante for both integrators and customers. "Clustering is a logical extension of basic redundancy. We wouldn't typically take a customer to that level unless they had the fundamentals already in place: solid backup, multiple NICs in each server, multiple switches, proper power protection and so forth," says Neil Rosenberg, CEO at Quality Technology Solutions, South Orange, N.J., a network integrator delivering clustering solutions to midsize businesses in the metro New York region.

"Even though the technology for hardware clustering has been around for years, it has not been adopted by medium to small companies. Application programmers know little about hardware and don't know how to design for clustering," says Hal Lavender, chief architect at Cognizant Technologies, a leading provider of custom-software development.

The refocusing of clustering on applications has happened for other reasons as well. Two of them are the high pricing of the older OS clustering versions and their limited applications support.

When it comes to clustering, price is an issue. There are better and cheaper alternatives to providing fault-tolerant services than clustering available from Microsoft or versions of Unix from HP/Compaq, Sun and others. One example is the LifeKeeper software product from SteelEye, which runs on both Unix and Windows and can provide support for a number of different applications. An alternative on the hardware side is the Express line of fault-tolerant servers from NEC that run Windows 2000 and are available in configurations for less than $25,000.

"[The new] HP makes it very easy to get started with their CL380 packaged cluster bundle," says Rosenberg, who sells plenty of this configuration to midsize businesses. "It is a good solution for customers looking for significant disk capacity and server failover with Windows 2000 Server or with NetWare 6."

But price isn't the only problem. "We are seeing very few applications that are cluster-aware," says Doug Garcia, vice president of sales at QuestingHound, a solution provider based in Lighthouse Point, Fla., that concentrates on security and networking services. Putting clustering services inside the OS doesn't make much sense unless those services can support a wide variety of applications without too much effort. And therein lies the rub, because the effort to make applications cluster-aware is odious, requiring the use of special scripts, programming interfaces or other techniques that are only for the most experienced developers.

That is where Oracle's Real Application Cluster (RAC) comes into play. RAC still requires clustered OSs and storage solutions, so it's an addition to, not a replacement for, those technologies. But the key here is that it runs all existing Oracle applications without any modification, and provides a layer of redundancy and scalability that isn't possible with just OS and storage-clustering solutions. "RAC runs with any packaged app in the world, and it runs across the board on various OSs," Oracle's Shimp says. "That is why customers care. We have hundreds of customers using this for huge applications. It is a very sophisticated piece of software and RAC provides big savings when you compare it with the cost of a multiprocessing system."

Windows Vs. Unix
Microsoft has had three major problems with clustering and has been a distant second to the various Unix vendors. First, Microsoft Windows has lagged behind Unix in terms of supporting multiple-machine clusters. Second, it has not built clustering support for many applications into Windows, besides its own servers, such as SQL Server, Exchange and Internet Information Server. Third, even with support for its own servers, integrators still needed to make modifications to applications to make them cluster-aware.

"When it comes to sophistication in clustering software, Microsoft is a babe in the woods," Taneja says. "Folks like Veritas, Legato, IBM and Sun are much more sophisticated; they can handle 16 or 32 clustered servers, and have had more than eight server versions available for two or three years, whereas Microsoft is just coming out with this support later this year," he says.

"Our clustering story needs to be better," says Stan Sorensen, director of SQL Server marketing for Microsoft. "We have a marketing problem and some work to do there, but we have a solid foundation on which to build. Clearly, we know we can make it better; we want to be able to scale up to multiple machine clusters. We are really limited by today's hardware."

Many resellers and integrators agree: "If an end user truly needs clustering, they need to be using Unix or Linux for distributed processing," Garcia says. Oracle's Shimp also recommends Linux. "Linux dramatically lowers the cost of hardware and still achieves a good quality of service. Linux is extremely attractive for clustering."

To make matters more complex, different versions of Windows support different configurations and numbers of nodes per cluster (see "Microsoft Clustering Specs," right). However, Microsoft has made things a bit easier with the more recent versions of Windows. "Windows load-balancing services in Advanced Server made it real easy to get things set up. You don't have to be a rocket scientist to configure this," says Tom Rizzo, group product manager for SQL Server at Microsoft. "All Microsoft servers such as Exchange, Internet Information Server and SQL Server support clustering," he says.

Included in all copies of Windows 2000 are the programming interfaces for clustering, so developers can write clustered applications on any Win2k version. However, they can only deploy them on either Advanced or Datacenter Server versions.

Keys To Clustering
"Clustering forces our developers to have some knowledge of the hardware environment and the administration staff to understand fundamental architectural and application-design parameters," Cognizant's Lavender says. "For example, network administrators must be more careful about designing directories and how failovers happen from one machine to another."

Scalability is next. Having multiple servers means you can share their computing power and balance the overall processing load. Note that this is different from multiprocessing servers that contain several CPUs in a single cabinet: The clusters comprise at least two separate machines. Some clustering services support up to eight or more individual nodes. But having more nodes isn't always a better deal: "In our environment, more than two nodes doesn't really buy you anything," English says.

Third is the ability to manage the cluster as a single entity. End users, and applications and network administrators can have a single point of control over the cluster. There are numerous parameters, such as how the individual nodes of the cluster are connected together; whether the cluster is homogeneous so that each machine is an exact copy in terms of processor, network adapter and other configuration details; and whether any special software is required on each network client to access the clustered resources. Any management scheme will have to take into account those variations.

Finally, clustering style is important. A cluster can share a single disk array, so that multiple copies of the same application (such as a database or Web server) have access to the same file system. Or each node in the cluster can take ownership of its own disk resources. Clusters can be connected together using ordinary Ethernet cabling, a shared SCSI bus to connect disk drives to multiple servers, or a dedicated private link among the servers in addition to the network connection (see "Basic Clustering Bill-of-Materials List," page 36). A typical cluster can be active/active, meaning that separate copies of the application are running concurrently on all nodes of the cluster. If one node fails, the application will continue seamlessly and be picked up by the other nodes. Alternatively, clusters can be active/passive, meaning that only one node of the cluster is running the application while the others operate as hot spares.
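The difference between the two styles comes down to which nodes run the application at any given moment. The short Python sketch below illustrates that; the `serving_nodes` function, the node names and the rule that the first healthy node in the list becomes the active one are assumptions for illustration only, not how any particular clustering product chooses its primary.

```python
def serving_nodes(nodes, failed, mode):
    """Return the nodes that run the application after the given failures.

    nodes  -- ordered list of node names (first = preferred primary)
    failed -- set of node names that are currently down
    mode   -- "active/active" or "active/passive"
    """
    healthy = [n for n in nodes if n not in failed]
    if mode == "active/active":
        return healthy        # every healthy node runs a copy
    if mode == "active/passive":
        return healthy[:1]    # one node runs it; the rest are hot spares
    raise ValueError(f"unknown mode: {mode}")


nodes = ["node1", "node2", "node3"]
print(serving_nodes(nodes, set(), "active/active"))       # all three nodes
print(serving_nodes(nodes, set(), "active/passive"))      # node1 only
print(serving_nodes(nodes, {"node1"}, "active/passive"))  # node2 takes over
```

In the active/passive case the spare nodes appear in the result only after a failure, which is the idle capacity that some customers object to paying for.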

How you configure your cluster depends on cost and the properties of your underlying OS and applications: Some combinations only support one type of configuration.

"Some customers don't like the idea of the passive node sitting there doing nothing but waiting for the active node to fail over," English says. Passive configurations "require that you buy twice as much hardware as you need to support your production environment," Lavender says. "That means it is normally cost-prohibitive and seldom used."

Solution-Provider Concerns
Such issues make clustering a hard sell for solution providers. They have to learn how to specify, configure, deploy and support clusters. That requires a great deal of up-front effort, and continuing education as clustering evolves through the years.

Sometimes, the margins aren't wide enough to justify this kind of effort: "Hardware and software are essentially commodities," Rosenberg says. "The work, and the revenue, is in the design and integration services." That means that integrators relying just on hardware sales are going to have a tough time making any money selling clustered solutions, unless they include fees for their services as part of the package.

Numerous training courses are available that can help integrators. "With the right resources, and for a standard installation, an experienced, strong engineer can pick this up and run with it. We used several books and a three-day Microsoft/Compaq training session held locally to get up to speed on clustering," Rosenberg says.

Such training is essential due to the specialized nature of clustering. "Most customers are either without the specialized experience and skill levels, or have the talent onboard but can't spare them from other critical areas," Garcia says.

"Many users aren't prepared for the complexity and support issues of clustered applications. You have to take it on an application-by-application basis, and find out about licensing requirements, specialized drivers and interfaces, and any upgrades that are needed," he says. "In many cases, their money would be better spent on high-availability servers such as NEC's Fault Tolerant Platform. About 95 percent of the requests we get for clustering are just for high-availability servers, and we would rather recommend a fault-tolerant solution that has simpler implementation and lower cost of ownership."

Veritas And Oracle
Finally, VARs can go the route of using clustering provided by the application vendors themselves; the two leading vendors are Veritas and Oracle. Cluster Server from Veritas supports a wide range of Windows and Unix clusters and storage configurations. Lavender is a big fan of Veritas' clustering software on both Unix and Windows.

"Veritas' offering is both vendor-neutral and packed with features that normally meet the needs of any application," he says. "However, Microsoft's Clustering Service is still in its infancy."

"Application clustering allows users to resubmit a transaction and have it processed normally on the other nodes in the cluster immediately after a failure occurs," Lavender says. "Application clustering can normally be configured to provide load-balancing between instances/nodes so in normal operation, load-balancing can be used to improve overall system throughput. Hardware clustering, however, ensures that after a brief period, typically a few seconds to a few minutes, that transaction can again be processed."
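The resubmission behavior Lavender describes can be pictured as a simple retry across nodes. The following Python sketch is a rough illustration under stated assumptions: `NodeDown`, `make_node` and `submit` are hypothetical names invented for this example, and a real clustered database would also have to guarantee that a transaction is not applied twice.

```python
class NodeDown(Exception):
    """Raised when a node cannot process a transaction."""


def make_node(name, up=True):
    """Build a stand-in for one application instance in the cluster."""
    def process(txn):
        if not up:
            raise NodeDown(name)
        return f"{txn} committed on {name}"
    return process


def submit(txn, nodes):
    """Try each node in turn, resubmitting after a failure."""
    for process in nodes:
        try:
            return process(txn)
        except NodeDown:
            continue  # resubmit the same transaction to the next node
    raise RuntimeError("all nodes failed")


nodes = [make_node("node1", up=False), make_node("node2")]
result = submit("txn-42", nodes)
print(result)
```

Here the first node is down, so the transaction is picked up by the second node without the caller having to wait for the failed machine to recover.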

Oracle's RAC presents a single database image for any applications, yet handles failover and failback situations. "It has only been on the market for about a year, and a small percentage of our customers are using it," Oracle's Shimp says. "While the uptake is small, it is encouraging to us." And, unlike Microsoft's clustering solution, RAC doesn't require any modification to existing applications. "You can't actually run any clustered applications with SQL Server; you would have to make a custom application to make clustering work," he says.

Of course, there are downsides, as well. "Oracle's RAC is very difficult to set up and not every application needs to be done that way," Taneja says. "RAC makes sense for supporting one large database and when you want to make sure that you have two or more servers to provide better performance."

No matter which way an integrator goes on clustering, applications or OS, it is still a hard sell and will take some effort to support. But resellers who can marry the right solution will find that clustering can be a profitable source of revenue, and find that these marriages will have staying power as their customers continue to grow their applications.

"To implement clustering has required reasonable skills; if you are a VAR, you are going to have to really provide value and know what you are doing," Taneja says. "Because it is complicated, those VARs that understand these issues can get paid quite a bit of money, because the clustering customer can really use that help."