Cluster Computing
Lubna Luxmi Chowdhry, 20 Sep 2005
A cluster is a set of independent computers combined into a unified system through software and networking. Clusters
are typically used for High Availability, for greater reliability, or for High Performance Computing, to provide greater
computational power than a single computer can provide.
Introduction
"Cluster" is a widely used term for independent computers combined into a unified system through software and
networking. At the most fundamental level, when two or more computers are used together to solve a problem, it is considered
a cluster. Clusters are typically used for High Availability (HA), for greater reliability, or for High Performance Computing
(HPC), to provide greater computational power than a single computer can provide.
As high-performance computing (HPC) clusters grow in size, they become increasingly complex and time-consuming to
manage. Tasks such as deployment, maintenance, and monitoring of these clusters can be handled effectively by an
automated cluster computing solution.
Why a Cluster?
Cluster parallel processing offers several important advantages:
Each of the machines in a cluster can be a complete system, usable for a wide range of other computing applications.
This leads many people to suggest that cluster parallel computing can simply claim all the "wasted cycles" of
workstations sitting idle on people's desks. It is not really so easy to salvage those cycles, and it will probably slow your
coworker's screen saver, but it can be done.
The current explosion in networked systems means that most of the hardware for building a cluster is being sold in high
volume, with correspondingly low "commodity" prices as the result. Further savings come from the fact that only one
video card, monitor, and keyboard are needed for each cluster (although you will have to swap these to each machine to
perform the initial installation of Linux; once running, a typical Linux PC does not need a "console"). In comparison, SMP*
and attached processors are much smaller markets, tending towards somewhat higher price per unit performance.
Cluster computing can scale to very large systems. While it is currently hard to find a Linux-compatible SMP with many
more than four processors, most commonly available network hardware easily builds a cluster with up to 16 machines.
With a little work, hundreds or even thousands of machines can be networked. In fact, the entire Internet can be viewed
as one truly huge cluster.
The fact that replacing a "bad machine" within a cluster is trivial compared to fixing a partly faulty SMP yields much
higher availability for carefully designed cluster configurations. This becomes important not only for particular
applications that cannot tolerate significant service interruptions, but also for general use of systems containing enough
processors that single-machine failures are fairly common. For example, even though the mean time to failure of a
PC might be two years, in a cluster with 32 machines the probability that at least one will fail within six months is quite
high.
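The failure estimate above can be checked with a quick calculation. This is an illustrative sketch, assuming an exponential failure model with a two-year MTTF (the article does not specify the failure distribution):

```python
import math

MTTF_YEARS = 2.0      # mean time to failure of one PC (from the text above)
WINDOW_YEARS = 0.5    # six-month window
NODES = 32            # machines in the cluster

# Assuming exponentially distributed failures, the probability that a
# single machine fails within the window is 1 - exp(-t / MTTF).
p_single = 1 - math.exp(-WINDOW_YEARS / MTTF_YEARS)

# The cluster sees at least one failure unless every machine survives.
p_any = 1 - (1 - p_single) ** NODES

print(f"single-machine failure probability: {p_single:.3f}")   # ~0.221
print(f"probability of >=1 failure in the cluster: {p_any:.4f}")  # ~0.9997
```

With these assumptions a failure somewhere in the cluster is nearly certain, which is why trivially replaceable nodes matter so much.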
OK, so clusters are free or cheap and can be very large and highly available... why doesn't everyone use a cluster? Well, there are
problems too:
With a few exceptions, network hardware is not designed for parallel processing. Typically, latency is very high and
bandwidth relatively low compared to SMP and attached processors. For example, SMP latency is generally no more
than a few microseconds, but is commonly hundreds or thousands of microseconds for a cluster. SMP communication
bandwidth is often more than 100 MBytes/second, whereas even the fastest ATM network connections are more than
five times slower.
There is very little software support for treating a cluster as a single system.
Thus, the basic story is that clusters offer great potential, but that potential may be very difficult to achieve for most
applications. The good news is that there is quite a lot of software support that will help you achieve good performance for
programs that are well suited to this environment, and there are also networks designed specifically to widen the range of
programs that can achieve good performance.
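The latency and bandwidth figures above translate into the classic linear message-cost model, time = latency + size / bandwidth. A small sketch (the SMP and cluster numbers are rough orders of magnitude taken from the discussion above, not measurements):

```python
def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Linear cost model: startup latency plus serialization time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s

# Assumed orders of magnitude, per the text: SMP latency of a few
# microseconds at ~100 MB/s, cluster latency of hundreds of microseconds
# at several times lower bandwidth.
SMP = dict(latency_s=2e-6, bandwidth_bytes_per_s=100e6)
CLUSTER = dict(latency_s=500e-6, bandwidth_bytes_per_s=15e6)

for size in (64, 64 * 1024):
    t_smp = transfer_time(size, **SMP)
    t_cluster = transfer_time(size, **CLUSTER)
    print(f"{size:>6} bytes: SMP {t_smp * 1e6:8.1f} us, "
          f"cluster {t_cluster * 1e6:8.1f} us "
          f"({t_cluster / t_smp:.0f}x slower)")
```

Small messages are dominated by the startup latency, so the cluster penalty is largest exactly where fine-grained parallel programs communicate most.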
*SMP: Multiple processors were once the exclusive domain of mainframes and high-end servers. Today, they are common in all
kinds of systems, including high-end PCs and workstations. The most common architecture used in these devices is symmetrical
multiprocessing (SMP). The term "symmetrical" is both important and misleading. Multiple processors are, by definition,
symmetrical if any of them can execute any given function.
Most HPC applications fall into one of two scenarios: Data Analysis and Data Generation. In the former, large input datasets,
often taken from some scientific instrument, are processed to identify patterns and/or produce aggregated descriptions of the
input. This is the most common scenario for seismic processing, as well as similarly structured analysis applications such as
microarray data processing or remote sensing. In the latter scenario, small input datasets (parameters) are used to drive
simulations that generate large output datasets, often time-sequenced, that can be further analyzed or visualized. Examples
here include crash analysis, combustion models, weather prediction, and the computer graphics rendering applications used to
generate special effects and full-feature animated films.
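Both scenarios reduce to the same scatter/gather pattern: partition the data or parameter space, farm the pieces out to nodes, and combine the partial results. A minimal single-machine sketch, where a thread pool stands in for cluster nodes and `analyze_chunk` is a hypothetical per-node analysis step:

```python
from multiprocessing.dummy import Pool  # thread-based Pool, standing in for cluster nodes

def analyze_chunk(chunk):
    """Hypothetical per-node analysis: summarize one partition of the input."""
    return min(chunk), max(chunk), sum(chunk) / len(chunk)

# Scatter: split a large input dataset into per-node partitions.
data = list(range(100_000))
n_nodes = 4
chunks = [data[i::n_nodes] for i in range(n_nodes)]

# Farm the partitions out in parallel, then gather the partial results.
with Pool(n_nodes) as pool:
    partials = pool.map(analyze_chunk, chunks)

# Reduce the per-node summaries into a global one.
overall_min = min(p[0] for p in partials)
overall_max = max(p[1] for p in partials)
print(f"min={overall_min}, max={overall_max}")
```

On a real cluster the same shape appears with MPI or a batch scheduler in place of the pool; the point is that each partition is processed independently, which is what makes these workloads cluster-friendly.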
Figure 1
Yet until recently, these storage systems, implemented using traditional SAN and NAS architectures, have only been able to
support modest-sized clusters, typically no more than 32 or 64 nodes.
Figure 2
Each of these traditional approaches has its limitations in high performance computing scenarios. SAN architectures improve on
the DAS model by providing a pooled resource model for physical storage that can be allocated and reallocated to servers as
required. But data is not shared between servers, and the number of servers is typically limited to 32 or 64. NAS architectures
afford file sharing to thousands of clients, but run into performance limitations as the number of clients increases.
While these storage architectures have served the enterprise computing market well over the years, cluster computing
represents a new class of storage system interaction, one requiring high concurrency (thousands of compute nodes) and high
aggregate I/O. This model pushes the limits of traditional storage systems.
A newer class of clustered storage systems delivers gigabytes per second of aggregate throughput, all in a single global
namespace with dynamic load balancing and data redistribution. These systems extend current SAN and NAS architectures, and
are being offered by Panasas ActiveScale, Cluster File Systems Lustre, RedHat Sistina GFS, IBM GPFS, SGI CxFS, Network
Appliance SpinServer, Isilon IQ, Ibrix Fusion, TerraScale Terragrid, ADIC StorNext, Exanet ExaStore, and PolyServe Matrix.
These solutions use the same divide-and-conquer approach as scale-out computing architectures: spreading data across the
storage cluster, enhancing data and metadata operation throughput by distributing load, and providing a single point of
management and a single namespace for a large, high-performance file system.
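The benefit of spreading data across a storage cluster shows up in a back-of-the-envelope throughput model: striping lets clients read from many storage nodes at once, so aggregate bandwidth scales with node count until some shared resource becomes the bottleneck. The numbers below are illustrative assumptions, not vendor figures:

```python
PER_NODE_MB_S = 100        # assumed throughput of one storage node (MB/s)
FABRIC_LIMIT_MB_S = 20000  # assumed ceiling imposed by the network fabric (MB/s)

def aggregate_throughput(storage_nodes):
    """Ideal striped throughput: scales linearly, capped by the fabric."""
    return min(storage_nodes * PER_NODE_MB_S, FABRIC_LIMIT_MB_S)

for nodes in (1, 32, 64, 256, 512):
    print(f"{nodes:4d} storage nodes -> {aggregate_throughput(nodes):6d} MB/s")
```

The same model explains why a single NAS head or a 32-node SAN tops out quickly: with one serving node, throughput never rises above the first line of the table.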
Conclusion
With the advent of cluster computing technology and the availability of low-cost cluster solutions, more research computing
applications are being deployed in a cluster environment rather than on a single shared-memory system. High-performance
cluster computing is more than just having a large number of computers connected with high-bandwidth, low-latency
interconnects. To achieve the intended speedup and performance, the application itself has to be well parallelized for the
distributed-memory environment.
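The point about parallelization can be made precise with Amdahl's law: if a fraction s of the program is inherently serial, the speedup on N nodes is bounded by 1 / (s + (1 - s) / N). A quick sketch (the serial fractions chosen are illustrative):

```python
def amdahl_speedup(serial_fraction, n_nodes):
    """Upper bound on speedup when part of the program cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_nodes)

for s in (0.01, 0.05, 0.25):
    row = ", ".join(f"{n} nodes -> {amdahl_speedup(s, n):5.1f}x"
                    for n in (16, 64, 256))
    print(f"serial fraction {s:.0%}: {row}")
```

Even 5% serial code caps a 256-node cluster below a 20x speedup, which is why buying more interconnected machines cannot substitute for parallelizing the application itself.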
Points of Interest
My final-year thesis project is on the "Development of Grid Resource Allocation and Management (GRAM) for High Performance
Computing", which is closely related to Grid Computing. Before writing this article, I studied Grid Computing and Cluster
Computing in depth and understood the differences between them. After studying, I planned to write an article on what cluster
computing is, what its benefits, limitations, and scope are, and how Cluster Computing differs from Grid Computing.
License
This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves.
If in doubt please contact the author via the discussion board below.
A list of licenses authors might use can be found here