What is a Beowulf?
Beowulf is a way of building a supercomputer out of a bunch of smaller computers. The smaller computers are connected together by a LAN
(local area network), which is usually an Ethernet. Historically, these smaller computers were cheap machines, even Pentium and
486 class boxes running Linux! That was the appeal: taking cheap hardware that nobody wanted and using it to build something
resembling a supercomputer.
It's said that the Beowulf idea enables a modest university department or small research group to obtain a computer that can
operate in the gigaflop range (a billion floating-point operations per second). Normally, only deep-pocketed organizations like IBM, AT&T
and the NSA can afford such awesome computational power.
In the taxonomy of parallel computing, a Beowulf cluster sits somewhere between a massively parallel processor (MPP) and a network of
workstations (NOW).
Why Beowulf?
Besides getting the equivalent of a supercomputer for a fraction of the cost, a Beowulf cluster offers some other benefits:
- Decreased vendor dependency. With no proprietary hardware or software, you have full control of the system and can scale it or
  modify it however you please without paying for corporate assistance. You don't have to restrict yourself to a single
  vendor's hardware.
- Most Beowulf clusters run the Linux operating system, known for its stability and speed. Also known for its price (free).
- Scalability. There are almost no limits to how large a Beowulf can be scaled. If you find a 486 sitting out by the dumpster, waiting to
  be picked up, grab it and simply add it to your cluster.
- Because Beowulfs run the same operating system (Linux), you can be reasonably confident that anything you write for one Beowulf will run correctly on other Beowulfs.
History of the Beowulf
The first Beowulf was developed in 1994 at the Center of Excellence in Space Data and
Information Sciences (CESDIS), a contractor to NASA at the Goddard Space Flight Center in
Greenbelt, Maryland. It was originally designed by Don Becker and Thomas Sterling and
consisted of 16 Intel DX4 processors connected by 10 Mbit/s Ethernet.
Beowulf was built by and for researchers with parallel programming experience. Many of
these researchers have spent years fighting with MPP vendors and system administrators
over detailed performance information, and struggling with underdeveloped tools and new
programming models. This led to a "do-it-yourself" attitude. Another reality they
faced was that access to a large machine often meant access to a tiny fraction of the
resources of the machine shared among many users. For these users, building a cluster
that they can completely control and fully utilize results in a more effective, higher-performance
computing platform. The realization is that learning to build and run a
Beowulf cluster is an investment; learning the peculiarities of a specific vendor only
enslaves you to that vendor.
These hard-core parallel programmers are first and foremost interested in high-performance computing
applied to difficult problems. At Supercomputing '96, both NASA and DOE demonstrated clusters costing
less than $50,000 that sustained greater than one gigaflop/s. A year later, NASA
researchers at Goddard Space Flight Center combined two clusters for a total of 199 P6 processors and
ran a PVM version of a PPM (Piece-wise Parabolic Method) code at a sustained rate of 10.1 Gflop/s. In the
same week (in fact, on the floor of Supercomputing '97), Caltech's 140-node cluster ran an N-body problem
at a rate of 10.9 Gflop/s. This does not mean that Beowulf clusters are supercomputers; it just means one
can build a Beowulf that is big enough to attract the interest of supercomputer users.
Beyond the seasoned parallel programmer, Beowulf clusters have been built and used by programmers with
little or no parallel programming experience. In fact, Beowulf clusters provide universities, often with
limited resources, an excellent platform for teaching parallel programming courses, as well as cost-effective
computing for their computational scientists. The startup cost in a university setting is minimal for
the usual reasons: most students interested in such a project are likely to be running Linux on their own
computers, and setting up a lab and learning to write parallel programs is part of the learning experience.
In the taxonomy of parallel computers, Beowulf clusters fall somewhere between MPPs (Massively Parallel
Processors, like the nCube, CM5, Convex SPP, Cray T3D, Cray T3E, etc.) and NOWs (Networks of
Workstations). The Beowulf project benefits from developments in both these classes of architecture.
MPPs are typically larger and have a lower-latency interconnect network than a Beowulf cluster.
Programmers are still required to worry about locality, load balancing, granularity, and communication
overheads in order to obtain the best performance. Even on shared memory machines, many programmers
develop their programs in a message passing style. Programs that do not require fine-grain computation
and communication can usually be ported and run effectively on Beowulf clusters. Programming a NOW is
usually an attempt to harvest unused cycles on an already installed base of workstations in a lab or on a
campus. Programming in this environment requires algorithms that are extremely tolerant of load balancing
problems and large communication latency. Any program that runs on a NOW will run at least as well on a
cluster.
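To make the message-passing style mentioned above concrete, here is a minimal sketch in C using MPI (the example itself is not from the Beowulf project, and the partial-sum computation is purely illustrative). Each process computes a local value, the workers send their values to process 0 with explicit messages, and process 0 adds them up; the programmer decides exactly what gets communicated and when, which is where locality, granularity, and communication overhead come into play.

    /* partial_sum.c -- a hypothetical message-passing sketch using MPI.  */
    /* Each process computes a local value; workers send it to rank 0,    */
    /* which accumulates the values and prints the total.                 */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, i;
        double partial, incoming, total;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of processes     */

        partial = (double) rank;                /* stand-in for real work  */

        if (rank != 0) {
            /* workers: one explicit send to rank 0 */
            MPI_Send(&partial, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        } else {
            total = partial;
            for (i = 1; i < size; i++) {
                /* rank 0: receive one value from each worker */
                MPI_Recv(&incoming, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD, &status);
                total += incoming;
            }
            printf("sum over %d processes = %g\n", size, total);
        }

        MPI_Finalize();
        return 0;
    }

With a typical MPI implementation this kind of program would be compiled with mpicc and launched with mpirun so that one process runs on each node of the cluster.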
A Beowulf class cluster computer is distinguished from a Network of Workstations by several subtle but
significant characteristics. First, the nodes in the cluster are dedicated to the cluster. This helps ease load
balancing problems, because the performance of individual nodes is not subject to external factors. Also,
since the interconnection network is isolated from the external network, the network load is determined
only by the application being run on the cluster. This eases the problems associated with unpredictable
latency in NOWs. All the nodes in the cluster are within the administrative jurisdiction of the cluster. For
example, the interconnection network for the cluster is not visible from the outside world, so the only
authentication needed between processors is for system integrity. On a NOW, one must be concerned
about network security. Another example is the Beowulf software that provides a global process ID. This
enables a mechanism for a process on one node to send signals to a process on another node of the system,
all within the user domain. This is not allowed on a NOW. Finally, operating system parameters can be
tuned to improve performance. For example, a workstation should be tuned to provide the best interactive
feel (instantaneous responses, short buffers, etc.), but in a cluster the nodes can be tuned to provide better
throughput for coarser-grain jobs because they are not interacting directly with users.
The Beowulf Project grew from the first Beowulf machine and likewise the Beowulf community has grown
from the NASA project. Like the Linux community, the Beowulf community is a loosely organized
confederation of researchers and developers. Each organization has its own agenda and its own set of reasons
for developing a particular component or aspect of the Beowulf system. As a result, Beowulf class cluster
computers range from a few nodes to several hundred nodes. Some systems have been
built by computational scientists and are used in an operational setting, others have been built as test-beds
for system research, and others serve as an inexpensive platform to learn about parallel programming.
Most people in the Beowulf community are independent do-it-yourselfers. Since everyone is doing their
own thing, the notion of central control within the Beowulf community just doesn't make sense.
The community is held together by the willingness of its members to share ideas and discuss successes
and failures in their development efforts. The mechanisms that facilitate this interaction are the Beowulf
mailing lists, individual web pages and the occasional meeting or workshop.
The future of the Beowulf project will be determined collectively by the individual organizations contributing
to it and by the future of mass-market commodity off-the-shelf (COTS) technology. As microprocessor technology continues
to evolve, higher-speed networks become cost-effective, and more application developers move to
parallel platforms, the Beowulf project will evolve to fill its niche.