During the last semester I wrote two papers for my Computer Architectures class. I spent quite a bit of time on them and have been thinking about posting them on my weblog for quite some time. I'm a bit worried about plagiarism, though, and I'm not sure what to do about it. I'm pretty sure that I can submit the paper to the automatic plagiarism-detection service that my university subscribes to, and I'll probably do that now that the paper is posted.
I'm also releasing this paper under the by-nc-sa (Attribution-NonCommercial-ShareAlike 2.0) license, so unless you can turn your paper in to your teacher with a by-nc-sa license displayed on it, you can't include this material in your paper without proper citation.
PLEASE NOTE: If you are considering plagiarizing this, please don't. If your teacher allows you to cite non-academic internet sources, then by all means borrow my ideas and cite me. What I would really suggest, though, is taking a look at my primary sources and then heading to your university library or computer system to consult them yourself. All of the ACM journal sources that I cited are available online if your university subscribes to the ACM Portal. This paper was thoroughly researched, but some late nights were involved in its production, so it is provided WITHOUT WARRANTY of correctness or anything like that. My RAID paper is probably a little better than this one.
This work is licensed under a Creative Commons License.
Clusters: Open Source Meets Commodity Hardware
May 6, 2005
Before the mid-1980s, most supercomputers were large, monolithic machines. Over time, the Top 500 Supercomputers list has seen clusters go from nonexistent to the dominant architecture; they currently hold 296 of the 500 slots (almost 60%). Compared to monolithic supercomputers such as those from Cray Research, clusters are extremely cheap for the performance they deliver. When that low cost is combined with cheap off-the-shelf hardware and open source software platforms, clusters can't help but improve and gain popularity.
Different Tools for Different Jobs
The definition of the word "cluster" varies greatly depending on the context in which it is used. Clusters are commonly used in high-availability situations where, for example, equipment must fail over gracefully or requests must be divided among the available hardware. For clustered application servers, this can be accomplished with simple round-robin DNS entries or with more complex load-balancing hardware or software.
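As a sketch of the simplest approach mentioned above, round-robin DNS amounts to publishing several A records for one hostname; the name and addresses below are invented purely for illustration:

```
; Hypothetical BIND zone fragment: three application servers share one
; name. The name server rotates the order of the A records between
; queries, spreading client requests across the machines.
app.example.com.    IN  A   192.0.2.10
app.example.com.    IN  A   192.0.2.11
app.example.com.    IN  A   192.0.2.12
```

This spreads load but is not true failover: a dead server's address keeps being handed out until its record is removed, which is why dedicated load-balancing hardware or software is used when graceful failover matters.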
Clusters are also used when data needs to be stored on multiple machines for the sake of redundancy or performance. MySQL, an open source database program, can use a cluster of computers for both data replication and load balancing in several configurations.
This paper will focus on the most popular form of clustering: clusters used for parallel computing, scientific computing, and simulation in educational, professional, and government organizations. More specifically, it will focus on the open source software available to make the construction and administration of clusters easier and more powerful.
A Brief History
Cluster computing has its roots in the mid-1980s, when developers wanted to tie together multiple computers in order to harness their collective power. In 1985, Amnon Barak developed MOS, the first predecessor to Mosix, which ran on a cluster of four Digital Equipment Corporation (DEC) PDP-11 computers. In 1986, DEC decided to try clustering for itself with VAXcluster. At the time, VAXcluster was able to take advantage of a much higher data rate of 70 Mbit/sec, but because of its proprietary interconnect it remained much more tightly coupled, while MOS and Mosix used token ring LAN technology. As Mosix was ported to other platforms and improved, it was also able to take advantage of advances in networking technology without its looser coupling being affected. Mosix relied on patches to the Unix kernel to allow processes to migrate among nodes in the cluster. Mosix was later ported to the Linux kernel by Moshe Bar, where it thrives as an open source project.
Beowulf came onto the scene in the early-to-mid 1990s with a huge splash. Beowulf allowed users to tie together large numbers of low-cost desktop machines (486 DXes at the time) rather than the specialized hardware used by Mosix and VAXcluster. The project originated at NASA's Goddard Space Flight Center and quickly led to a successful open source project as well as a successful business, Scyld Software, for some of the original developers.
PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) are standards for performing parallel operations. Both frameworks have language bindings (for example, FORTRAN, C, Perl, and other languages) that abstract the underlying standard into something easier to work with. The direction of MPI is steered by a group called the MPI Forum. Since the release of the initial specification, the MPI Forum has updated the specification to MPI 2.0, adding features and clarifying issues that were deemed important. Each cluster software, tool, or operating system vendor implements its own version of MPI, so cross-platform compatibility is not guaranteed, but porting between MPI implementations is quite possible.
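The message-passing model that MPI standardizes can be sketched without an MPI library at all. The toy below uses Python's standard multiprocessing module (not a real MPI implementation, which would normally be called from C or FORTRAN): a set of worker "nodes" each computes a partial result and passes it back to the coordinator as a message.

```python
# Message-passing sketch: each worker computes a partial sum of its
# slice of the data and sends the result back over a queue. This
# mirrors the scatter/compute/gather pattern common in MPI programs,
# but uses only the Python standard library.
from multiprocessing import Process, Queue

def worker(rank, chunk, results):
    # Each "node" works on its own chunk and sends back one message.
    results.put((rank, sum(chunk)))

if __name__ == "__main__":
    data = list(range(100))
    nprocs = 4
    results = Queue()
    # Scatter: deal the data out to nprocs workers round-robin.
    chunks = [data[i::nprocs] for i in range(nprocs)]
    procs = [Process(target=worker, args=(r, chunks[r], results))
             for r in range(nprocs)]
    for p in procs:
        p.start()
    # Gather: collect one partial result per worker, then reap them.
    partials = [results.get()[1] for _ in procs]
    for p in procs:
        p.join()
    print(sum(partials))  # 4950, the same answer as a serial sum
```

In a real MPI program the same roles would be played by `MPI_Send`/`MPI_Recv` or a collective like `MPI_Reduce`, with each rank potentially running on a different cluster node.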
In contrast to MPI, PVM software is usually provided at the PVM website in either source or binary form. From there, users can call the PVM library directly or through third-party bindings. PVM provides binaries for Windows, which can allow users to program parallel applications on a platform they may be more familiar with; however, most Beowulf clusters run standard Linux or some variant thereof. PVM also supports monolithic parallel computers such as Crays and other specialized machines. Further differences and similarities between MPI and PVM can be found in the paper Goals Guiding Design: PVM and MPI by William Gropp and Ewing Lusk.
In recent years, another tool called BProc (the Beowulf Distributed Process Space) has expanded the possibilities for parallel processing and for managing data between nodes. BProc allows a parallel job to be started on the main controlling node, with parallel-capable processes automatically migrated to child nodes. This paradigm is also used by Mosix and OpenMosix, which will be discussed later. BProc is an open source project available at bproc.sourceforge.net.
Parallel programs also need to take into account the amount of time that will be needed for preparation, cleanup, and merging of parallel data. Amdahl's law stipulates that the total execution time for a parallel program is equal to the parallel part of the problem divided by the number of nodes, plus the serial part of the program. Even if a cluster contains thousands of nodes, the amount of time it takes to execute the serial code remains constant.
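The consequence of Amdahl's law is easy to see with a quick sketch (the timing figures below are invented purely for illustration): once the parallel part has been divided enough ways, the fixed serial part dominates, capping the speedup at (serial + parallel) / serial.

```python
# Amdahl's law: total time on n nodes = serial time + parallel time / n.
# The 10 s / 990 s split is a made-up example workload.
def total_time(serial, parallel, nodes):
    return serial + parallel / nodes

serial, parallel = 10.0, 990.0
for n in (1, 10, 100, 1000):
    t = total_time(serial, parallel, n)
    print(f"{n:5d} nodes: {t:8.2f} s, speedup {(serial + parallel) / t:6.2f}x")
# No matter how many nodes are added, speedup can never exceed
# (serial + parallel) / serial = 100x for this workload.
```

With 1000 nodes the parallel portion costs under a second, yet the run still takes almost 11 seconds because the 10-second serial portion cannot be divided.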
How to Build a Beowulf
Large Beowulf clusters run complex simulations and sustain teraflops of computation. At the same time, small 4-16 node clusters are often used in educational settings to teach parallel processing design paradigms to Computer Science students as well as cluster design and implementation to Computer Engineering students.
College Beowulf clusters are often (but not always) composed of outdated computers and hand-me-down hardware. While extremely fast speeds cannot be obtained with these antiquated clusters, they are valuable for teaching and for observing the differences between a program or algorithm written for a single-processor machine and the same program or algorithm written for and run on a cluster.
There are several tools available for deploying a Beowulf cluster, but almost all require a basic installation of a compatible Linux distribution on either the master node or on the master and all child nodes. Scyld Software makes what is widely considered the easiest Beowulf software to install. All that is needed is a $3 CD containing an unsupported Scyld distribution for the master and each child node. Official copies with commercial support are also available directly from Scyld. Once the CD is booted on the master node, a simple installation menu is presented. After installing and configuring Scyld on the master node, insert a Scyld CD in each child node; the child nodes automatically get their configuration information from the master node and can run directly from CD.
Another popular package that runs on top of many modern RPM-based Linux distributions is OSCAR, the Open Source Cluster Application Resources project. OSCAR offers a very simple user interface for installing and configuring cluster software on the master node. Once that is accomplished, client nodes can be network-booted and the client software installed automatically. OSCAR also supports other installation and boot methods.
While many colleges take the small-cluster approach, Virginia Tech has taken advantage of the modern Macintosh platform and created a top-10 supercomputer for a fraction of traditional costs. Virginia Tech started out with desktop machines but now maintains a cluster of 1,100 Apple Xserve 1U servers running Mac OS X Server (based on an open source BSD-derived core called Darwin).
Another Approach to Clustering: OpenMosix
While most Beowulf clusters are dedicated to cluster-related tasks all the time, clustering does not have to be that way. OpenMosix is a set of patches to the Linux kernel plus a few userland tools for monitoring where processes are running and how efficiently. OpenMosix is extremely flexible: nodes can join or leave a cluster whenever they wish. Many programs and algorithms can take advantage of clustering through the automatic node migration built into OpenMosix. Whenever a new process is spawned or forked (as is common in traditional Unix-like software design), OpenMosix may choose to execute that process locally or on another node.
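The fork-per-task pattern that OpenMosix accelerates looks like ordinary Unix code; the sketch below (POSIX only, with a toy computation standing in for real work) contains no cluster awareness at all, which is exactly the point: under an OpenMosix-patched kernel, each CPU-bound child becomes a candidate for transparent migration to a less-loaded node.

```python
# Classic Unix fork-per-task pattern. On a stock kernel all four
# children run locally; under OpenMosix any of them may be migrated
# to another node without changing a line of this code.
import os

pids = []
for task in range(4):
    pid = os.fork()
    if pid == 0:
        # Child: do some CPU-bound work, then exit with a status code.
        total = sum(i * i for i in range(10_000))
        os._exit(0 if total > 0 else 1)
    pids.append(pid)  # Parent: remember each child's pid.

for pid in pids:
    _, status = os.waitpid(pid, 0)  # Parent: reap every child.
    assert os.WEXITSTATUS(status) == 0
print("all children finished")
```

Because migration decisions are made by the kernel per process, programs already structured around fork get cluster scaling "for free," while single-process, single-thread programs gain nothing.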
Many OpenMosix clusters are implemented in a head/client node configuration much like Beowulf clusters, but they are not limited to such configurations. Because OpenMosix is just a patch to the standard kernel, machines in a cluster can have multiple uses. They can run standard graphical window managers and be used as desktop machines while processes are migrated to them if they have computing cycles to spare. OpenMosix does an excellent job at making sure that client nodes still have enough resources to do whatever else they are doing in addition to cluster process execution.
In addition to the multi-use scenario, OpenMosix cluster nodes can run as true peers. For example, if 20 computers are connected to a dynamic cluster and all but a few of them are idle, processes from the machines being actively used can be automatically migrated for execution throughout the cluster. Similarly, if all computers are heavily used, virtually no process migration will occur, since execution will be quicker on the local machine. Likewise, if a 400 MHz desktop machine needs to do some complex calculations, then as long as the program is written in a way that can take advantage of process migration, those calculations can run extremely quickly on an idle 3 GHz machine. Many of the scenarios above are described in a Linux Journal article entitled Clusters for Nothing and Nodes for Free, but they also come from my experience building and experimenting with a 2-3 node OpenMosix cluster a few years ago.
Recently the OpenMosix community has embraced "instant clusters," the idea that any hardware with local network connections can become a cluster without interfering with its other uses. The OpenMosix website links to a page listing several open source "instant cluster" software projects. The most popular is ClusterKnoppix, a Linux distribution with OpenMosix installed that runs directly from CD-ROM. With as little as one CD burned for the master node, a 30-seat computer lab can instantly become a 30-node cluster without disturbing the operating systems installed on the hard drives.
To share data among nodes, OpenMosix uses the Cluster File System, a concept originally developed for the Mosix project as the Mosix File System. The file system was renamed after the Mosix project closed its source code and Moshe Bar and others began working, between 2001 and 2002, on the GPL-licensed code that would become OpenMosix. This cluster file system, along with the ability to run a cluster as peer nodes, gives OpenMosix quite an advantage over traditional monolithic systems and other cluster systems.
How Open Source Helps Clusters
While some computational clusters run on Windows, the vast majority run on top of an open source Linux distribution. The Linux kernel itself is open source, and depending on the distribution, all, most, or at least some of the operating system is open source. Some Linux distributions are open source without being free of cost, such as Red Hat Enterprise Linux, but there are many excellent distributions that are both open source and no-cost on which to run Beowulf, OpenMosix, or any other type of clustering software.
There are many open source applications that help users install, configure, and maintain clusters; several have been mentioned already, including OSCAR, Beowulf and OpenMosix themselves, various PVM and MPI implementations, BProc, and more. In addition to the tools already mentioned, there is a suite of open source utilities for OpenMosix called openMosixView. The programs in the suite provide visualization and graphical management of the cluster, visual feedback on processes, process migration, and per-node load, and also allow for logging and analysis of cluster performance.
There are many other interesting open source clustering projects that don't require a Beowulf or OpenMosix framework to run on. One of the most popular examples is distcc, a program that allows for distributed compilation of C or C++ code. Distcc is quite lightweight and does not require a shared filesystem; it only requires the child nodes to be running distcc in daemon mode.
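A minimal usage sketch, assuming distcc is installed on every machine involved (the hostnames and network range below are invented):

```shell
# On each helper node, start the distcc daemon and allow the build
# machine's network to connect:
distccd --daemon --allow 192.168.1.0/24

# On the machine driving the build, list the helpers, then ask make
# for more parallel jobs than there are local CPUs, since compile
# jobs will now run remotely as well:
export DISTCC_HOSTS='localhost node1 node2 node3'
make -j8 CC="distcc gcc"
```

Preprocessing and linking still happen on the local machine; only the compilation of individual translation units is farmed out, which is why no shared filesystem is needed.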
The Future of Clusters
While Robert Lucke considers openMosix the next generation of clustering software because of its flexibility, some of the most stunning advances are happening in the world of grid and distributed computing. Grid computing can mean different things to different people, but it generally extends computing platforms beyond a single location or geography.
The SETI@home project has managed to create a very powerful supercomputer by utilizing the spare CPU cycles of thousands of desktop machines spread throughout the world. The program usually runs as a screen saver so that it does not consume computing resources while the machine is being actively utilized. SETI@home and other projects are pushing the envelope of using spare processor cycles to tackle a task that would otherwise require large dedicated clusters or supercomputers.
While grid and distributed computing may take away part of the supercomputing market share held by clusters (particularly those built from commodity hardware and open source software), I believe that clusters are here to stay. Individual component prices continue to drop, network throughput is improving, and cluster software continues to evolve. Expect to hear even more about clusters over the next several years.
Top500 Supercomputer Sites, "Charts for November 2004 – Clusters (NOW)," April 2005, http://top500.org/lists/2004/11/overtime.php?c=5.
Top500 Supercomputer Sites, "Highlights from Top500 List for November 2004," April 2005, http://top500.org/lists/2004/11/trends.php.
J. Zawodny and D. Balling, High Performance MySQL, Sebastopol: O'Reilly and Associates, 2004, chaps. 7 and 8.
A. Barak et al., The Mosix Distributed Operating System: Load Balancing for Unix, Berlin: Springer-Verlag, 1993, pp. 1-18.
N. Kronenberg et al., "VAXcluster: a closely-coupled distributed system," in ACM Transactions on Computer Systems (TOCS), 1986, pp. 130-146.
The openMosix Project, "openMosix, an Open Source Linux Cluster Project," April 2005, http://openmosix.sourceforge.net/.
T. Sterling et al., How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters, Cambridge, Mass.: The MIT Press, 1999.
The Beowulf Project, "Beowulf.org: The Beowulf Cluster Site," April 2005, http://www.beowulf.org/.
The MPI Forum, "Message Passing Interface," April 2005, http://www-unix.mcs.anl.gov/mpi/.
Computer Science and Mathematics Division, Oak Ridge National Laboratory, "PVM: Parallel Virtual Machine," April 2005, http://www.csm.ornl.gov/pvm/pvm_home.html.
W. Gropp and E. Lusk, "Goals Guiding Design: PVM and MPI," in IEEE International Conference on Cluster Computing (CLUSTER'02), 2002, pp. 257-268.
E. Hendricks, "BProc: The Beowulf Distributed Process Space," in Proceedings of the 16th International Conference on Supercomputing, 2002, pp. 129-136.
P. Prins, "Teaching Parallel Computing Using Beowulf Clusters: A Laboratory Approach," in Journal of Computing Sciences in Colleges, 2004, pp. 55-61.
Linux Central, "CDROM with Scyld Beowulf," April 2005, http://linuxcentral.com/catalog/index.php3?prod_code=L000-089.
Open Source Cluster Application Resources, "OSCAR: Open Source Cluster Application Resources," April 2005, http://oscar.openclustergroup.org/.
A. Perry et al., "Clusters for Nothing and Nodes for Free," Linux Journal, vol. 2004, issue 123, July 2004.
Matt Croydon, "OpenMosix Success," April 2005, http://www.postneo.com/2002/11/20/openmosix-success.
openMosix, "Instant openMosix, The Fast Path to an openMosix Cluster," April 2005, http://openmosix.sourceforge.net/instant_openmosix_clusters.html.
ClusterKnoppix, "ClusterKnoppix: Main Page," April 2005, http://bofh.be/clusterknoppix/.
Open Source Initiative, "Open Source Initiative – The GPL: Licensing," April 2005, http://www.opensource.org/licenses/gpl-license.php.
openMosixView, "openMosixView: a cluster-management GUI," April 2005, http://www.openmosixview.com/index.html.
Martin Pool, "distcc: a fast, free distributed C/C++ compiler," April 2005,
R. Lucke, Building Clustered Linux Systems (Hewlett-Packard Professional Books), Upper Saddle River, New Jersey: Prentice Hall, 2004.
M. Holliday et al., "A Geographically-distributed, Assignment-structured, Undergraduate Grid Computing Course," in Proceedings of the 36th SIGCSE Technical Symposium on Computer Science Education, 2005, pp. 206-210.
The SETI@home Project, "SETI@home: Search for Extraterrestrial Intelligence at Home," April 2005, http://setiathome.ssl.berkeley.edu/.
G. Pfister, In Search of Clusters: The Coming Battle in Lowly Parallel Computing, Upper Saddle River, New Jersey: Prentice Hall, 1998, pp. 184-185.