Clusters: Open Source Meets Commodity Hardware


Posted

in

, ,

During the last semester I wrote two papers for my Computer Architectures class. I spent quite a bit of time on them and have been thinking about posting them on my weblog for quite some time. I’m a bit worried about plagarism though, and I’m not sure what to do about it. I’m pretty sure that I can submit it to the auto-plagarism-detector service that my university subscribes to, and I’m probably going to do that now that this paper is posted.

Secondly, I’m releasing this paper under the by-nc-sa (Attribution NonCommercial ShareAlike 2.0) license, so unless you can turn in your paper to your teacher with a by-nc-sa license displayed on it, you can’t include it in your paper without proper citation.

PLEASE NOTE: If you are considering plagarising this, please don’t. If your teacher allows you to cite non-academic internet sources, then by all means borrow my ideas and cite me. What I would really suggest doing is taking a look at my primary sources and then heading to your university library or computer system to consult them yourself. All of the ACM journal sources that I cited are available online if your university subscribes to the ACM Portal. This paper was thoroughly researched but there were some late nights involved in the production of it so it is provided WITHOUT WARRANTY against correctness or anything like that. My RAID paper is probably a little better than this one.

Creative Commons License
This work is licensed under a Creative Commons License.

Clusters:

Open Source Meets

Commodity Hardware

Matt Croydon

CMSC 311

May 6, 2005

Before the mid-1980's, most supercomputers were large, monolithic machines. Over time the Top 500 Supercomputers list has seen clusters go from non-existent to being the dominant architecture[1], currently representing 296 out of the top 500 slots (almost 60%)[2]. Compared to monolithic supercomputers such as those from Cray Research, clusters are extremely cheap for the amount of performance realized. When lower cost is combined with cheap off-the-shelf hardware and open source software platforms, clusters can't help but improve and gain popularity.

Different Tools for Different Jobs

The definition of the word “cluster” varies greatly depending on the context in which it is used. A cluster is commonly used in high availability situations when, for example, equipment must gracefully fail over or requests must be divided among the available hardware. For clustered application servers, this can be accomplished by simple round robin DNS entries or more complex load balancing hardware or software.

Clusters are also used when data needs to be stored on multiple machines for redundancy or performance sake. MySQL, an open source database program, can use a cluster of computers for both data replication as well as load balancing[3] in several configurations.

This paper will focus on the most common and most popular form of clustering: clusters used for parallel computing, scientific computing, and simulation in educational, professional, and government organizations. More specifically, it will focus on open source software that is available to make the construction and administration of clusters easier and more powerful.

A Brief History

Cluster computing has its roots in the mid 1980's when developers wanted to tie together multiple computers in order to harness their collective power. In 1985, Amnon Barak developed the first predecessor to Mosix called MOS that ran on a cluster of four Digital Equipment Corporation (DEC) PDP/11 computers[4]. In 1986, DEC decided try try clustering for themselves with VAXCluster[5]. At the time, VAXCluster was able to take advantage of a much higher data rate of 70mbit/sec[5] but because of the proprietary interconnect used, VAXCluster remained much more tightly coupled, while MOS and Mosix decided to use token ring LAN technology[4]. As Mosix was ported to other platforms and improved, it was also available to take advantage of advances in networking technology without its looser coupling being effected. Mosix relied on patches to the Unix kernel in order to allow processes to migrate among nodes in the cluster. Mosix was later ported to the Linux kernel by Moshe Bar[6], where it thrives as an open source project.

Beowulf came on to the scene in the early to mid 1990's with a huge splash. Beowulf allowed users to tie together large numbers of lower cost desktop hardware (486 DXes at the time) rather than the specialized hardware used by Mosix and VAXCluster.[7]. The project originated at NASA's Goddard Space Flight Center and quickly led to a successful open source project[8] as well as a successful business for some of the original developers called Scyld Software.

Beowulf Internals

PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) are standards for preforming parallel operations. Both frameworks have language bindings (for example FORTRAN, C, Perl, and other languages) that abstract the underlying standard in to something easier to work with. The direction of MPI is steered by a group called the MPI Forum. Since the release of the initial specification, the MPI Forum have updated the specification to MPI 2.0, adding features and clarifying issues that were deemed important[9] Each cluster software, tool, or operating system vendor implements their own version of MPI, so cross-platform compability is not guaranteed, but porting between MPI implementations is quite possible.

In contrast to MPI, PVM software is usually provided at the PVM website[10] in either source or binary form. From there users can call the PVM library directly or though third party bindings. PVM provides binaries for Windows, which can allow users to program parallel applications on a platform that they may be more familiar with. However, most Beowulf clusters run standard Linux or some variant thereof. PVM also supports monolithic parallel computers such as Crays and other specialized machines. Further differences and similarities between MPI and PVM can be found in the paper Goals Guiding Design: PVM and MPI by William Gropp and Ewing Lusk[11].

In recent years, another tool called BProc (the Beowulf Distributed Process Space) has expanded the abilities of parallel processing and management of data between nodes. BProc allows a parallel job to be started on the main controlling node and parallel-capable processes are automatically migrated to child nodes[12]. This paradigm is also used by Mosix and OpenMosix, which will be discussed later. BProc is an open source project available at bproc.sourceforge.net.

Parallel processes also need to take in to consideration the amount of time that will be needed for preparation, cleanup, and merging of parallel data. Amdahl's law[26] stipulates that the total execution time for a parallel program is equal to the parallel part of the problem divided by the number of nodes plus the serial part of the program. Even if a cluster contains thousands of nodes, the amount of time ti takes to execute the serial code is going to remain constant.

How to Build a Beowulf[7]

Large Beowulf clusters run complex simulations and crunch teraflops of information per second. At the same time, small 4-16 node clusters are often used in educational settings to teach parallel processing design paradigms to Computer Science students as well as cluster design and implementation to Computer Engineering students.

College Beowulf clusters are often (but not always) comprised of outdated computers and hand-me-down hardware. While extremely fast speeds cannot be obtained with these antiquated clusters, they are valuable in teaching and observing the differences between a program or algorithm written for a single processor machine and the same program/algorithm written for and run on a cluster.

There are several tools available for deploying a Beowulf cluster, but almost all require a basic installation of a compatible Linux distribution on either the mater node or the master and all child nodes. Scyld software makes what is widely considered the easiest to install Beowulf software. All that is needed is a $3 CD[14] containing an unsupported Scyld distribution for the master and each child node. Official copies with commercial support are also available directly from Scyld. Once the CD is booted on the mater note, a simple installation menu is presented. After installing and configuring Scyld on the master node, insert a Scyld CD in each child node and they automatically get their configuration information from the master node and the child nodes can run directly from CD.

Another popular package that runs on top of many modern RPM-based Linux distributions is OSCAR, the Open Source Cluster Application Resources project[15]. OSCAR offers a very simple user interface to install and configure cluster software on the master node. Once that is accomplished, client nodes can be Network booted and the client software automatically installed. OSCAR also supports other installation and boot methods.

While many colleges take the small cluster approach, Virginia Tech has taken advantage of the modern Macintosh platform and created a top 10 supercomputer for a fraction of traditional costs. Virginia Tech started out with desktop machines, but now maintains a cluster of 1100 Apple XServe 1U servers running Mac OS X server (based on an open source BSD-derived core called Darwin).

Another Approach to Clustering: OpenMosix

While most Beowulf clusters are dedicated to cluster-related tasks all the time, clustering does not have to be that way. OpenMosix is a set of patches to the Linux kernel and a few userland monitoring tools for keeping track of where processes are running and how efficiently. OpenMosix is extremely flexible. Nodes can join or leave a cluster whenever they wish. Many programs and algorithms can take advantage of clustering with the automatic node migration built in to OpenMosix. Whenever a new process is spawned or forked (as is common in traditional Unix-like software design) OpenMosix may choose to execute that process locally or on another node.

Many OpenMosix clusters are implemented in a head/client node configuration much like Beowulf clusters, but they are not limited to such configurations. Because OpenMosix is just a patch to the standard kernel, machines in a cluster can have multiple uses. They can run standard graphical window managers and be used as desktop machines while processes are migrated to them if they have computing cycles to spare. OpenMosix does an excellent job at making sure that client nodes still have enough resources to do whatever else they are doing in addition to cluster process execution.

In addition to the mutli-use scenario, OpenMosix cluster nodes can run as true peers. For example, if there are 20 computers currently connected to a dynamic cluster and all but a few of them are idle, processes from the machines being actively used can be automatically migrated for execution throughout the cluster. Similarly, if all computers are heavily used, virtually no process migration will occur since execution will be quicker on the local machine. Also, if 400MHz desktop machine needs to do some complex calculations, as long as the program is written in a way that can take advantage of process migration, those calculations could be run extremely quickly on an idle 3GHz machine. Many of the scenarios above are described in a Linux Journal article entitled Clusters for Nothing and Nodes for Free[16], but also come from my experiences building and experimenting with a 2-3 node OpenMosix cluster a few years ago[17].

Recently the OpenMosix community has embraced “instant clusters,” or the idea that any hardware with local network connections can become a cluster without interfering with its other uses. The OpenMosix website lists a page[18] with several open source “instant cluster” software projects. The most popular project is called ClusterKnoppix[19], a Linux distribution with OpenMosix installed on it that runs directly from CD-ROM. With a minimum of one CD burned on a master node, a 30 seat computer lab can instantly become a 30 node cluster without disturbing the operating system installed on the hard drives.

To share data among nodes, OpenMosix uses the Cluster File System, a concept originally developed for the Mosix project called the Mosix File System[4]. The file system was renamed after the Mosix project closed its source code and Moshe Bar and others began working on the GPL-licensed[20] code which would become OpenMosix between 2001 and 2002. This cluster file system along with the ability to run a cluster as peer-nodes gives OpenMosix quite an advantage over traditional monolithic and cluster systems.

How Open Source Helps Clusters

While some computational clusters run on Windows, the vast majority run on top of an open-source Linux distribution. The Linux Kernel itself is open source and depending on the Linux distribution, all, most, or at least some of the operating system is open source. Sometimes Linux distributions can be open source without being free (as in no cost) such as Red Hat Enterprise Linux. There are many excellent free (open source and no cost) Linux distributions to run Beowulf, OpenMosix, or any other type of clustering software on.

There are many open source applications that help users install, configure, and maintin clusters; many have been mentioned before. These include OSCAR, Beowulf and OpenMosix themselves, various PVM and MPI implmentations, BProc, and more. In addition to the tools already mentioned, there is a suite of open source utilies for OpenMosix called openMosixView[21]. The various programs included in the suite allow for visualization as well as graphical management of the cluster, visual feedback for processes, process migration, load per node, and also allow for logging and analysis of cluster performance.

There are many other interesting open source clustering projects that don't require a Beowulf or OpenMosix frame to run on. One of the most popular examples of this is distcc[22], a program that allows for distributed compilation of C or C++ code. Distcc is quite lightweight and does not require a shared filesystem, it just requires child nodes to be running distcc in daemon mode.

The Future of Clusters

While Robert Lucke considers openMosix the next generation of clustering software because of its flexibility[23], some of the most stunning advances are happening in the world of grid and distributed computing[24]. Grid computing can mean different things to different people, but generally extends computing platforms beyond location and geography.

The SETI@home project[25] has managed to create a very powerful supercomputer by utilizing the spare CPU cycles of thousands of desktop machines spread throughout the world. The program usually runs as a screen saver so that it does not consume computing resources while the machine is being actively utilized. SETI@home and other projects are pushing the envelope of using spare processor cycles to tackle a task that would otherwise require large dedicated clusters or supercomputers.

While grid and distributed computing may take away part of the supercomputing market share that clusters (and particularly those built on open source software using commodity hardware), I believe that clusters are here to stay. Individual component prices continues to drop, network throughput is improving, and cluster software continues to evolve. Expect to hear even more about clusters over the next several years.

References

[1] Top500 Supercomputer Sites, “Charts for November 2004 – Clusters (NOW),” April 2005, http://top500.org/lists/2004/11/overtime.php?c=5.

[2] Top500 Supercomputer Sites, “Highlights from Top500 List for November 2004,” April 2005, http://top500.org/lists/2004/11/trends.php.

[3] J. Zawodny and D. Balling, High Performance MySQL, Sebastapol: O'Reilly and Associates, 2004, chaps. 7 and 8.

[4] A. Barak et al, The Mosix Distributed Operating System: Load Balancing for Unix, Berlin: Springer-Verlag, 1993, pp. 1-18.

[5] N. Kronenberg et al, “VAXcluster: a closely-coupled distributed system,” in ACM Transactions on Computer Systems (TOCS), 1986, pp. 130-146.

[6] The openMosix Project, “openMosix, an Open Source Linux Cluster Project,” April 2005, http://openmosix.sourceforge.net/.

[7] T. Sterling et al, How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters, Cambridge, Mass: The MIT Press, 1999.

[8] The Beowulf Project, “Beowulf.org: The Beowulf Cluster Site,” April 2005, http://www.beowulf.org/.

[9] The MPI Forum, “Message Passing Interface,” April 2005, http://www-unix.mcs.anl.gov/mpi/.

[10] Computer Science and Mathematics Divison, Oak Ridge National Laboratory, “PVM: Parallel Virtual Machine,” April 2005, http://www.csm.ornl.gov/pvm/pvm_home.html.

[11] W. Gropp and E. Lusk, “Goals Guiding Design: PVM and MPI,” in IEEE International Conference on Cluster Computing (CLUSTER'02), 2002, pp. 257-268.

[12] E. Hendricks, “BProc: The Beowulf Distributed Process Space” in Proceedings of the 16
th
international conference on Supercomputing, 2002, pp. 129-136.

[13] P. Prins, “Teaching Parallel Computing Using Beowulf Clusters: A Laboratory Approach,”in Journal of Computing Sciences in College, 2004, pp. 55-61.

[14] Linux Central, “CDROM with Scyld Beowulf,” April 2005, http://linuxcentral.com/catalog/index.php3?prod_code=L000-089.

[15] Open Source Cluster Application Resources, “OSCAR: Open Source Cluster Application Resources”, April 2005, http://oscar.openclustergroup.org/.

[16] A. Perry et al, “Clusters for Nothing and Nodes for Free,” Linux Journal, Vol 2004, Issue 123, July, 2004.

[17] Matt Croydon, “OpenMosix Success,” April 2005, http://www.postneo.com/2002/11/20/openmosix-success.

[18] openMosix, “Instant openMosix, The Fast Path to an openMosix Cluster,” April 2005, http://openmosix.sourceforge.net/instant_openmosix_clusters.html.

[19] ClusterKnoppix, “ClusterKnoppix: Main Page,” April 2005 http://bofh.be/clusterknoppix/.

[20] Open Source Initiative, “Open Source Initiative – The GPL:Licensing” April 2005 http://www.opensource.org/licenses/gpl-license.php.

[21] openMosixView, “openMosixView: a cluster-management GUI,” April 2005, http://www.openmosixview.com/index.html.

[22] Martin Pool, “distcc: a fast, free distributed C/C++ compiler,” April 2005,

http://distcc.samba.org/.

[23] R. Lucke, Building Clustered Linux Systems (Hewlett-Packard Professional Books), Upper Saddle River, New Jersey: Prentice Hall, 2004.

[24] M. Holliday et al, “A Geographically-distributed, Assignment-structured, Undergraduate Grid Computing Course” in Proceedings of the 36th SIGCSE technical symposium on Computer science education, 2005, pp 206-210.

[25] The SETI@home Project, “SETI@home: Search for Extraterrestrial Intelligence at Home,” April 2005, http://setiathome.ssl.berkeley.edu/.

[26] G. Pfister, In Search of Clusters: The Coming Battle in Lowly Parallel Computing. Upper Saddle River, New Jersey: Prentice Hall, 1998. pp. 184-185.

Comments

96 responses to “Clusters: Open Source Meets Commodity Hardware”

  1. Steve Loughran Avatar
    Steve Loughran

    nice paper on clustering.
    I’d argue that one future for clustering involves virtualisation, in unix land that means Xen. This doesnt give you the absolute perf of a custom MPI app, but it means you can use a cluster to host any set of server images you want.

    -stve

  2. paper box Avatar

    I like your article very much, will often come over your websites afterwards , will hope to make a friend, websites inviting me too see that paper bag,thank you!

  3. Emily Avatar

    Great work!
    [url=http://hludddta.com/tohr/bkxz.html]My homepage[/url] | [url=http://uhhvhctj.com/ncou/mcqk.html]Cool site[/url]

  4. Vicky Avatar

    Great work!

  5. Gary Avatar

    Good design!

  6. Thomas Avatar
    Thomas

    Your site looks great! Please visit my homepage:

  7. Dwayne Avatar
    Dwayne

    Good site, nice design! Please visit my site too:

  8. Lawrence Avatar
    Lawrence

    This is a wonderful page for everyone! Please also visit my site:

  9. process simulation…

    Hi. Very nice blog. I\’ve been reading your other entries all day long..lol….

  10. New Jersey custom home…

    Your blog posts are insightful. I will take them into deep thought and consideration. Your point of view is very smart and intellectual. Charlie…

  11. Network Monitoring Software Windows Open Source…

    I don’t agree with you in 100%, but you covered some good points regarding this topic…

  12. how to become a national process server…

    Of course, Sphere will still continue to offer search results for all the latest blog content. foofighters learn to fly mp3 download. wily wonka games com. Interesting how to become a national pro……

  13. Book Reviews Guide Online…

    I couldn’t understand some parts of this article, but it sounds interesting…

  14. Visitor465 Avatar

    I have visited your site 596-times

  15. Visitor706 Avatar

    I could not find this site in the Search Engines index

  16. free music ipods Avatar

    Hi boys!e04499b0e25c8b9eae7ca13a10067334

  17. Orestis Avatar

    Interesting…

  18. f7031dcbc5a1…

    f7031dcbc5a1a82a2701…

  19. Shodaborb Avatar

    Other variant is possible also

  20. Jon Brink Avatar

    I love Wilson Audio. I actually own the very first production pair of Wilson Audio Tiny Tots (WATTs).

  21. laer sprog Avatar

    Can I put part of this on my page if I include a link back to this website?

  22. quick car insurance quotes Avatar

    Do you care if I reference some of this on my webpage if I include a link to this webpage?

  23. skinny jeans Avatar

    Tyvm for the helpful info! I would never have found this by myself!

  24. overnight prints promotional code Avatar

    TYVM, great job! Exactly what I needed.

  25. pedometer, pedometers, pedometer uk, step counter, step counters Avatar

    Thanks for posting this, it was very handy and helped very much

  26. St Francisville B&B Avatar

    Thanks for the great post! I would never have discovered this by myself!

  27. online world news Avatar

    Just thought I would comment and say great theme, did you design it yourself? It’s really awesome!

  28. backlinks kaufen Avatar

    I have wanted to write about something like this on my site and you gave me an idea. Thanks.

  29. Transformers Avatar

    When I click your RSS feed it just gives me a page of trash, is the problem on my side?

  30. pediatrician salary range Avatar

    TYVM, wonderful job! Just the thing I had to know.

  31. Directory Avatar

    Thought I would comment and say cool theme, did you design it yourself? It looks awesome!

  32. BMW Mini Cooper Parts Avatar

    I have wanted to post about something like this on my blog and you gave me an idea. Thank you.

  33. casa Avatar

    Can I reference part of this on my site if I include a link to this site?

  34. atlanta general contractors Avatar

    Thought I would comment and say great theme, did you design it yourself? It’s really great!

  35. http://listonlinetrends.com/k42141/bosch Avatar

    This is an amazing write-up! I simply have one question. How do you do it? Everything on this internet site is so good.

    Keep up the excellent work!

  36. darmowe porno Avatar

    I just wake up from nap and I’m already reading your blog. This signifies something! Really useful post. Thnx!

  37. miasto łęczyca Avatar

    This is truly good content and enlightening weblog, I praise what you’ve carried out right here, along with sharing excellent information with great tips and concepts, I’m really pleased to submit my comment on this weblog, many thank you’s towards the author.

  38. luxury car rental Avatar

    Just discovered this blog through Bing, what a pleasant shock!

  39. buy propecia Avatar

    I will know, many thanks for the information.

  40. filmy tube Avatar

    I feel that is an interesting point, it made me assume a bit. Thank you for sparking my thinking cap. Once in a while I am getting so much in a rut that I just feel like a record.

  41. limos hire Avatar

    How do you make a blog look this good! Email me if you can and share your wisdom. I’d be thankful!

  42. reverse phone search australia Avatar

    This really solved my problem, thank you!

  43. Last minute maroko Avatar

    Wow – now that’s perspective! I think we often react in agreement or disagreement because of our emotions, but hearing another side, passionately presented, really makes us think!

  44. football Avatar

    I like watching football on bbc football chanel. How about you people?

  45. HnUXymK Avatar

    LRZVEj

  46. wpolscemamymocneseo Avatar

    Seo cup from Poland

  47. pozycjonowanie w google Avatar

    Your articles are very inspiring indeed. I keep coming back!