
Successful Error

Posted: June 30th, 2005 | Author: | Filed under: Linux | 2 Comments »

Successful Error

I know why this error occurred (no network connectivity at the time), but regardless it’s an amusing error. It’s definitely better than seeing LI_ when your machine boots…


Moto Linux Smart(ish)phone

Posted: June 30th, 2005 | Author: | Filed under: Mobile | 4 Comments »

I was poking around the FCC website again this morning and ran across this Motorola Linux-based smartphone which was approved recently. The beginning of the manual seems to indicate that this particular model does not have Bluetooth, but I’m guessing that there’s another model out there that does, or BT is a feature that had to be cut from this model for whatever reason. This looks a lot like one of the Linux phones announced a year or so ago, though I’m not “with it” enough to remember exactly which one.

They do have a copy of the GPL (see Appendix 32) in the manual, make use of Gnomovision, and have a bit of a bizarre “offer to customers:”

Until September 30, 2006 you may request from Motorola the source code for any
Portions of this product which are licensed under the GNU General Public license by
writing to the following address:
Beijing Design Center, PCS
Motorola (China) Electronics Ltd.
No. 2 Dong San Huan Nan. Lu
Chao Yang District
Beijing, P.R. China
100022
Tel. 010-65642288
Fax: 010-65642299

I wonder if they would actually pony up the GPL’d code if you asked. What’s up with that date though? Is that admission of mobile phone shelf life? They do seem to be trying to do things right though. Anyway, just because this is approved doesn’t mean you should expect it on the shelves tomorrow. After looking closer at the external photos it looks like this is indeed one of the models that piqued my interest when it was announced.


Hello Sendo

Posted: June 30th, 2005 | Author: | Filed under: Mobile | 3 Comments »

Yesterday Jim broke the news that Sendo was in administration (from what I gather, it’s sort of like Chapter 11 in the US):

The affairs, business and assets of Sendo International Ltd, Sendo Holdings Plc, Sendo Telecommunications Limited and Sendo Limited are being managed by the Joint Administrators Simon Appell and Alastair Beveridge. The Joint Administrators act as agents of the companies without personal liability.

I can only guess that the Sendo X’s delays and lackluster sales have something to do with it. It’s really a shame too, because the Sendo X was an awesome little phone.

Today news comes that Motorola will be buying up large chunks of Sendo: R&D, intellectual property (read: patents), and some other stuff. Reuters UK, El Reg, The Inq, and so on.

Motorola hopes that the move will help them push more units in the UK, but that’s going to require more than shiny but vacuous flip-phones.


Quake On Mobiles: Details Please!

Posted: June 27th, 2005 | Author: | Filed under: Mobile | 2 Comments »

There are some 19 stories on Google News about Quake being released for “a new generation of 3D-enabled mobile handsets,” all of which seem to be a regurgitation of a press release.

Hey, guys, is it too much to ask for a little real content? Some actual information? Like, what are these snazzy new “3D-enabled mobile handsets” that I keep hearing about? I know that there are a lot of “3D-enabled mobile handsets” just hitting or about to hit the market, but which platform is this going to target?

The closest thing to an answer seems to be at the bottom of the press release:

The first 3D-enabled mobile phones will hit the market this summer (Northern hemisphere), with Quake Mobile available as an embedded game on the first of these handsets to be released by a Korean manufacturer in July.

Hmm, okay, so does that mean Quake is going to end up on a bunch of crappy flippy phones that I can safely ignore? I dunno.

The most information can probably be gleaned by following the little blurbs at the bottom of the press release. The game is published by Pulse Interactive, who offer more information on their E3 Best In Show page. The game was developed (ported?) by Bear Naked Productions. Quake is released under the GPL and has been ported to many different platforms, some of them mobile, but I haven’t seen a port of Quake to commodity phone hardware.

I think something like Commander Keen would be more fun on a standard old mobile phone than something like Quake. The thought of playing Quake on a mobile reminds me of how bad I am at playing Doom on Nintendo 64. I tend to suck at first person shooters on game consoles after years and years of keyboard+mouse play.

This whole Quake-in-your-pocket thing may be worth paying attention to. Or not.


Nokia Bluetooth Headset (HS-12W)

Posted: June 23rd, 2005 | Author: | Filed under: Mobile | 6 Comments »

Nokia Bluetooth Headset

I’ve been keeping a keen eye on the FCC website recently looking for approval of the Nokia 6682 (time is running out) as well as the Nokia 770. A few days ago I saw a version of Charlie from Nokia Japan get approved (interesting but not noteworthy since it was the 900/1800/1900 and I don’t see T-Mobile picking it up…). Today I stumbled upon an interesting new Bluetooth headset that looks almost like an iPod shuffle with stereo headphones.

I haven’t been quite as rabid about product announcements lately, so I might have missed this one fly by. It looks kinda neat, though how excited I would be about it might depend on its price point.


In Defense of the 770

Posted: June 16th, 2005 | Author: | Filed under: Linux, Mobile, Open Source | 50 Comments »

Nancy Gohring was able to get her hands on a Nokia 770 prototype the other day at a press event in Helsinki. She posted a review on Mobile Pipeline and cross posted a shorter review on Wi-Fi Network News. Every positive thing she said about the device hinted at the fact that she hated it because it was dog slow. One must remember that the processor behind the 770 is a 220MHz ARM, and that a 220MHz ARM just isn’t a 2GHz Pentium 4. Having said that, I’m pretty sure that the final device will be a bit zippier than the one she played with. I’m also pretty sure that the boot time will be resolved on the final device and will be quick enough.

I also think that an 800×480 high resolution screen in a device that size is quite an accomplishment and shouldn’t be overlooked. It’s not a 15 inch LCD, but it’s also extremely portable. While the operating system won’t officially include VoIP and instant messaging until the Internet Tablet 2006 update, there is a tutorial on porting Gaim to Maemo on the website, and I know of several people working on VoIP, SIP, and other communications programs for the platform. Silky, a secure internet chat client, has already been ported to Maemo, the development environment for devices like the 770. My guess is that there will be a handful of communications programs available for the 770 at launch time.

While the press junket might be the first time members of the press have been able to play with a Nokia 770, it has been shown (and I believe demoed) at the Linux World Summit in New York and Guadec 6 in Stuttgart.

In conclusion, I think there’s a ton of potential in this device. While the 770 is never going to be a fullsize notebook in terms of performance, I’m pretty sure that the speed and responsiveness will be better on the final device than the one Nancy was able to play with. There are a lot of people around the world working very hard to make sure that the Nokia 770 and the platform it is built on are as solid and fast as can be. I’m definitely looking forward to purchasing my developer device as soon as it’s available and testing a bunch of apps on the device.


Linux Device Drivers, 3rd Edition

Posted: June 14th, 2005 | Author: | Filed under: Linux, Open Source | 6 Comments »

You can find a copy of Linux Device Drivers, 3rd Edition on LWN. The work, current to Linux 2.6.10, is licensed under a Creative Commons Attribution-ShareAlike license. While it’s available for free, if you really dig it, grab a physical copy published by O’Reilly to support the authors and forward-thinking publisher.

I’m definitely not a hardcore kernel hacker, but it’s great to have this resource available in a form that I can graze on if I wanted to get my feet wet. Thanks so much to Thoron on #maemo for pointing this out.


WebCore for Series 60

Posted: June 13th, 2005 | Author: | Filed under: Linux, Mobile, Open Source | 6 Comments »

Dave Hyatt cites a Nokia press release stating that they’re going to bring WebCore to Series 60. WebCore is the rendering technology behind Apple’s browser, Safari. WebCore is based on the KHTML rendering engine used in Konqueror in KDE.

It will be interesting to see if Nokia have based WebCore for Series 60 on GTK+ WebCore, a port of WebCore using GTK+.

Regardless of how this shakes out, I think it’s a great move by Nokia and a definite win for the open source community.


Whoppix: Knoppix for H4x0rs

Posted: June 12th, 2005 | Author: | Filed under: Linux, Open Source | 16 Comments »

Whoppix is a Knoppix-based (which is in turn Debian-based) Live-CD with lots of, um, “penetration testing” tools included. It’s a heavier weight distro with a lot more tools than PHLAK, but what really amazed me was that my Linksys WPC11 worked just fine with Whoppix. I currently use my internal Broadcom wireless with NDISWrapper, but I’ve been trying off and on to get the PCMCIA card working so that I have options.

I’m a hardcore Gnome guy as of late, thanks to the wonders of Ubuntu, which does an amazing job of putting a lot of stuff in the “just works” category while getting out of my way. It had been a while since I had taken a look at KDE, and I’ve got to say that the latest release looks good, though I’m still going to stick with utilitarianism over candy coated widgets.

I will probably use my experience with Whoppix to motivate me to try the latest kernel to see if I can get my WPC11 working with my HP ze4430us laptop under Ubuntu. Currently my WPC11 is like a magic key that brings my system to a complete halt when I plug it in. I will definitely be keeping a copy of Whoppix in my bag though, as it has tons of great tools that can be very useful when doing penetration testing of in-house stuff.


A9 Search Beta Rocks!

Posted: June 10th, 2005 | Author: | Filed under: Web Services | 25 Comments »

A9 Search Beta

I may be the last person on the earth to notice this, but the beta of A9 search absolutely rocks. It’s Ajaxy, interactive, and it lets me choose what “stuff” I want to see in the search results. The Wikipedia checkbox is definitely welcome, as is the Feedster addon tab, as those are two places I often search.

It’s not perfect though (but it’s “beta”): the layout could be a little smarter by default, and I would really like the ability to rearrange window (column) location like you can with Google News. All in all though, I like it a lot. I don’t know if it’s enough to break my Google habit though.


Clusters: Open Source Meets Commodity Hardware

Posted: June 10th, 2005 | Author: | Filed under: Linux, Open Source, Projects | 96 Comments »

During the last semester I wrote two papers for my Computer Architectures class. I spent quite a bit of time on them and have been thinking about posting them on my weblog for quite some time. I’m a bit worried about plagiarism though, and I’m not sure what to do about it. I’m pretty sure that I can submit it to the auto-plagiarism-detector service that my university subscribes to, and I’m probably going to do that now that this paper is posted.

Secondly, I’m releasing this paper under the by-nc-sa (Attribution NonCommercial ShareAlike 2.0) license, so unless you can turn in your paper to your teacher with a by-nc-sa license displayed on it, you can’t include it in your paper without proper citation.

PLEASE NOTE: If you are considering plagiarising this, please don’t. If your teacher allows you to cite non-academic internet sources, then by all means borrow my ideas and cite me. What I would really suggest doing is taking a look at my primary sources and then heading to your university library or computer system to consult them yourself. All of the ACM journal sources that I cited are available online if your university subscribes to the ACM Portal. This paper was thoroughly researched but there were some late nights involved in the production of it so it is provided WITHOUT WARRANTY against correctness or anything like that. My RAID paper is probably a little better than this one.

This work is licensed under a Creative Commons License.

Clusters: Open Source Meets Commodity Hardware

Matt Croydon

CMSC 311

May 6, 2005

Before the mid-1980s, most supercomputers were large, monolithic machines. Over time the Top 500 Supercomputers list has seen clusters go from non-existent to being the dominant architecture[1], currently representing 296 out of the top 500 slots (almost 60%)[2]. Compared to monolithic supercomputers such as those from Cray Research, clusters are extremely cheap for the amount of performance realized. When lower cost is combined with cheap off-the-shelf hardware and open source software platforms, clusters can't help but improve and gain popularity.

Different Tools for Different Jobs

The definition of the word “cluster” varies greatly depending on the context in which it is used. A cluster is commonly used in high availability situations when, for example, equipment must gracefully fail over or requests must be divided among the available hardware. For clustered application servers, this can be accomplished by simple round robin DNS entries or more complex load balancing hardware or software.
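As a rough illustration of the round robin idea mentioned above, here is a minimal sketch of dividing requests among servers in a fixed rotation. The server addresses and request names are placeholders made up for the example, and a real round robin DNS setup rotates address records rather than running code like this.

```python
from itertools import cycle

# Hypothetical pool of application servers (placeholder addresses).
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def dispatch(requests, servers):
    """Pair each incoming request with the next server in a fixed rotation."""
    rotation = cycle(servers)  # endlessly repeats the server list in order
    return [(request, next(rotation)) for request in requests]


# Four requests over three servers: the fourth wraps back to the first server.
assignments = dispatch(["req-1", "req-2", "req-3", "req-4"], SERVERS)
for request, server in assignments:
    print(request, "->", server)
```

Note that plain rotation knows nothing about how busy each server is, which is one reason the more complex load balancing hardware or software exists.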

Clusters are also used when data needs to be stored on multiple machines for redundancy or performance sake. MySQL, an open source database program, can use a cluster of computers for both data replication as well as load balancing[3] in several configurations.

This paper will focus on the most common and most popular form of clustering: clusters used for parallel computing, scientific computing, and simulation in educational, professional, and government organizations. More specifically, it will focus on open source software that is available to make the construction and administration of clusters easier and more powerful.

A Brief History

Cluster computing has its roots in the mid-1980s, when developers wanted to tie together multiple computers in order to harness their collective power. In 1985, Amnon Barak developed the first predecessor to Mosix, called MOS, which ran on a cluster of four Digital Equipment Corporation (DEC) PDP-11 computers[4]. In 1986, DEC decided to try clustering for themselves with VAXCluster[5]. At the time, VAXCluster was able to take advantage of a much higher data rate of 70 Mbit/s[5], but because of the proprietary interconnect used, VAXCluster remained much more tightly coupled, while MOS and Mosix used token ring LAN technology[4]. As Mosix was ported to other platforms and improved, it was also able to take advantage of advances in networking technology without its looser coupling being affected. Mosix relied on patches to the Unix kernel in order to allow processes to migrate among nodes in the cluster. Mosix was later ported to the Linux kernel by Moshe Bar[6], where it thrives as an open source project.

Beowulf came on to the scene in the early to mid 1990s with a huge splash. Beowulf allowed users to tie together large numbers of lower-cost desktop machines (486 DXes at the time) rather than the specialized hardware used by Mosix and VAXCluster[7]. The project originated at NASA's Goddard Space Flight Center and quickly led to a successful open source project[8] as well as a successful business for some of the original developers called Scyld Software.

Beowulf Internals

PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) are standards for performing parallel operations. Both frameworks have language bindings (for example FORTRAN, C, Perl, and other languages) that abstract the underlying standard into something easier to work with. The direction of MPI is steered by a group called the MPI Forum. Since the release of the initial specification, the MPI Forum have updated the specification to MPI 2.0, adding features and clarifying issues that were deemed important[9]. Each cluster software, tool, or operating system vendor implements their own version of MPI, so cross-platform compatibility is not guaranteed, but porting between MPI implementations is quite possible.

In contrast to MPI, PVM software is usually provided at the PVM website[10] in either source or binary form. From there users can call the PVM library directly or through third party bindings. PVM provides binaries for Windows, which can allow users to program parallel applications on a platform that they may be more familiar with. However, most Beowulf clusters run standard Linux or some variant thereof. PVM also supports monolithic parallel computers such as Crays and other specialized machines. Further differences and similarities between MPI and PVM can be found in the paper Goals Guiding Design: PVM and MPI by William Gropp and Ewing Lusk[11].

In recent years, another tool called BProc (the Beowulf Distributed Process Space) has expanded the abilities of parallel processing and management of data between nodes. BProc allows a parallel job to be started on the main controlling node and parallel-capable processes are automatically migrated to child nodes[12]. This paradigm is also used by Mosix and OpenMosix, which will be discussed later. BProc is an open source project available at bproc.sourceforge.net.

Parallel programs also need to take into consideration the amount of time that will be needed for preparation, cleanup, and merging of parallel data. Amdahl's law[26] stipulates that the total execution time for a parallel program is equal to the execution time of the parallel part of the problem divided by the number of nodes, plus the execution time of the serial part of the program. Even if a cluster contains thousands of nodes, the amount of time it takes to execute the serial code is going to remain constant.
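To make the arithmetic concrete, here is a small sketch of Amdahl's law as stated above. The function names and the 10-second/90-second split are assumptions chosen purely for illustration, not measurements from any real cluster.

```python
def amdahl_time(serial_time, parallel_time, nodes):
    """Total run time: the serial part plus the parallel part split across nodes."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return serial_time + parallel_time / nodes


def speedup(serial_time, parallel_time, nodes):
    """How many times faster the job runs on `nodes` nodes than on one node."""
    one_node = amdahl_time(serial_time, parallel_time, 1)
    return one_node / amdahl_time(serial_time, parallel_time, nodes)


# Hypothetical job: 10 seconds of serial work, 90 seconds of parallelizable work.
for nodes in (1, 10, 100, 1000):
    total = amdahl_time(10.0, 90.0, nodes)
    print(f"{nodes:>5} nodes: {total:7.2f} s, speedup {speedup(10.0, 90.0, nodes):.2f}x")
```

With this split the total time can never drop below the 10 seconds of serial work, so the speedup stays under 10x no matter how many nodes are added.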

How to Build a Beowulf[7]

Large Beowulf clusters run complex simulations and perform trillions of floating-point operations per second (teraflops). At the same time, small 4-16 node clusters are often used in educational settings to teach parallel processing design paradigms to Computer Science students as well as cluster design and implementation to Computer Engineering students.

College Beowulf clusters are often (but not always) comprised of outdated computers and hand-me-down hardware. While extremely fast speeds cannot be obtained with these antiquated clusters, they are valuable in teaching and observing the differences between a program or algorithm written for a single processor machine and the same program/algorithm written for and run on a cluster.

There are several tools available for deploying a Beowulf cluster, but almost all require a basic installation of a compatible Linux distribution on either the master node or the master and all child nodes. Scyld Software makes what is widely considered the easiest to install Beowulf software. All that is needed is a $3 CD[14] containing an unsupported Scyld distribution for the master and each child node. Official copies with commercial support are also available directly from Scyld. Once the CD is booted on the master node, a simple installation menu is presented. After installing and configuring Scyld on the master node, insert a Scyld CD in each child node; the child nodes automatically get their configuration information from the master node and can run directly from CD.

Another popular package that runs on top of many modern RPM-based Linux distributions is OSCAR, the Open Source Cluster Application Resources project[15]. OSCAR offers a very simple user interface to install and configure cluster software on the master node. Once that is accomplished, client nodes can be network booted and the client software automatically installed. OSCAR also supports other installation and boot methods.

While many colleges take the small cluster approach, Virginia Tech has taken advantage of the modern Macintosh platform and created a top 10 supercomputer for a fraction of traditional costs. Virginia Tech started out with desktop machines, but now maintains a cluster of 1100 Apple Xserve 1U servers running Mac OS X Server (based on an open source BSD-derived core called Darwin).

Another Approach to Clustering: OpenMosix

While most Beowulf clusters are dedicated to cluster-related tasks all the time, clustering does not have to be that way. OpenMosix is a set of patches to the Linux kernel and a few userland monitoring tools for keeping track of where processes are running and how efficiently. OpenMosix is extremely flexible. Nodes can join or leave a cluster whenever they wish. Many programs and algorithms can take advantage of clustering with the automatic node migration built in to OpenMosix. Whenever a new process is spawned or forked (as is common in traditional Unix-like software design) OpenMosix may choose to execute that process locally or on another node.
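The "spawned or forked" design mentioned above can be sketched as follows. This is only an illustration of the traditional Unix fork pattern, assuming a Unix-like system with `os.fork`; the function names are made up for the example, and nothing here calls OpenMosix itself. On an ordinary kernel every child simply runs locally.

```python
import os


def partial_sum(start, stop):
    """CPU-bound piece of work: the sum of squares over [start, stop)."""
    return sum(i * i for i in range(start, stop))


def forked_sum_of_squares(n, workers=4):
    """Split range(n) into chunks and compute each chunk in a forked child."""
    step = (n + workers - 1) // workers  # chunk size, rounded up
    children = []
    for w in range(workers):
        start, stop = w * step, min((w + 1) * step, n)
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child process: compute one chunk, report it through the pipe, exit.
            try:
                os.close(read_fd)
                os.write(write_fd, str(partial_sum(start, stop)).encode())
                os.close(write_fd)
            finally:
                os._exit(0)
        # Parent process: keep only the read end and remember the child.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            total += int(pipe.read())  # read() returns once the child closes its end
        os.waitpid(pid, 0)             # reap the child so no zombie is left behind
    return total


if __name__ == "__main__":
    print(forked_sum_of_squares(1_000_000))
```

Each child is an independent process doing its own computation, which is the kind of process the paragraph above describes as a candidate for running on another node.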

Many OpenMosix clusters are implemented in a head/client node configuration much like Beowulf clusters, but they are not limited to such configurations. Because OpenMosix is just a patch to the standard kernel, machines in a cluster can have multiple uses. They can run standard graphical window managers and be used as desktop machines while processes are migrated to them if they have computing cycles to spare. OpenMosix does an excellent job at making sure that client nodes still have enough resources to do whatever else they are doing in addition to cluster process execution.

In addition to the multi-use scenario, OpenMosix cluster nodes can run as true peers. For example, if there are 20 computers currently connected to a dynamic cluster and all but a few of them are idle, processes from the machines being actively used can be automatically migrated for execution throughout the cluster. Similarly, if all computers are heavily used, virtually no process migration will occur since execution will be quicker on the local machine. Also, if a 400MHz desktop machine needs to do some complex calculations, as long as the program is written in a way that can take advantage of process migration, those calculations could be run extremely quickly on an idle 3GHz machine. Many of the scenarios above are described in a Linux Journal article entitled Clusters for Nothing and Nodes for Free[16], but also come from my experiences building and experimenting with a 2-3 node OpenMosix cluster a few years ago[17].

Recently the OpenMosix community has embraced “instant clusters,” or the idea that any hardware with local network connections can become a cluster without interfering with its other uses. The OpenMosix website lists a page[18] with several open source “instant cluster” software projects. The most popular project is called ClusterKnoppix[19], a Linux distribution with OpenMosix installed on it that runs directly from CD-ROM. With a minimum of one CD burned on a master node, a 30 seat computer lab can instantly become a 30 node cluster without disturbing the operating system installed on the hard drives.

To share data among nodes, OpenMosix uses the Cluster File System, a concept originally developed for the Mosix project called the Mosix File System[4]. The file system was renamed after the Mosix project closed its source code and Moshe Bar and others began working on the GPL-licensed[20] code which would become OpenMosix between 2001 and 2002. This cluster file system along with the ability to run a cluster as peer-nodes gives OpenMosix quite an advantage over traditional monolithic and cluster systems.

How Open Source Helps Clusters

While some computational clusters run on Windows, the vast majority run on top of an open-source Linux distribution. The Linux Kernel itself is open source and depending on the Linux distribution, all, most, or at least some of the operating system is open source. Sometimes Linux distributions can be open source without being free (as in no cost) such as Red Hat Enterprise Linux. There are many excellent free (open source and no cost) Linux distributions to run Beowulf, OpenMosix, or any other type of clustering software on.

There are many open source applications that help users install, configure, and maintain clusters; many have been mentioned before. These include OSCAR, Beowulf and OpenMosix themselves, various PVM and MPI implementations, BProc, and more. In addition to the tools already mentioned, there is a suite of open source utilities for OpenMosix called openMosixView[21]. The various programs included in the suite allow for visualization as well as graphical management of the cluster, visual feedback for processes, process migration, load per node, and also allow for logging and analysis of cluster performance.

There are many other interesting open source clustering projects that don't require a Beowulf or OpenMosix frame to run on. One of the most popular examples of this is distcc[22], a program that allows for distributed compilation of C or C++ code. Distcc is quite lightweight and does not require a shared filesystem; it just requires child nodes to be running distcc in daemon mode.

The Future of Clusters

While Robert Lucke considers openMosix the next generation of clustering software because of its flexibility[23], some of the most stunning advances are happening in the world of grid and distributed computing[24]. Grid computing can mean different things to different people, but generally extends computing platforms beyond location and geography.

The SETI@home project[25] has managed to create a very powerful supercomputer by utilizing the spare CPU cycles of thousands of desktop machines spread throughout the world. The program usually runs as a screen saver so that it does not consume computing resources while the machine is being actively utilized. SETI@home and other projects are pushing the envelope of using spare processor cycles to tackle a task that would otherwise require large dedicated clusters or supercomputers.

While grid and distributed computing may take away part of the supercomputing market share that clusters (particularly those built on open source software using commodity hardware) currently hold, I believe that clusters are here to stay. Individual component prices continue to drop, network throughput is improving, and cluster software continues to evolve. Expect to hear even more about clusters over the next several years.

References

[1] Top500 Supercomputer Sites, “Charts for November 2004 – Clusters (NOW),” April 2005, http://top500.org/lists/2004/11/overtime.php?c=5.

[2] Top500 Supercomputer Sites, “Highlights from Top500 List for November 2004,” April 2005, http://top500.org/lists/2004/11/trends.php.

[3] J. Zawodny and D. Balling, High Performance MySQL, Sebastopol: O'Reilly and Associates, 2004, chaps. 7 and 8.

[4] A. Barak et al, The Mosix Distributed Operating System: Load Balancing for Unix, Berlin: Springer-Verlag, 1993, pp. 1-18.

[5] N. Kronenberg et al, “VAXcluster: a closely-coupled distributed system,” in ACM Transactions on Computer Systems (TOCS), 1986, pp. 130-146.

[6] The openMosix Project, “openMosix, an Open Source Linux Cluster Project,” April 2005, http://openmosix.sourceforge.net/.

[7] T. Sterling et al, How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters, Cambridge, Mass: The MIT Press, 1999.

[8] The Beowulf Project, “Beowulf.org: The Beowulf Cluster Site,” April 2005, http://www.beowulf.org/.

[9] The MPI Forum, “Message Passing Interface,” April 2005, http://www-unix.mcs.anl.gov/mpi/.

[10] Computer Science and Mathematics Division, Oak Ridge National Laboratory, “PVM: Parallel Virtual Machine,” April 2005, http://www.csm.ornl.gov/pvm/pvm_home.html.

[11] W. Gropp and E. Lusk, “Goals Guiding Design: PVM and MPI,” in IEEE International Conference on Cluster Computing (CLUSTER'02), 2002, pp. 257-268.

[12] E. Hendricks, “BProc: The Beowulf Distributed Process Space,” in Proceedings of the 16th international conference on Supercomputing, 2002, pp. 129-136.

[13] P. Prins, “Teaching Parallel Computing Using Beowulf Clusters: A Laboratory Approach,” in Journal of Computing Sciences in College, 2004, pp. 55-61.

[14] Linux Central, “CDROM with Scyld Beowulf,” April 2005, http://linuxcentral.com/catalog/index.php3?prod_code=L000-089.

[15] Open Source Cluster Application Resources, “OSCAR: Open Source Cluster Application Resources”, April 2005, http://oscar.openclustergroup.org/.

[16] A. Perry et al, “Clusters for Nothing and Nodes for Free,” Linux Journal, Vol 2004, Issue 123, July, 2004.

[17] Matt Croydon, “OpenMosix Success,” April 2005, http://www.postneo.com/2002/11/20/openmosix-success.

[18] openMosix, “Instant openMosix, The Fast Path to an openMosix Cluster,” April 2005, http://openmosix.sourceforge.net/instant_openmosix_clusters.html.

[19] ClusterKnoppix, “ClusterKnoppix: Main Page,” April 2005, http://bofh.be/clusterknoppix/.

[20] Open Source Initiative, “Open Source Initiative – The GPL: Licensing,” April 2005, http://www.opensource.org/licenses/gpl-license.php.

[21] openMosixView, “openMosixView: a cluster-management GUI,” April 2005, http://www.openmosixview.com/index.html.

[22] Martin Pool, “distcc: a fast, free distributed C/C++ compiler,” April 2005, http://distcc.samba.org/.

[23] R. Lucke, Building Clustered Linux Systems (Hewlett-Packard Professional Books), Upper Saddle River, New Jersey: Prentice Hall, 2004.

[24] M. Holliday et al, “A Geographically-distributed, Assignment-structured, Undergraduate Grid Computing Course,” in Proceedings of the 36th SIGCSE technical symposium on Computer science education, 2005, pp. 206-210.

[25] The SETI@home Project, “SETI@home: Search for Extraterrestrial Intelligence at Home,” April 2005, http://setiathome.ssl.berkeley.edu/.

[26] G. Pfister, In Search of Clusters: The Coming Battle in Lowly Parallel Computing, Upper Saddle River, New Jersey: Prentice Hall, 1998, pp. 184-185.


Marklar

Posted: June 6th, 2005 | Author: | Filed under: Apple | 8 Comments »

MacMerc:

01:27 PM – HOT Here comes Intel: talking about processor transitions now – from 68k to PC. Apple is switching to Intel from PPC. “Time for a brain transplant.” 2006-2006 cited.


PalmOne LifeDrive: Cool But not $500 Cool

Posted: June 5th, 2005 | Author: | Filed under: Mobile | 7 Comments »

LifeDrive

Last week I played with a PalmOne LifeDrive at CompUSA and I’ve gotta say that it’s a nice little device. The screen resolution is definitely a little nicer than budget devices, but at 320×480 isn’t exactly bleeding edge. The design is quite pleasing too; it almost has the feel of a G5 desktop. I also like the ability to switch between portrait and landscape mode quickly. I can do that on my Dell Axim X30 but it’s definitely not as quick as on the LifeDrive. Landscape mode is also perfect for the included Blazer browser, which when coupled with Bluetooth or Wi-Fi makes for a good pocket-sized browser. Combine all that with a 4 gig microdrive and you’ve got a pretty nice little platform.

It’s a pretty nice little platform, but is it worth $500? I don’t think so. If you’re really worried about hauling around a bunch of data in your pocket, $500 can get you a 60 gig iPod photo and $50 to spare or a 20 gig Archos AV420. Granted neither of those offer PIM functions, but I’m not sure how compelling PIM + 4 gigs to spare is. I was also a little bummed at how pokey the responsiveness on the LifeDrive was. Click something, wait just a little bit, and there it is. I know that the OS doesn’t run from the MicroDrive but accessing photos and stuff requires a bit of drive spinning. It’s not that the LifeDrive felt slow when accessing photos or media, it felt a bit slow in general, even when doing something that didn’t involve the MicroDrive at all.

Another thing that got me is that it looks like the battery is internal and not user replaceable. Having been screwed by a CompUSA warranty and the flaky battery on the Tungsten E, I don’t think I would ever consider buying a Palm device without a user replaceable battery.

I could be wrong, Palm could have a big hit on their hands with the LifeDrive, but my guess is that they’ll have to drop the price point a hundred bucks or so before they really start moving units.


Apple Going Intel?

Posted: June 4th, 2005 | Author: | Filed under: Apple | 3 Comments »

I’ve got to say that I won’t believe this one until the man in jeans and a black turtleneck says so. Can the Mac really survive another platform jump? Then again, I’d probably subscribe to the $129 yearly operating system plan if it meant I could run OSX on my x86 hardware.

I’m inclined to believe it more now that Scoble says he got confirmation on the story. I can’t imagine the move being received very well by the developers paying top dollar to attend WWDC. It isn’t over till Sir Steve keynotes, but I’ll definitely be refreshing several Mac news pages like a madman.

We shall see…


RAID: Redundant Array of [Independent|Inexpensive] Disks

Posted: June 3rd, 2005 | Author: | Filed under: Projects | 18 Comments »

During the last semester I wrote two papers for my Computer Architectures class. I spent quite a bit of time on them and have been thinking about posting them on my weblog for quite some time. I’m a bit worried about plagiarism though, and I’m not sure what to do about it. I’m pretty sure that I can submit it to the auto-plagiarism-detector service that my university subscribes to, and I’m probably going to do that now that this paper is posted.

Secondly, I’m releasing this paper under the by-nc-sa (Attribution NonCommercial ShareAlike 2.0) license, so unless you can turn in your paper to your teacher with a by-nc-sa license displayed on it, you can’t include it in your paper without proper citation.

PLEASE NOTE: If you are considering plagiarizing this, please don’t. If your teacher allows you to cite non-academic internet sources, then by all means borrow my ideas and cite me. What I would really suggest doing is taking a look at my primary sources and then heading to your university library or computer system to consult them yourself. All of the ACM journal sources that I cited are available online if your university subscribes to the ACM Portal. This paper was thoroughly researched, but there were some late nights involved in the production of it, so it is provided WITHOUT WARRANTY against correctness or anything like that.

Creative Commons License
This work is licensed under a Creative Commons License.

Matt Croydon

CMSC 311

March 9, 2005

The term RAID originally stood for “Redundant Arrays of Inexpensive Disks” [1], although an effort has been made to replace Inexpensive with Independent [2] in order to deemphasize the importance of cost. In modern practice, the words can be used interchangeably, and in most computer-oriented contexts the meaning is commonly understood. RAID technology was developed to improve upon monolithic SLED (Single Large Expensive Disks) [1] devices. In addition to being large and expensive, these drives have fixed input and output levels and in the late 80’s and early 90’s were not keeping pace with the rest of semiconductor technology [2].

There are several discrete configurations or levels of RAID, each with its advantages and disadvantages. The various levels are conceptual, and not necessarily tied to a specific implementation. RAID can be accomplished on either the hardware or software level. Hardware-based RAID tends to provide higher overall performance while software-based RAID offers lower cost and greater flexibility.

Redundancy is required because as more disks are added to an array, the MTTF (Mean Time to Failure) [1] decreases sharply. Under a simple model, the MTTF for the array is the MTTF of each individual drive divided by the number of drives. For example, if each individual drive is rated for 30,000 hours and there are 100 disks in the array, the MTTF of the array is 300 hours, a far cry from the 30,000 hours that each unit is rated for [2].
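The division described above is easy to check by hand, but a short sketch makes the relationship explicit. The function name `array_mttf` is my own, and the model assumes independent drives with identical ratings and no redundancy, which is a simplification.

```python
def array_mttf(drive_mttf_hours, num_drives):
    """Array MTTF under the simple model: per-drive MTTF / number of drives.

    Assumes identical, independent drives and no redundancy.
    """
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    return drive_mttf_hours / num_drives


# The example from the text: 100 drives rated at 30,000 hours each.
print(array_mttf(30_000, 100))  # 300.0
```

Halving the number of drives doubles the result, which is the same effect discussed later when larger per-disk capacities allow smaller arrays.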

RAID Level 0 and JBOD

RAID 0 is not part of the original specification [3] and provides absolutely no redundancy; however, it does employ data striping. Data striping is an important concept in some RAID configurations. RAID 0 is often implemented in hardware controllers that also support other levels of RAID. RAID 0 allows extremely high write performance but does not significantly improve on read access time [2].
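As a rough illustration of data striping, the sketch below maps a logical block number to a disk and an offset on that disk. It assumes simple round-robin, block-level striping; real controllers choose their own stripe sizes and layouts, and the name `stripe_location` is hypothetical.

```python
def stripe_location(logical_block, num_disks):
    """Map a logical block to (disk index, block offset on that disk).

    Assumes round-robin block-level striping across num_disks disks.
    """
    return logical_block % num_disks, logical_block // num_disks


# Consecutive logical blocks land on consecutive disks.
for block in range(6):
    print(block, stripe_location(block, 3))
```

Because neighboring blocks sit on different disks, a large transfer can keep every disk busy at once, but losing any single disk loses part of every large file.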

The other non-redundant RAID technology is JBOD, which stands for “Just a Bunch of Disks” [4]. JBOD uses either RAID hardware or software to combine multiple disks so that they appear as one logical device to the operating system. JBOD allows for easy storage capacity expansion and is in common usage on both Windows and Linux platforms among others.

RAID Level 1

RAID 1 uses mirroring in order to achieve redundancy [3]. For every disk of data, there is a mirrored disk that contains an exact copy of the original disk [5]. While every write to the array has to be performed twice (first on the original drive, then on the mirrored drive), read speeds can be improved. Because there are two copies of the data, the drive that can retrieve the data quickest can be used. Both drives may also simultaneously serve read requests, thereby increasing the read speed. If one drive in a two drive array fails, the remaining drive can be used for reading and writing until the defective disk can be replaced. Once a new drive is placed in the array, data can be copied over and eventually mirroring once again takes place in real time.

RAID Level 2

RAID 2 uses the same ECC (Error Correcting Code) as ECC memory [2]. In addition to the data disks, a number of check disks are used to store the ECC data. If Hamming ECC is used, an array of 10 data disks would need 4 check disks and an array of 25 data disks would require 5 check disks [1]. The extra disks are required to be able to detect and repair an unrecoverable error. In RAID 2, data is striped bit by bit across the data disks while the ECC data is written to the check disks [1].
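The check disk counts quoted above are consistent with the usual bound for a single-error-correcting Hamming code, where r check bits can protect m data bits when 2^r >= m + r + 1. The sketch below applies that bound; the function name is my own and this is only one way to arrive at the figures cited in [1].

```python
def hamming_check_disks(data_disks):
    """Smallest r such that 2**r >= data_disks + r + 1.

    This is the standard bound for a single-error-correcting Hamming code.
    """
    r = 0
    while 2 ** r < data_disks + r + 1:
        r += 1
    return r


print(hamming_check_disks(10))  # 4
print(hamming_check_disks(25))  # 5
```

The overhead shrinks as the array grows: 4 check disks for 10 data disks is 40 percent extra, while 5 for 25 is only 20 percent.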

RAID Level 3

The next level of RAID assumes that most hardware or software RAID controllers will be able to detect an error. A single check disk can be used to recover from an error, so RAID 3 leaves the job of error detection to the controller and eliminates all but one of the check disks as compared to RAID 2 [1]. This strategy cuts down on cost without sacrificing redundancy as long as every bit on all of the other data disks and the check disk can be successfully read. The contents of the bad disk can be obtained by finding the parity of the disks that have not failed and comparing each bit to the parity of all of the disks as stored on the check disk. If the values are identical, the bad disk originally held a 0 in that position. If the values differ, it held a 1 [1].
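The comparison described above amounts to an exclusive-or: XORing the surviving data with the stored parity yields the missing data. Here is a minimal sketch that works on whole bytes rather than single bits. The helper names and sample bytes are made up for illustration.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR the bytes at each position across equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild(surviving_blocks, check_block):
    """Recover the one missing block from the survivors and the check block."""
    return parity(list(surviving_blocks) + [check_block])


# Three hypothetical data disks and the check disk computed from them.
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x0f"]
check = parity(data)

# Pretend disk 1 failed, then rebuild it from disks 0 and 2 plus the check disk.
recovered = rebuild([data[0], data[2]], check)
print(recovered == data[1])  # True
```

Note that the rebuild only works if the controller already knows which disk failed, which is exactly the assumption this level makes.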

RAID Level 4

RAID 4 also uses only one check disk but stripes data across the data drives in chunks rather than bit by bit. The check disk stores the parity information for each chunk of data. RAID 4 is very efficient for systems such as transaction processing that require many very small reads from the disk array. If the data is smaller than the storage chunk size, the array can furnish multiple requests simultaneously [6].

RAID Level 5

RAID 5 is the most commonly deployed configuration [7] in commercial settings and distributes the parity blocks evenly across all disks [2]. Because the data and parity are spread across all disks, RAID 5 excels at both small and large reads, and large writes. RAID 5 requires a “read-modify-write” [2] cycle to calculate and write parity information, so RAID 5 is less than optimal when it comes to many small writes. [2]
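To see why small writes are costly, consider what the read-modify-write cycle has to do: read the old data chunk and the old parity chunk, compute the new parity, then write both back. The sketch below assumes XOR parity and uses hypothetical helper names; it shows only the arithmetic, not the disk I/O.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data, new_data, old_parity):
    """New parity after overwriting one chunk, without reading the others.

    new_parity = old_parity XOR old_data XOR new_data
    """
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# A hypothetical three-chunk stripe and its parity.
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
old_parity = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])

# Overwrite chunk 1 and update the parity from the old values alone.
new_chunk = b"\xff\x00"
new_parity = updated_parity(stripe[1], new_chunk, old_parity)

# Same answer as recomputing parity across the whole updated stripe.
full = xor_bytes(xor_bytes(stripe[0], new_chunk), stripe[2])
print(new_parity == full)  # True
```

Each small logical write therefore turns into two reads and two writes, which is where the penalty comes from.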

Advanced RAID Configurations

There are several hybrid RAID configurations that, while not in the original RAID specification, can improve reliability and redundancy in certain situations. RAID 6 employs two distinct parity calculations for each chunk of data stored [4]. RAID 6 appears to be more theoretical than practical, as there are no guidelines for implementing it. RAID 6 differs from most RAID configurations in that it can recover from two unrecoverable errors, as long as the rest of the data and parity information can be read successfully.

While many combinations of RAID components are possible, only a few are common. These include RAID 10, RAID 50, and RAID 0+1. RAID 10 significantly improves reliability by providing “a stripe set across mirrored pairs” [7]. This means that RAID 10 can recover from two total failures as long as the failures are on opposite sides of the mirror. Similarly, RAID 50 combines two RAID 5 arrays. RAID 50 is extremely redundant and not practical for most purposes. RAID 0+1 simply constructs a RAID 1 array out of several RAID 0 arrays. In RAID 0+1, one disk failure brings down the mirror half of the array until the bad disk is replaced [7].
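One way to read the RAID 10 failure rule is that the array survives as long as no mirrored pair loses both of its members. The sketch below encodes that reading; it is my interpretation of the description above rather than anything taken from [7], and the disk numbering is made up.

```python
def raid10_survives(failed_disks, mirror_pairs):
    """True if every mirrored pair still has at least one working disk.

    Assumes a stripe set across two-disk mirrored pairs.
    """
    failed = set(failed_disks)
    return all(not (a in failed and b in failed) for a, b in mirror_pairs)


# Four hypothetical disks arranged as two mirrored pairs: (0, 1) and (2, 3).
pairs = [(0, 1), (2, 3)]

print(raid10_survives([0, 2], pairs))  # True: one failure in each pair
print(raid10_survives([0, 1], pairs))  # False: both halves of one mirror
```

Under this reading, whether a second failure is fatal depends entirely on which disk it hits.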

Increasing RAID Throughput

Many modern hardware RAID controllers contain onboard memory caches to speed up input and output. Caching of data and parity blocks was found to increase throughput in the early to mid 90’s [9]. The physical location of parity blocks in RAID 5 has been proven to influence throughput [10]. In their study, Lee and Katz determined that left-symmetric, extended-left-symmetric, and flat-left-symmetric parity configurations were the best for overall use [10]. The absolute best parity configuration for RAID 5 drives depends on the size and number of both reads and writes.

Strategies for Increased Reliability

There are several ways to increase RAID reliability, even in simple arrays. Because the different RAID levels are merely suggestions for how to accomplish redundancy, specific implementations may vary. For a simple 2 disk RAID 1 array, you have the option of placing both disks on one hardware controller or (if supported) you may place each disk on its own controller and have the two controllers coordinate mirroring [7]. In this configuration, the failure of any one RAID controller does not bring down the entire array.

Hybrid arrays (as discussed in the Advanced RAID Configurations section above) can also increase reliability by creating multi-tiered or multi-leveled arrays. Advanced configurations need to be used with caution, since the MTTF decreases as the total number of disks increases.

As per-disk capacity increases, it is possible to implement RAIDs with identical storage capacity while using fewer overall disks. If fewer disks are used, the MTTF increases. Unfortunately with increased storage capacity comes an increased need for storage, so decreasing the total number of disks in a RAID may not be possible.

RAID Today

In the early days of RAID research, SCSI was the only technology that easily allowed for RAID configurations. Today that is changing rapidly with the introduction of extremely large capacity IDE and Serial ATA drives as well as lower cost hardware controller cards for them. These lower costs of entry have allowed RAID to spread from university research labs and large corporations all the way down to home users seeking data protection. Many mid-range to high-end motherboards have a built-in IDE or Serial ATA RAID controller.

RAID technology is also being used extensively in large server farms and storage facilities. Elaborate collections of RAID arrays are often combined with network technology such as SAN (storage area networks) and NAS (network attached storage) to meet the always-on accessible-anywhere needs of today’s customers.

RAID has also become a built-in part of Microsoft’s Windows operating system and has been incorporated into the Linux kernel [11]. Software-based RAID further reduces entry costs, though generic IDE RAID controllers can be found in stores for well below $50. A better-known hardware RAID controller from Adaptec or others can range from $100 for IDE to several hundred dollars for advanced SCSI Ultra 160 controllers.

Conclusion

Using a RAID may lull users into a false sense of security. Most RAID configurations protect against only one unrecoverable error and usually require that every other bit be read successfully in order to recover the data. Just because a RAID is in use does not mean that users are invincible. Rigorous and recoverable backups should also be implemented in addition to the use of RAID technology.

With that caution in mind, RAID can provide redundancy that would not otherwise be available. If a specific RAID configuration is tailored to a specific profile (many small writes, continuous large reads, etc) a significant increase in throughput can be realized.

RAID, a technology that started out as graduate and Doctoral research projects, now powers a wide array of technology from home computers to large datacenters. RAID allows advanced research facilities and corporate databanks alike to achieve redundancy on collections of data that commonly reach terabytes and petabytes [12].

References

[1] D. Patterson, G. Gibson, and R. Katz, “A Case for Redundant Arrays of Inexpensive Disks (RAID),” in Proceedings of the 1988 ACM SIGMOD international conference on Management of data, 1988, pp. 109-116.

[2] P. Chen et al, “RAID: High-Performance, Reliable Secondary Storage,” ACM Computing Surveys, Vol 26, pp. 145-185, June 1994.

[3] M. Shnier, Ed., Dictionary of PC Hardware and Data Communications Terms, Sebastopol: O’Reilly and Associates, 1996, pp. 362-363.

[4] M. Shooman, Reliability of Computer Systems and Networks, New York: John Wiley and Sons, 2002, pp. 119-126.

[5] G. Gibson, Redundant Disk Arrays: Reliable, Parallel Secondary Storage, Cambridge: MIT Press, 1992.

[6] R. Jain et al. Eds., Input/Output in Parallel and Distributed Computing Systems, Boston: Kluwer Academic Publishers, 1996, pp. 106-108.

[7] C. Zacker and J. Rourke, PC Hardware: The Complete Reference, Berkeley: Osborne/McGraw Hill, 2001, pp. 606-613.

[8] PC Guide, “Multiple (Nested) RAID Levels”, March 2005, http://www.pcguide.com/ref/hdd/perf/raid/levels/mult.htm.

[9] J. Menon and J. Cortney, “The Architecture of a fault-tolerant cached RAID controller,” in Proceedings of the 20th annual international symposium on Computer architecture, 1993, pp. 76-87.

[10] E. Lee and R. Katz, “Performance consequences of parity placement in disk arrays,” in Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, 1991, pp. 190-199.

[11] I. Molnar, G. Oxman, and M. de Icaza, “Kernel Korner: The New Linux RAID Code,” Linux Journal, Vol 1997, Article No. 25, December, 1997.

[12] Los Alamos National Laboratories Networked Systems Research Team, “Announcements,” March 2005, http://public.lanl.gov/netsys/.


DLP on the Big Screen

Posted: June 3rd, 2005 | Author: | Filed under: Web Services | 7 Comments »

Last weekend I saw Star Wars: Revenge of the Sith in one of the handful of theatres (I count 4) in the DC-Virginia-Frederick-Baltimore metro area equipped with DLP (Digital Light Processing) technology. Just like Mike Washlesky at The Mac Observer, I was blown away. I first noticed the crispness and clarity when the first preview splash screen came up and was blown away by the effects and their digital projection throughout the movie. The movie won’t top my greats list but it was a lot of fun and great to see in digital.

Further reading:

  • Episode III Digital Theater List: Make sure you find the one digital theater, buy your tickets online, and show up early for a good seat.
  • DLPMovies: An excellent place to find your local DLP theater (if there is one). DLPMovies found more theaters in my area than the ones showing Star Wars, including one that is currently showing Madagascar in DLP.
  • DLP.com: A lot of marketing, but it boils down to amazing picture quality and an insane contrast ratio.
  • DLP Wikipedia entry: Excellent information as always.

SDLQuake on Maemo

Posted: June 3rd, 2005 | Author: | Filed under: Linux, Mobile, Open Source, Projects | 13 Comments »

SDLQuake on Maemo x86

Yep, it had to be done. Above you can see SDLQuake running on Maemo x86. I haven’t tried it on the ARM target but I heard that it or a port of it should run just fine on ARM. Between various emulators and game engines, it shouldn’t be hard at all to amuse yourself with a Nokia 770.

No changes were required for this x86 build. ./configure, make and run-standalone.sh ./sdlquake.


Python For Series 60: 1.1.0 Pre-Alpha

Posted: June 3rd, 2005 | Author: | Filed under: Mobile, Projects, Python | 9 Comments »

There’s a new version out, 1.1.0 Pre-Alpha. Grab the .SIS installer for first edition devices (3650, N-Gage, etc) or for 2nd edition devices. Don’t forget to pick up the first edition or second edition SDK.

I’ll read over the new API docs tonight and hope to find all kinds of juicy morsels.

Update: Erik Smartt fills in some details on his weblog. Thanks again to the whole Python for Series 60 team for all the hard work.


Embedded D-BUS

Posted: June 2nd, 2005 | Author: | Filed under: Linux, Open Source | 9 Comments »

I’ve written about D-BUS before, but I just wanted to say that I love what I’ve seen with what Maemo does with D-BUS. All kinds of great stuff from application launching to state change notification is done with D-BUS. I strongly believe that D-BUS is going to rock both on the desktop and on mobile devices. D-BUS provides the infrastructure needed to build something like Growl for localhost and should allow apps to communicate with each other without having to worry about the fine details. I expect to see lots of advancements involving D-BUS in the next year and it will definitely improve the Linux/Gnome experience.


High Tech Baltimore

Posted: June 2nd, 2005 | Author: | Filed under: Web Services | 8 Comments »

Baltimore Emerging Technology Center

A few weeks ago I saw a spot on TV about the Baltimore Emerging Technology Center, an early stage incubator for local high tech startups. They appear to house a wide range of high tech startups at 3 different locations in Baltimore. The current list of participants shows quite a bit of promise from biotech to IT services.

The ETC is funded by The Baltimore Development Corporation. Other cool stuff can be found at the Greater Baltimore Technology Council.

Looking around these sites definitely gave me a feel for the state of the art (so to speak) in high tech startups in Baltimore. It didn’t get the press that Northern Virginia did back in the dot com days, but things are definitely happening up there.