
Craft a load-balancing cluster with ClusterKnoppix

Using Knoppix-based LiveCDs, build your own supercomputing Linux cluster

Level: Introductory
Mayank Sharma (geeky_bodhi@yahoo.co.in), Freelance technical writer
22 Dec 2004

The cluster, a collection of computers that work together, is an important concept in leveraging computing resources because of its ability to transfer workload from an overloaded system (or node) to another system in the cluster. This article explains how to set up a load-balancing Linux cluster using Knoppix-based LiveCDs.

Supercomputer is a generic term for a computer that performs far better than an ordinary machine. A cluster is a collection of computers that are capable of (among other things) transferring workload from an overloaded unit to other computers in the cluster; this feature is called load balancing. In this article, you'll learn how to set up a load-balancing cluster. By balancing loads effectively, a cluster improves its efficiency and earns its place in the family of supercomputers.

To pass loads around, the computers in a cluster must be connected to each other. The computers in a cluster are called nodes, and a cluster is managed by one or more master nodes along with several drone (or slave) nodes. In a typical setup, the master node is where applications are initiated, and it's the master node's responsibility to migrate applications to the drones when required.

In this article, you'll see how to use a Knoppix-based LiveCD to set up your very own supercomputer. You have probably heard of LiveCDs; they're wonderful try-before-you-install complete Linux systems that boot off your CD drive. Since their inception, individuals and projects have used LiveCDs as their demonstration platform, and LiveCDs have come a long way since the early "DemoLinux" days. But first, some background on the supercomputing cluster.

What is a supercomputer?
A supercomputer is typically used for scientific and engineering applications that perform a large amount of computation, handle massive databases, or both. (The term supercomputer may also refer to systems that are much slower but still impressively quick.) In reality, most supercomputing systems are multiple interlinked computers that perform parallel processing following one of two general approaches:

SMP, or symmetric multiprocessing
MPP, or massively parallel processing

In SMP (also known as "tightly coupled" multiprocessing or a "shared everything" system), the processors share memory and the I/O bus or data path, and a single copy of the operating system controls them all. Sixteen processors is the usual upper limit in most SMP systems. SMP's advantage over MPP shows when performing online transaction processing (OLTP), in which many users employ a simple set of transactions to access the same database; dynamic workload balancing is the facility that allows SMP to shine at this task.

MPP systems (also known as "loosely coupled" or "shared nothing" systems) are characterized by a number of processors, each with its own operating system and memory, that process different parts of a single program at the same time. The system uses a messaging interface and a set of data paths that allow the processors to communicate with each other. Up to 200 processors can be focused on a single task. Setting up an MPP system can be complicated, since it requires a lot of planning to parcel out system resources and work assignments among the processors (remember, nothing is shared). MPP systems have an advantage in applications in which users need to search a tremendous number of databases at the same time.

The IBM Blue Pacific is a good example of the high end of supercomputers. The 5,800-processor, 3.9-teraflop system (with 2.6 trillion bytes of memory) was built in partnership with Lawrence Livermore National Laboratory to simulate the physics involved in nuclear reactions.

Clustering represents the lower end of supercomputing, a more build-it-yourself approach. One of the most popular and best-known examples is the Beowulf Project, which explains how to use off-the-shelf PC processors, Fast Ethernet, and the Linux operating system to handcraft a supercomputer. See the Resources section below for more information on the Beowulf Project.

Now that you have clustering in the proper context, I'll show you how to start the process to set up your own cluster.

The magic tools: ClusterKnoppix and openMosix


One more thing before you start building your cluster: a quick look at the main distribution you'll be using -- ClusterKnoppix -- and the cluster-enabling kernel extension it contains, openMosix.

The main distro: ClusterKnoppix


As the name implies, ClusterKnoppix is a derivative of Knoppix. ClusterKnoppix provides users with all the benefits of Knoppix (a plethora of applications, run-off-the-CD operation, magical automatic hardware detection, and support for many peripherals and devices) along with openMosix clustering capabilities. Links to more information on ClusterKnoppix and openMosix are in the Resources section. Various other openMosix CD-based distributions exist, including Bootable Cluster CD (BCCD), ParallelKnoppix, PlumpOS, Quantian, and CHAOS. But ClusterKnoppix is probably the most popular "master node" distribution, because it:

Provides a full-blown X server running KDE (among other desktops)
Offers various applications (such as GIMP)
Has adopted security enhancements from the CHAOS distribution (I will talk about CHAOS later in this article)

Managing programs across nodes: openMosix


Apart from the hardware and the link between the machines, you need software that can manage the programs spread across the various drone nodes. In a non-clustered computer, the operating system loads an application from the storage media (such as hard disks and CDs) into memory and sees to it that the application is executed to completion. I chose openMosix -- a Linux kernel extension designed for single-system-image clustering -- because it gives the OS (the Linux kernel, in our case) the ability to launch an application from any cluster node and execute it on any node in the cluster. Any given application is migrated to the node with the most available capacity or resources.

Once openMosix is installed, the nodes in the cluster start communicating, and the cluster continuously adapts itself to the workload by optimizing resource allocation. An openMosix feature, Auto Discovery, allows new nodes to be added while the cluster is operating. According to the openMosix Project, openMosix can scale to more than 1,000 nodes.

Building the cluster


For this I used two boxes. The master node was a Pentium 4 1.7-GHz box with 384 MB of RAM, part of which is shared with the onboard graphics. The drone is a Pentium III 997-MHz machine with 256 MB of dedicated RAM. Both have CD-ROM drives. They are connected with a standard crossover network cable and Realtek 10/100 Mbps LAN cards on both ends. If you have a home network with two (or more) computers wired together, then your setup is similar to mine.

You will also need ClusterKnoppix (the latest version as of this writing is clusterKNOPPIX V3.4-2004-05-10-EN-cl). ClusterKnoppix has the ability to sniff out drones as they boot on the network, but that requires special LAN cards and a BIOS that supports booting over the network. Because the cost of replicating CDs is minimal and you want X running on all the nodes, it's easiest to use as many ClusterKnoppix CDs as there are nodes in the cluster.

You can use the following network settings for the various nodes on the cluster:

Network -- 192.168.1.0
Netmask -- 255.255.255.0
Default Gateway -- 192.168.1.1
IP address of Master -- 192.168.1.10
IP address of Drone #1 -- 192.168.1.20

I won't go into detail on networking in Linux. There's a lot of information available; see Resources below.

Initializing the master node


openMosix doesn't require the first node initiated to be a master node, but just to keep things straight, set up the master node first.

1. Put the ClusterKnoppix CD in the drive and boot from it.
2. At the boot: prompt, press Enter. Give ClusterKnoppix time to detect your hardware and boot. By default it boots into KDE.
3. Once in, open a root shell. You'll find it inside the second item on the task bar.
4. Configure the local network interface. First, give your network card, eth0, an IP address: ifconfig eth0 192.168.1.10.
5. Next, specify the route it must follow to the gateway: route add -net 0.0.0.0 gw 192.168.1.1. That sets up your network.
6. Next, initiate the openMosix system: tyd -f init.
7. Last, announce this node as the master node in the cluster: tyd.

Next, you'll initialize the drone node (the master-node commands are recapped just below).
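To recap, steps 4 through 7 boil down to four commands, run in the root shell on the master (a convenience summary, assuming eth0 is your network card and the addresses given above):

ifconfig eth0 192.168.1.10              # assign the master's IP address
route add -net 0.0.0.0 gw 192.168.1.1   # point the default route at the gateway
tyd -f init                             # initialize the openMosix system
tyd                                     # announce this node as the master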

Initializing the drone node


Setting up a drone is not very different from setting up a master:

1. Repeat the first three steps used to initialize the master node.
2. Try configuring the network card on the drone yourself with the values mentioned previously: ifconfig eth0 192.168.1.20, then route add -net 0.0.0.0 gw 192.168.1.1.
3. Now, initialize the openMosix system, same as last time: tyd -f init.
4. Last, insert this node into the cluster: tyd -m 192.168.1.10.

That's it! Your cluster's up and running.
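For reference, here is the drone-side sequence in one place (the ping line is my own optional sanity check of the link to the master, not part of the original steps):

ifconfig eth0 192.168.1.20              # assign the drone's IP address
route add -net 0.0.0.0 gw 192.168.1.1   # same default route as the master
ping -c 3 192.168.1.10                  # optional: verify you can reach the master
tyd -f init                             # initialize the openMosix system
tyd -m 192.168.1.10                     # join the cluster via the master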

Getting familiar with tracking tools


You'll need to check the status of the cluster; ClusterKnoppix packs the following tools for tracking status:

openMosixview
Bring up the utility by typing its name in the root shell. It detects the number of nodes in the cluster and presents you with a nice, geeky-looking interface. At a glance, you can see the efficiency of the cluster, the load on the cluster, the memory available to the cluster, the percentage of memory used, and other information. You won't see much activity at this point, since the cluster is hardly being used. Spend some time familiarizing yourself with this application.

openMosixmigmon
This application shows the processes that have been migrated from the master node to the drone. Hover your mouse over one of the squares surrounding the circle in the center, and you'll be shown the name of that process and its ID. To migrate a particular process from the master, drag its square and drop it onto the smaller circle (the drone).


openMosixanalyzer
This simple application charts the load on the cluster, as well as on individual nodes, from initialization for as long as the cluster is up.

mosmon
This command-line-based monitor shows you the load on the cluster, the memory available, memory being used, and other things in real time. Review its man page to understand how you can tailor the view.

mtop
This tool is of interest to people who are familiar with top, which keeps track of each and every process running on the computer. mtop, a cluster-aware variant of top, also displays each and every process, along with the node on which each process is running.

Testing the cluster


Now that the cluster is up and running, it's time to overload it. To do this, you'll borrow a test program written by the good people of the CHAOS distribution:

Listing 1. Cluster test program

// testapp.c - program for testing load-balancing clusters
#include <stdio.h>
#include <stdlib.h>     // for exit()
#include <unistd.h>     // for fork()

int main() {
    unsigned int o = 0;
    unsigned int i = 0;
    unsigned int max = 255 * 255 * 255 * 128;

    // daemonize code (flogged from thttpd)
    switch ( fork() ) {
        case 0:
            break;
        case -1:
            // syslog( 1, "fork - %m" );
            exit( 1 );
        default:
            exit( 0 );
    }

    // incrementing counters is like walking to the moon:
    // it's slow, and if you don't stop, you'll crash.
    while (o < max) {
        o++;
        i = 0;
        while (i < max) {
            i++;
        }
    }
    return 0;
}

Open any text editor, copy this program, and save it as testapp.c. Make it available on all the nodes in the cluster. To run it, issue these commands on all the nodes. First, compile the C program:
gcc testapp.c -o testapp

Then, execute ./testapp. Run the program at least once on every node; I executed three instances on both nodes. After launching each instance, toggle back to the monitoring applications described above and notice the spurt of activity. Enjoy watching your drawing room cluster migrate processes from one node to another. Look Ma, it's balancing loads!
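Because testapp daemonizes itself (the fork() at the top of Listing 1), each invocation returns your shell prompt immediately; to reproduce my three instances per node, just run it three times:

gcc testapp.c -o testapp   # compile once on each node
./testapp                  # each run detaches and burns CPU in the background
./testapp
./testapp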

What did you just do?


Now that everything is up and running, let's review what you did. You started by configuring the network cards on the machines and giving them individual IP addresses. Then, you gave them a common route along which to talk. Finally, you initialized the openMosix system using the command tyd. (ClusterKnoppix borrows tyd, pronounced "tidy," from the CHAOS project.)

You execute tyd without any switches only on the first node in the cluster; this node doesn't have to be a master node. All subsequent nodes are added using the -m switch followed by an IP address. When initializing the second node, that IP address has to be the first node's. But when you initialize a third node, you can choose between two IP addresses: node one's or node two's.
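Expressed as commands, the join order looks like this (a sketch reusing this article's addresses; the third node at 192.168.1.30 is hypothetical, added only to illustrate the choice):

# first node initialized (here, the master at 192.168.1.10):
tyd -f init
tyd

# second node joins via the first:
tyd -f init
tyd -m 192.168.1.10

# a hypothetical third node may join via either existing node:
tyd -f init
tyd -m 192.168.1.10    # ...or: tyd -m 192.168.1.20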

What's next?
Instead of running ClusterKnoppix on both nodes, you can also set up a heterogeneous cluster once you get the hang of it. In such a cluster, apart from the master node, you don't need to run a GUI on the slaves; you can run a distribution that is openMosix-aware but little bigger than the Linux kernel itself. CHAOS is probably the most popular distribution to run on a drone node: it has a small memory footprint, which saves memory for the cluster, yet it's secure, reliable, and fast. So what are you waiting for? Show off with your drawing room cluster!

Resources
Beowulf clusters: e pluribus unum (developerWorks, September 2001) is a fine introduction to Beowulf-style clustering.
The tutorial Linux clustering with MOSIX (developerWorks, December 2001) explains what clustering is, how to cluster-enable a Linux system, and what benefits you derive from setting up a cluster.
Creating a WebSphere Application Server V5 cluster (developerWorks, January 2004) introduces clusters for load balancing and failover support and describes how to set up a cluster with IBM WebSphere Application Server for Linux.
Find information on Linux networking basics in the developerWorks tutorial series on Linux-powered networking.
The openMosix Project provides details and updates on this kernel extension.
The ClusterKnoppix site explains the distribution and offers an ongoing forum for posing questions.
Wikipedia offers lots of information on LiveCDs.
Find more resources for Linux developers in the developerWorks Linux zone.
Download no-charge trial versions of IBM middleware products that run on Linux -- including WebSphere Studio Application Developer, WebSphere Application Server, DB2 Universal Database, Tivoli Access Manager, and Tivoli Directory Server -- and explore how-to articles and tech support in the Speed-start your Linux app section of developerWorks.
Get involved in the developerWorks community by participating in developerWorks blogs.
Browse for books on these and other technical topics.

About the author


Mayank Sharma has been writing about technology, especially free and open source software, for the past five years. He helped launch South Asia's leading FLOSS monthly, LINUX For You, as its Assistant Editor, and is currently busy putting together a Web-based publication devoted to localization, education, and FLOSS migration. Besides writing, Mayank also loves to hack; his most recent contribution is an installer for the Utkarsh localization project. Still struggling toward a computer science degree, he loves Formula One racing.
