
Creating a Red Hat Cluster: Part 1

This is the first in a series of articles that will demonstrate how to create a Linux cluster using the Red Hat/CentOS 5.5 distribution. When we created our first cluster at the office, we searched the internet for Red Hat cluster setup information. To my surprise, we could not find many articles or user experiences on the topic. So I hope this series of articles will benefit the community of users trying, or wanting, to create their own cluster.

The cluster hardware


Our cluster will have 3 HP servers. Each one has 4GB of memory, 36GB of mirrored internal disks, one QLogic fibre card connected to our SAN, and 2 network cards; for our fencing device we will use the on-board HP iLO (Integrated Lights-Out). This is my setup; yours does not need to be the same. You do not need HP servers to build a cluster, you do not need mirrored disks (although they are recommended), and you do not need a SAN infrastructure either (an NFS share can also be used). One thing I would recommend is a fencing device: on HP servers there is a network port on the back of each server called iLO, which allows the cluster software to power on, power off, or restart a server remotely. The Red Hat cluster packages support many similar fencing devices. This part of the cluster is important, because it prevents a failed or misbehaving node from writing to a filesystem it should no longer have access to and corrupting data. If you do not have a fencing device, you can always use the manual fencing method; it works, but it is not supported.
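Once the iLO ports are reachable on the network, it is worth testing them before the cluster ever needs them. As a minimal sketch, assuming the standard fence_ilo agent shipped with the RHEL/CentOS 5 cluster suite (the login and password below are placeholders, not values from this article):

# fence_ilo -a ilo_bilbo.maison.ca -l Administrator -p YourIloPassword -o status

If the agent can report the node's power status, the cluster software should be able to fence that node when required.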

The Setup
Our cluster will contain 3 servers: two active and one passive. For our example, one server will run an HTTP service and the second will be an FTP server. The third server will be used as a fail-over server: if the first or second server has a network, SAN, or hardware problem, the service it is running will move to the third server.

Although not required, having a passive server offers some advantages. First, it ensures that if one server has a problem, the passive server will be able to handle the load of either of your servers. If we did not have that passive server, we would need to make sure that each server was capable of handling the load of both services at once. A clustered environment also offers advantages when the time comes to do hardware maintenance on a server. Let's say we need to add memory to the first server: we could move the HTTP service to the passive node, add the memory, and then move the service back to the original node when ready. Another advantage of having a passive server is that you can update the OS of your nodes one by one without affecting the services (if a reboot is necessary).

The names of our nodes will be bilbo, which will host the HTTP service, gollum, which will host the FTP service, and gandalf, which will be our passive node. As you can see in the image below, each server uses 3 network cards. The first NIC (eth0) is used to offer the HTTP and FTP services to the users and is the host's main network card. The second network card (eth1) is used by the cluster software for its heartbeat. In a corporate environment, this network should be isolated from the rest of your network; there will be a lot of broadcast traffic on it and response time is important. The iLO network will be used by the cluster software to remotely power servers in the cluster on or off when a problem arises.

So our /etc/hosts file looks like this:

127.0.0.1       localhost.localdomain   localhost
#
# Host Real IP
192.168.1.103   gandalf.maison.ca       gandalf
192.168.1.104   gollum.maison.ca        gollum
192.168.1.111   bilbo.maison.ca         bilbo
#
# Service Virtual IP
192.168.1.204   ftp.maison.ca           ftp
192.168.1.211   www.maison.ca           www
#
# HeartBeat IP Address
10.10.10.103    hbgandalf.maison.ca     hbgandalf
10.10.10.104    hbgollum.maison.ca      hbgollum
10.10.10.111    hbbilbo.maison.ca       hbbilbo
#
# HP ILO IP Address
192.168.1.3     ilo_gandalf.maison.ca   ilo_gandalf
192.168.1.4     ilo_gollum.maison.ca    ilo_gollum
192.168.1.11    ilo_bilbo.maison.ca     ilo_bilbo

We should define all the IP addresses that our cluster needs to function properly in /etc/hosts. Of course, they should also be defined in your DNS, but if your DNS does not respond, the /etc/hosts file will ensure that the cluster continues to work properly. We will use the domain name maison.ca throughout these articles.
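For reference, here is a minimal sketch of what the heartbeat interface configuration could look like on bilbo, assuming a standard /etc/sysconfig/network-scripts/ifcfg-eth1 file on CentOS 5 (the address comes from the hosts file above; adjust the netmask to your own heartbeat network):

DEVICE=eth1
BOOTPROTO=static
IPADDR=10.10.10.111
NETMASK=255.255.255.0
ONBOOT=yes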

Creating a Red Hat Cluster: Part 2


In this article, we continue our journey on how to build our cluster.

Installing the cluster software


If you are using Red Hat, you need to register your server in order to download new software or updates. You also need to subscribe to the Clustering and Cluster Storage channels in order to install these groups of software. With CentOS, this is not needed, since we can download these groups of software without registration.

Let's install the clustering software by typing this command:

# yum groupinstall Clustering

Since we will be using the GFS filesystem, we will need the Cluster Storage software group:

# yum groupinstall "Cluster Storage"

I also found out that the following package is needed by the cluster software, so let's install it:

# yum install perl-Crypt-SSLeay

If you are using a 32-bit kernel and the server has more than 4 GB of memory, you need to install the PAE kernel and GFS modules. This will ensure that all the memory available on the server is used:

# yum install kmod-gnbd-PAE kmod-gfs-PAE kernel-PAE

Finally, let's make sure that the servers have the latest OS updates:

# yum -y update

Setting the locking type for GFS filesystem


To use GFS (Global File System) with the cluster, you need to activate GFS locking in the /etc/lvm/lvm.conf file. We need to change the locking_type variable from 0 to 3 to inform LVM that we will be dealing with GFS volume groups and filesystems. The lvmconf command below needs to be run on all the servers.

# grep -i locking_type /etc/lvm/lvm.conf
locking_type = 0
# lvmconf --enable-cluster
# grep -i locking_type /etc/lvm/lvm.conf
locking_type = 3

Making sure SELinux and the firewall are disabled


We do not want to deal with SELinux and the firewall in our cluster, so we will disable them. From the GNOME desktop, run the following command:

# system-config-securitylevel

Disable SELinux (a reboot is needed for this change to take effect) and disable the firewall.
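If you prefer the command line to the GUI, something along these lines should give the same result on CentOS 5 (a sketch, not taken from the original article; the SELinux change only takes effect after the next reboot):

# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
# chkconfig iptables off
# service iptables stop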

Activating cluster services


Let's make sure that the cluster services are started each time the server boots.
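The original article shows this step as a screenshot; on RHEL/CentOS 5 the equivalent commands would typically be (assuming the standard service names installed by the Clustering and Cluster Storage groups):

# chkconfig cman on
# chkconfig clvmd on
# chkconfig gfs on
# chkconfig rgmanager on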

We can now start the cluster services manually, or simply reboot the server.
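Again, this is shown as a screenshot in the original article; started manually, and respecting the start order, it would look something like this:

# service cman start
# service clvmd start
# service gfs start
# service rgmanager start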

Starting cluster configuration


There are two tools you can use to configure and maintain your cluster. The first one is a web interface called Conga, which requires installing an agent on each node (ricci) and a centralized configuration server (luci). If you want to use the web interface (it seems to work a lot better now), it is advisable to install the configuration server on a Linux machine outside of the cluster. Conga was fairly new when I created my first cluster, so we decided to use the second tool, the Cluster Configuration Tool. This tool can be started from any node within the cluster and does not require installing any additional software. To start the cluster configuration GUI, type the following command as the root user:
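The command itself is shown as a screenshot in the original article; on RHEL/CentOS 5 the Cluster Configuration Tool is normally launched with:

# system-config-cluster &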

New cluster config warning: The first time you run the cluster configuration GUI, a warning message may be displayed. It is just informing us that the cluster configuration file /etc/cluster/cluster.conf was not found. Click on the Create New Configuration button and let's move on.

General cluster settings: Next we need to enter the name of our cluster; we have chosen to name it our_cluster. The name of a cluster cannot be changed; the only way to rename a cluster is to create a new one, so choose it wisely. We use the recommended lock manager, DLM (Distributed Lock Manager); GULM is deprecated.

The multicast address we will use for the heartbeat is 239.1.1.1. The usage of a Quorum Disk is outside the scope of this article, but basically you define a small disk on the SAN that is shared among the nodes in the cluster, and node status is regularly written to that disk. If a node hasn't updated its status for a period of time, it is considered down and the cluster will then fence it. If you are interested in using a Quorum Disk, there is an article here that describes how to set it up.

Cluster Properties
Now let's check some of the default settings given to our cluster. Click on Cluster on the left-hand side of the screen and then click on Edit Cluster Properties. The Configuration Version value defaults to 1 and is automatically incremented each time you modify your cluster configuration. The Post-Join Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node after the node joins the fence domain; its default value is 3. A typical setting for Post-Join Delay is between 20 and 30 seconds, but it can vary according to cluster and network performance. The Post-Fail Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node (a member of the fence domain) after the node has failed; its default value is 0, and it can also be varied to suit cluster and network performance. Change the Post-Fail Delay to 30 seconds.
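For reference, these properties end up in /etc/cluster/cluster.conf. A minimal sketch of the relevant lines, assuming the settings described above (the actual file is generated by the GUI and contains more sections):

<cluster name="our_cluster" config_version="2">
  <fence_daemon post_join_delay="3" post_fail_delay="30"/>
  ...
</cluster>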

Adding our nodes to the cluster


To add a node to the cluster, select Cluster Nodes on the upper left side of the screen and click on the Add a Cluster Node button. You will then be presented with the Node Properties screen. Enter the node name, the Quorum Votes, and the name of the interface used for the multicast (heartbeat); in our case it is eth0. To avoid problems, I always use the first interface for the cluster heartbeat network.

Enter the host name used for the heartbeat; for our first node it will be hbbilbo.maison.ca. Remember that this name MUST be defined in our hosts file and in your DNS (if you have one).


Quorum is a voting algorithm used by the cluster manager. We say a cluster has quorum if a majority of nodes are alive, communicating, and agree on the active cluster members. So in a thirteen-node cluster, quorum is only reached if seven or more nodes are communicating; if the seventh node dies, the cluster loses quorum and can no longer function. For our cluster we will leave the Quorum Votes at the default value of 1. If we had a two-node cluster, we would need to make a special exception to the quorum rules, using the special two_node setting in /etc/cluster/cluster.conf, which looks like this: <cman expected_votes="1" two_node="1"/>. Repeat the operation for every node you want to include in the cluster.

Insert the gandalf and gollum nodes into our cluster.
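Once all three nodes have been added, the node section of /etc/cluster/cluster.conf should look roughly like this (a sketch; node IDs and attribute layout may differ slightly in the file generated by the GUI):

<clusternodes>
  <clusternode name="hbbilbo.maison.ca" votes="1" nodeid="1"/>
  <clusternode name="hbgollum.maison.ca" votes="1" nodeid="2"/>
  <clusternode name="hbgandalf.maison.ca" votes="1" nodeid="3"/>
</clusternodes>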
