Pacemaker Remote
Extending High Availability into Virtual Nodes, Edition 1
David Vossel
Primary author, Red Hat <dvossel@redhat.com>
Legal Notice
Copyright 2009-2013 David Vossel. The text of and illustrations in this document are licensed under a Creative Commons Attribution-Share Alike 3.0 Unported license ("CC-BY-SA")[1]. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version. In addition to the requirements of this license, the following activities are looked upon favorably:

1. If you are distributing Open Publication works on hardcopy or CD-ROM, you provide email notification to the authors of your intent to redistribute at least thirty days before your manuscript or media freeze, to give the authors time to provide updated documents. This notification should describe modifications, if any, made to the document.

2. All substantive modifications (including deletions) be either clearly marked up in the document or else described in an attachment to the document.

3. Finally, while it is not mandatory under this license, it is considered good form to offer a free copy of any hardcopy or CD-ROM expression of the author(s) work.
Abstract
The document exists as both a reference and deployment guide for the Pacemaker Remote service. The KVM and Linux Container walk-through tutorials will use:

1. Fedora 18 as the host operating system
2. Pacemaker Remote to perform resource management within virtual nodes
3. libvirt to manage KVM and LXC virtual nodes
4. Corosync to provide messaging and membership services on the host nodes
5. Pacemaker to perform resource management on host nodes
Table of Contents

1. Extending High Availability Cluster into Virtual Nodes
   1.1. Overview
   1.2. Terms
   1.3. Virtual Machine Use Case
   1.4. Linux Container Use Case
   1.5. Expanding the Cluster Stack
      1.5.1. Traditional HA Stack
      1.5.2. Remote-Node Enabled HA Stack
2. Quick Example
   2.1. Mile High View of Configuration Steps
   2.2. What those steps just did
   2.3. Accessing Cluster from Remote-node
3. Configuration Explained
   3.1. Resource Options
   3.2. Host and Guest Authentication
   3.3. Pacemaker and pacemaker_remote Options
4. KVM Walk-through
   4.1. Step 1: Setup the Host
      4.1.1. SElinux and Firewall
      4.1.2. Install Cluster Software
      4.1.3. Setup Corosync
      4.1.4. Verify Cluster Software
      4.1.5. Install Virtualization Software
   4.2. Step 2: Create the KVM guest
      4.2.1. Setup Guest Network
      4.2.2. Setup Pacemaker Remote
      4.2.3. Verify Host Connection to Guest
   4.3. Step 3: Integrate KVM guest into Cluster
      4.3.1. Start the Cluster
      4.3.2. Integrate KVM Guest as remote-node
      4.3.3. Starting Resources on KVM Guest
      4.3.4. Testing Remote-node Recovery and Fencing
      4.3.5. Accessing Cluster Tools from Remote-node
5. Linux Container (LXC) Walk-through
   5.1. Step 1: Setup LXC Host
      5.1.1. SElinux and Firewall Rules
      5.1.2. Install Cluster Software on Host
      5.1.3. Configure Corosync
      5.1.4. Verify Cluster
   5.2. Step 2: Setup LXC Environment
      5.2.1. Install Libvirt LXC software
      5.2.2. Generate Libvirt LXC domains
      5.2.3. Generate the Authkey
   5.3. Step 3: Integrate LXC guests into Cluster
      5.3.1. Start Cluster
      5.3.2. Integrate LXC Guests as remote-nodes
      5.3.3. Starting Resources on LXC Guests
      5.3.4. Testing LXC Guest Failure
A. Revision History
Index

List of Tables

3.1. Metadata Options for configuring KVM/LXC resources as remote-nodes
1.1. Overview
The recent addition of the pacemaker_remote service, supported by Pacemaker version 1.1.10 and greater, allows nodes not running the cluster stack (pacemaker + corosync) to integrate into the cluster and have the cluster manage their resources just as if they were a real cluster node. This means that pacemaker clusters are now capable of both launching virtual environments (KVM/LXC) and managing the resources that live within those virtual environments, without requiring the virtual environments to run pacemaker or corosync.
1.2. Terms
cluster-node - A baremetal hardware node running the High Availability stack (pacemaker + corosync).

remote-node - A virtual guest node running the pacemaker_remote service.

pacemaker_remote - A service daemon capable of performing remote application management within virtual guests (KVM and LXC) in both pacemaker cluster environments and standalone (non-cluster) environments. This service is an enhanced version of pacemaker's local resource manager daemon (LRMD) that is capable of managing and monitoring LSB, OCF, upstart, and systemd resources on a guest remotely. It also allows most of pacemaker's CLI tools (crm_mon, crm_resource, crm_master, crm_attribute, etc.) to work natively on remote-nodes.

LXC - A Linux Container defined by the libvirt-lxc Linux container driver. http://libvirt.org/drvlxc.html
The cluster is fully capable of managing and monitoring resources on each remote-node. You can build constraints against remote-nodes, put them in standby, or do whatever else you'd expect to be able to do with normal cluster-nodes. They even show up in the crm_mon output as you would expect cluster-nodes to.

To solidify the concept, an example cluster deployment integrating remote-nodes could look like this:

- 16 cluster-nodes running the corosync+pacemaker stack.
- 64 pacemaker-managed virtual machine resources running pacemaker_remote configured as remote-nodes.
- 64 pacemaker-managed webserver and database resources configured to run on the 64 remote-nodes.

With this deployment you would have 64 webservers and databases running on 64 virtual machines on 16 hardware nodes, all of which are managed and monitored by the same pacemaker deployment.
Install the pacemaker_remote packages on every virtual machine, enable pacemaker_remote on startup, and poke a hole in the firewall for TCP port 3121.
yum install pacemaker-remote resource-agents
systemctl enable pacemaker_remote
# If you just want to see this work, disable iptables and ip6tables on most distros.
# You may have to put selinux in permissive mode as well for the time being.
firewall-cmd --add-port 3121/tcp --permanent
Give each virtual machine a static network address and a unique hostname.

Tell pacemaker to launch a virtual machine, and that the virtual machine is a remote-node capable of running resources, by using the "remote-node" meta-attribute.

with pcs:
# pcs resource create vm-guest1 VirtualDomain hypervisor="qemu:///system" config="vm-guest1.xml" meta remote-node=guest1
raw xml
<primitive class="ocf" id="vm-guest1" provider="heartbeat" type="VirtualDomain">
  <instance_attributes id="vm-guest-instance_attributes">
    <nvpair id="vm-guest1-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
    <nvpair id="vm-guest1-instance_attributes-config" name="config" value="guest1.xml"/>
  </instance_attributes>
  <operations>
    <op id="vm-guest1-interval-30s" interval="30s" name="monitor"/>
  </operations>
  <meta_attributes id="vm-guest1-meta_attributes">
    <nvpair id="vm-guest1-meta_attributes-remote-node" name="remote-node" value="guest1"/>
  </meta_attributes>
</primitive>
In the example above, the meta-attribute remote-node=guest1 tells pacemaker that this resource is a remote-node with the hostname guest1 that is capable of being integrated into the cluster. The cluster will attempt to contact the virtual machine's pacemaker_remote service at the hostname guest1 after it launches.
Last updated: Wed Mar 13 13:52:39 2013
Last change: Wed Mar 13 13:25:17 2013 via crmd on node1
Stack: corosync
Current DC: node1 (24815808) - partition with quorum
Version: 1.1.10
2 Nodes configured, unknown expected votes
2 Resources configured.

Online: [ node1 guest1 ]

vm-guest1     (ocf::heartbeat:VirtualDomain): Started node1
Now the crm_mon output would show a webserver launched on the guest1 remote-node.
Last updated: Wed Mar 13 13:52:39 2013
Last change: Wed Mar 13 13:25:17 2013 via crmd on node1
Stack: corosync
Current DC: node1 (24815808) - partition with quorum
Version: 1.1.10
2 Nodes configured, unknown expected votes
2 Resources configured.

Online: [ node1 guest1 ]

vm-guest1     (ocf::heartbeat:VirtualDomain): Started node1
webserver     (ocf::heartbeat:apache):        Started guest1
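As noted in the Terms section, most of pacemaker's CLI tools work natively on remote-nodes, so the cluster can also be inspected from inside the guest. A minimal sketch, assuming the pacemaker CLI packages are installed in the guest (something the quick example above does not set up explicitly):

# ssh guest1
# crm_mon -1    # one-shot cluster status, run from the remote-node itself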
Table 3.1. Metadata Options for configuring KVM/LXC resources as remote-nodes

Option: remote-node
Default: <none>
Description: The name of the remote-node this resource defines. This both enables the resource as a remote-node and defines the unique name used to identify the remote-node. If no other parameters are set, this value will also be assumed to be the hostname to connect to on port 3121.

Option: remote-port
Default: 3121
Description: Configure a custom port to use for the guest connection to pacemaker_remote.

Option: remote-addr
Default: the remote-node value used as hostname
Description: The IP address or hostname to connect to if the remote-node's name is not the hostname of the guest.

Option: remote-connect-timeout
Default: 60s
Description: How long before a pending guest connection will time out.
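For example, if the guest's hostname does not match the remote-node name, or pacemaker_remoted listens on a non-default port, those connection options can be set as additional meta-attributes on the resource. A minimal sketch, with placeholder address and port values rather than values taken from this walk-through:

# pcs resource meta vm-guest1 remote-addr=192.168.122.10 remote-port=3121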
# echo $corosync_addr
In many cases the address will be 192.168.1.0 if you are behind a standard home router. Now copy over the example corosync.conf. This code will inject your bindaddress and enable the vote quorum API, which is required by pacemaker.
# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
# sed -i.bak "s/.*\tbindnetaddr:.*/bindnetaddr:\ $corosync_addr/g" /etc/corosync/corosync.conf
# cat << END >> /etc/corosync/corosync.conf
quorum {
    provider: corosync_votequorum
    expected_votes: 2
}
END
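With corosync configured, the cluster services can be started on the host so that the status check below has something to report. A minimal sketch, assuming the systemd units shipped with the Fedora 18 corosync and pacemaker packages:

# systemctl start corosync.service
# systemctl start pacemaker.service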
Verify pacemaker status. At first, the pcs cluster status output will look like this.
# pcs status

Last updated: Thu Mar 14 12:26:00 2013
Last change: Thu Mar 14 12:25:55 2013 via crmd on example-host
Stack: corosync
Current DC:
Version: 1.1.10
1 Nodes configured, unknown expected votes
0 Resources configured.
Reboot the host.
To simplify the tutorial, we'll go ahead and disable selinux on the guest. We'll also need to poke a hole through the firewall on port 3121 (the default port for pacemaker_remote) so the host can contact the guest.
# setenforce 0
# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config
# firewall-cmd --add-port 3121/tcp --permanent
If you still encounter connection issues, just disable iptables and ip6tables on the guest like we did on the host to guarantee you'll be able to contact the guest from the host. At this point you should be able to ssh into the guest from the host.
Now, on the GUEST, install the pacemaker-remote package and enable the daemon to run at startup. In the commands below you will notice that both the pacemaker and pacemaker-remote packages are being installed. The pacemaker package is not required; the only reason it is being installed for this tutorial is because it contains the Dummy resource agent we will be using later on to test the remote-node.
# yum install -y pacemaker pacemaker-remote resource-agents
# systemctl enable pacemaker_remote.service
Now start pacemaker_remote on the guest and verify the start was successful.
# systemctl start pacemaker_remote.service
# systemctl status pacemaker_remote

pacemaker_remote.service - Pacemaker Remote Service
   Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled)
   Active: active (running) since Thu 2013-03-14 18:24:04 EDT; 2min 8s ago
 Main PID: 1233 (pacemaker_remot)
   CGroup: name=systemd:/system/pacemaker_remote.service
           1233 /usr/sbin/pacemaker_remoted

Mar 14 18:24:04 guest1 systemd[1]: Starting Pacemaker Remote Service...
Mar 14 18:24:04 guest1 systemd[1]: Started Pacemaker Remote Service.
Mar 14 18:24:04 guest1 pacemaker_remoted[1233]: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121.
If running the telnet command on the host results in this output before disconnecting, the connection works.
# telnet guest1 3121
Trying 192.168.122.10...
Connected to guest1.
Escape character is '^]'.
Connection closed by foreign host.
Once you can successfully connect to the guest from the host, shut down the guest. Pacemaker will be managing the virtual machine from this point forward.
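Since the guest was defined through libvirt, it can be shut down cleanly from the host; a minimal sketch, assuming the libvirt domain is named guest1 as in the rest of this walk-through:

# virsh shutdown guest1
# virsh list --all    # wait for the guest to show as "shut off"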
Wait for the host to become the DC. The output of pcs status should look similar to this after about a minute.
Last updated: Thu Mar 14 16:41:22 2013
Last change: Thu Mar 14 16:41:08 2013 via crmd on example-host
Stack: corosync
Current DC: example-host (1795270848) - partition WITHOUT quorum
Version: 1.1.10
1 Nodes configured, unknown expected votes
0 Resources configured.

Online: [ example-host ]
Now enable the cluster to work without quorum or stonith. This is required just for the sake of getting this tutorial to work with a single cluster-node.
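As in the LXC walk-through later in this document, this is done with two pcs property commands:

# pcs property set stonith-enabled=false
# pcs property set no-quorum-policy=ignore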
We will use the VirtualDomain resource agent for the management of the virtual machine. This agent requires the virtual machine's xml config to be dumped to a file on disk. To do this, pick out the name of the virtual machine you just created from the output of this list.
# virsh list --all
 Id    Name                State
______________________________________________
       guest1              shut off
In my case I named it guest1. Dump the xml to a file somewhere on the host using the following command.
# virsh dumpxml guest1 > /root/guest1.xml
Now just register the resource with pacemaker and you're set!
# pcs resource create vm-guest1 VirtualDomain hypervisor="qemu:///system" config="/root/guest1.xml" meta remote-node=guest1
Once the vm-guest1 resource is started, you will see guest1 appear in the pcs status output as a node. The final pcs status output should look something like this.
Last updated: Fri Mar 15 09:30:30 2013
Last change: Thu Mar 14 17:21:35 2013 via cibadmin on example-host
Stack: corosync
Current DC: example-host (1795270848) - partition WITHOUT quorum
Version: 1.1.10
2 Nodes configured, unknown expected votes
2 Resources configured.

Online: [ example-host guest1 ]

Full list of resources:

 vm-guest1     (ocf::heartbeat:VirtualDomain): Started example-host
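The FAKE1 through FAKE5 resources that show up in the status output below are plain ocf:pacemaker:Dummy resources, created so that some work lands on the remote-node. A minimal sketch of how they could be created:

# pcs resource create FAKE1 ocf:pacemaker:Dummy
# pcs resource create FAKE2 ocf:pacemaker:Dummy
# pcs resource create FAKE3 ocf:pacemaker:Dummy
# pcs resource create FAKE4 ocf:pacemaker:Dummy
# pcs resource create FAKE5 ocf:pacemaker:Dummy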
Now check your pcs status output. In the resource section you should see something like the following, where some of the resources got started on the cluster-node and some started on the remote-node.
Full list of resources:

 vm-guest1     (ocf::heartbeat:VirtualDomain): Started example-host
 FAKE1         (ocf::pacemaker:Dummy): Started guest1
 FAKE2         (ocf::pacemaker:Dummy): Started guest1
 FAKE3         (ocf::pacemaker:Dummy): Started example-host
 FAKE4         (ocf::pacemaker:Dummy): Started guest1
 FAKE5         (ocf::pacemaker:Dummy): Started example-host
The remote-node, guest1, reacts just like any other node in the cluster. For example, pick out a resource that is running on your cluster-node. For my purposes, I am picking FAKE3 from the output above. We can force FAKE3 to run on guest1 in the exact same way we would any other node.
# pcs constraint location FAKE3 prefers guest1
Now, looking at the bottom of the pcs status output, you'll see FAKE3 is on guest1.
Full list of resources:

 vm-guest1     (ocf::heartbeat:VirtualDomain): Started example-host
 FAKE1         (ocf::pacemaker:Dummy): Started guest1
 FAKE2         (ocf::pacemaker:Dummy): Started guest1
 FAKE3         (ocf::pacemaker:Dummy): Started guest1
 FAKE4         (ocf::pacemaker:Dummy): Started example-host
 FAKE5         (ocf::pacemaker:Dummy): Started example-host
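The recovery scenario that produces the output below consists of killing the pacemaker_remote daemon inside the guest, the same failure injected in the LXC walk-through later on. A minimal sketch, run from a shell inside the guest:

# kill -9 $(pidof pacemaker_remoted)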
After a few seconds or so, you'll see this in your pcs status output. The guest1 node will be shown as offline as it is being recovered.
Last updated: Fri Mar 15 11:00:31 2013
Last change: Fri Mar 15 09:54:16 2013 via cibadmin on example-host
Stack: corosync
Current DC: example-host (1795270848) - partition WITHOUT quorum
Version: 1.1.10
2 Nodes configured, unknown expected votes
7 Resources configured.

Online: [ example-host ]
OFFLINE: [ guest1 ]

Full list of resources:

 vm-guest1     (ocf::heartbeat:VirtualDomain): Started example-host
 FAKE1         (ocf::pacemaker:Dummy): Stopped
 FAKE2         (ocf::pacemaker:Dummy): Stopped
 FAKE3         (ocf::pacemaker:Dummy): Stopped
 FAKE4         (ocf::pacemaker:Dummy): Started example-host
 FAKE5         (ocf::pacemaker:Dummy): Started example-host

Failed actions:
    guest1_monitor_30000 (node=example-host, call=3, rc=7, status=complete): not running
Once recovery of the guest is complete, you'll see it automatically get re-integrated into the cluster. The final pcs status output should look something like this.
Last updated: Fri Mar 15 11:03:17 2013
Last change: Fri Mar 15 09:54:16 2013 via cibadmin on example-host
Stack: corosync
Current DC: example-host (1795270848) - partition WITHOUT quorum
Version: 1.1.10
2 Nodes configured, unknown expected votes
7 Resources configured.

Online: [ example-host guest1 ]

Full list of resources:

 vm-guest1     (ocf::heartbeat:VirtualDomain): Started example-host
 FAKE1         (ocf::pacemaker:Dummy): Started guest1
 FAKE2         (ocf::pacemaker:Dummy): Started guest1
 FAKE3         (ocf::pacemaker:Dummy): Started guest1
 FAKE4         (ocf::pacemaker:Dummy): Started example-host
 FAKE5         (ocf::pacemaker:Dummy): Started example-host

Failed actions:
    guest1_monitor_30000 (node=example-host, call=3, rc=7, status=complete): not running
# echo $corosync_addr
In most cases the address will be 192.168.1.0 if you are behind a standard home router. Now copy over the example corosync.conf. This code will inject your bindaddress and enable the vote quorum API, which is required by pacemaker.
# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
# sed -i.bak "s/.*\tbindnetaddr:.*/bindnetaddr:\ $corosync_addr/g" /etc/corosync/corosync.conf
# cat << END >> /etc/corosync/corosync.conf
quorum {
    provider: corosync_votequorum
    expected_votes: 2
}
END
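With corosync configured, start the cluster services on the host; the quorum check is optional but confirms that corosync formed a single-node membership. A minimal sketch, assuming the stock Fedora 18 systemd units and corosync tooling:

# systemctl start corosync.service
# systemctl start pacemaker.service
# corosync-quorumtool -s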
Verify pacemaker status. At first, the pcs cluster status output will look like this.
# pcs status

Last updated: Thu Mar 14 12:26:00 2013
Last change: Thu Mar 14 12:25:55 2013 via crmd on example-host
Stack: corosync
Current DC:
Version: 1.1.10
1 Nodes configured, unknown expected votes
0 Resources configured.
After executing the script you will see that a bunch of directories and xml files have been generated. Those xml files are the libvirt-lxc domain definitions, and the directories are used as special mount points for each container. If you open up one of the xml files, you'll be able to see how the cpu, memory, and filesystem resources for the container are defined. You can use the libvirt-lxc driver's documentation, found at http://libvirt.org/drvlxc.html, as a reference to help understand all the parts of the xml file. The lxc-autogen script is not complicated and is worth exploring in order to grasp how the environment is generated. It is worth noting that this environment is dependent on use of libvirt's default network interface. Verify the commands below look the same as your environment. The default network address 192.168.122.1 should have been created automatically when you installed the virtualization software.
# virsh net-list
Name                 State      Autostart     Persistent
________________________________________________________
default              active     yes           yes

# virsh net-dumpxml default | grep -e "ip address="
<ip address='192.168.122.1' netmask='255.255.255.0'>
Wait for the host to become the DC. The output of pcs status should look similar to this after about a minute.
Last updated: Thu Mar 14 16:41:22 2013
Last change: Thu Mar 14 16:41:08 2013 via crmd on example-host
Stack: corosync
Current DC: example-host (1795270848) - partition WITHOUT quorum
Version: 1.1.10
1 Nodes configured, unknown expected votes
0 Resources configured.

Online: [ example-host ]
Now enable the cluster to work without quorum or stonith. This is required just for the sake of getting this tutorial to work with a single cluster-node.
# pcs property set stonith-enabled=false
# pcs property set no-quorum-policy=ignore
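The container1 through container3 resources referenced in the status output below are VirtualDomain resources wrapping the generated libvirt-lxc domains, each tagged as a remote-node. A minimal sketch, assuming the generated domain definitions ended up at /root/lxc1.xml, /root/lxc2.xml, and /root/lxc3.xml; adjust the config paths to wherever the lxc-autogen script wrote them:

# pcs resource create container1 VirtualDomain force_stop="true" hypervisor="lxc:///" config="/root/lxc1.xml" meta remote-node=lxc1
# pcs resource create container2 VirtualDomain force_stop="true" hypervisor="lxc:///" config="/root/lxc2.xml" meta remote-node=lxc2
# pcs resource create container3 VirtualDomain force_stop="true" hypervisor="lxc:///" config="/root/lxc3.xml" meta remote-node=lxc3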
After creating the container resources, your pcs status should look like this.
Last updated: Mon Mar 18 17:15:46 2013
Last change: Mon Mar 18 17:15:26 2013 via cibadmin on guest1
Stack: corosync
Current DC: example-host (175810752) - partition WITHOUT quorum
Version: 1.1.10
4 Nodes configured, unknown expected votes
6 Resources configured.

Online: [ example-host lxc1 lxc2 lxc3 ]

Full list of resources:

 container3    (ocf::heartbeat:VirtualDomain): Started example-host
 container1    (ocf::heartbeat:VirtualDomain): Started example-host
 container2    (ocf::heartbeat:VirtualDomain): Started example-host
After creating the Dummy resources, you will see that the resources got distributed among all the nodes. The pcs status output should look similar to this.
Last updated: Mon Mar 18 17:31:54 2013
Last change: Mon Mar 18 17:31:05 2013 via cibadmin on example-host
Stack: corosync
Current DC: example-host (175810752) - partition WITHOUT quorum
Version: 1.1.10
4 Nodes configured, unknown expected votes
11 Resources configured.

Online: [ example-host lxc1 lxc2 lxc3 ]

Full list of resources:

 container3    (ocf::heartbeat:VirtualDomain): Started example-host
 container1    (ocf::heartbeat:VirtualDomain): Started example-host
 container2    (ocf::heartbeat:VirtualDomain): Started example-host
 FAKE1         (ocf::pacemaker:Dummy): Started lxc1
 FAKE2         (ocf::pacemaker:Dummy): Started lxc2
 FAKE3         (ocf::pacemaker:Dummy): Started lxc3
 FAKE4         (ocf::pacemaker:Dummy): Started lxc1
 FAKE5         (ocf::pacemaker:Dummy): Started lxc2
To witness that the Dummy agents are running within the lxc guests, browse one of the lxc domains' filesystem folders. Each lxc guest has a custom mount point for the '/var/run/' directory, which is the location the Dummy resources write their state files to.
# ls lxc1-filesystem/var/run/
Dummy-FAKE4.state  Dummy-FAKE.state
To see how the cluster reacts to a failed lxc guest, try killing one of the pacemaker_remote instances.
# kill -9 9142
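The PID 9142 above is specific to the environment this output was captured in. One way to locate the pacemaker_remoted processes belonging to the containers from the host (an illustrative command, not one of the original steps):

# ps ax | grep [p]acemaker_remoted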
After a few moments, the lxc guest that was running that instance of pacemaker_remote will be recovered, along with all the resources running within that container.
Revision History
Revision 1    Tue Mar 19 2013    David Vossel
    Import from Pages.app
Index