Building a redundant iSCSI and NFS cluster with Debian - Part 3
Note : This page may contain outdated information and/or broken links; some of the formatting may be mangled due to the many different code-bases this site has been through in over 20 years; my opinions may have changed etc. etc.
This is part 3 of a series on building a redundant iSCSI and NFS SAN with Debian.
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
Introduction
In the last two guides, we set up a DRBD resource and LVM volume group which we could manually migrate between the two cluster nodes. In this guide, we’ll set up the Heartbeat cluster software to handle automatic migration of services between the two nodes in our cluster (“failover”).
The version of Heartbeat included in Debian Etch is 1.x. It is a very simple system, limited to two-node clusters, which makes it ideal for a straightforward job like failing services over between two nodes. The current 2.x branch is a lot more complicated and has a new XML configuration format, although it can still be used with the original 1.x format files. While it adds many useful features, it's overkill for our needs at the moment - plus, sticking to 1.x avoids the need to install software not included in the current stable distribution.
Preparation
Before we set up Heartbeat, we’ll need to ensure the communication channels the cluster will be using are configured. If you refer back to the original network diagram, you’ll see that we’re using two different interconnects: A serial cable, and a network connection across eth1. To recap: the reason for this is that the interconnects are vital to the functioning of the cluster. If one node cannot “see” the other, it will assume control of the resources. If this was due to a faulty interconnect (or, a network misconfiguration), you would end up with a “split-brain” scenario in which both nodes try to gain control over the resources. At best, this would lead to service outages and confusion; at worst, you could be facing total data loss.
Hence, the two channels for cluster communication - the null-modem serial cable is a great “fallback” channel, which should always be available even if you do something like apply an erroneous firewall rule blocking the communication over eth1. If you have been following the instructions up until now, you should already be able to send data between the hosts over the serial connection, and ping each node from the other over their eth1 interfaces (we’ve already been using this interface for the DRBD synching). Assuming this all works, you’re good to proceed.
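The manual checks above can be wrapped in a small pre-flight script if you expect to repeat them. The sketch below is hypothetical (the function names are mine; the peer name "otter" and the serial device /dev/ttyS0 come from the earlier parts - adjust to your layout). Note that the serial check only confirms the device node exists; the real end-to-end test is still sending data across the cable as described in part 1.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the two cluster interconnects.
# The peer name and serial device are assumptions from this series'
# layout - substitute your own values.

check_network() {
  # One ping with a two-second timeout against the peer's eth1 address
  if ping -c 1 -W 2 "$1" > /dev/null 2>&1; then
    echo "network link to $1: ok"
  else
    echo "network link to $1: FAILED"
  fi
}

check_serial() {
  # Only confirms the node exists and is a character device; echoing
  # data across the cable (see part 1) is what proves the link itself.
  if [ -c "$1" ]; then
    echo "serial device $1: present"
  else
    echo "serial device $1: MISSING"
  fi
}

check_network otter
check_serial /dev/ttyS0
```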
Installation
Simply install via apt-get on both nodes:

# apt-get install heartbeat

This will give a warning at the end ("Heartbeat not configured"), which you can safely ignore. You now need to set up authentication between the two nodes - this is very simple, and just uses a shared secret key. Create /etc/ha.d/authkeys on both systems with the following content:
auth 1
1 sha1 secret

In this sample file, the "auth 1" directive says to use key number 1 for signing outgoing packets, and the "1 sha1 ..." line describes how to sign them. Replace the word "secret" with a passphrase of your choice. As the passphrase is stored in plaintext, make sure the file is owned by root and has a restrictive set of permissions on it:

# chown root:root /etc/ha.d/authkeys
# chmod 600 /etc/ha.d/authkeys

Make sure that copies of this file are identical across both nodes, and don't contain any blank lines. Now we need to set up the global cluster configuration file. Create /etc/ha.d/ha.cf on both nodes as follows:
# Interval in seconds between heartbeat packets
keepalive 1
# How long to wait in seconds before deciding node is dead
deadtime 10
# How long to wait in seconds before warning node is dead
warntime 5
# How long to wait in seconds before deciding node is dead
# When heartbeat is first started
initdead 60
# If using serial port for heartbeat
baud 9600
serial /dev/ttyS0
# If using network for heartbeat
udpport 694
# eth1 is our dedicated cluster link (see diagram in part 1)
bcast eth1
# Don't want to auto failback, let admin check and do it manually if needed
auto_failback off
# Nodes in our cluster
node otter
node weasel

The node names given here must match the output of "uname -n" on each host.

Resources
We now need to tell Heartbeat about what resources we want it to manage. This is configured in the /etc/ha.d/haresources file. The format for this is again very simple - it just takes the form :
<hostname> resource[::arg1::arg2::...::argN]

Resources can either be one of the supplied scripts in /etc/ha.d/resource.d:

# ls /etc/ha.d/resource.d
AudibleAlarm db2 Delay drbddisk Filesystem ICP IPaddr IPaddr2 IPsrcaddr
IPv6addr LinuxSCSI LVM LVSSyncDaemonSwap MailTo OCF portblock SendArp ServeRAID
WAS WinPopup Xinetd

Or they can be one of the init scripts in /etc/init.d; Heartbeat searches those two locations in that order. To start with, we'll want to move the DRBD resource we configured in part 2 between the two nodes. This can be accomplished via the "drbddisk" script, provided by the drbd0.7-utils package. The /etc/ha.d/haresources file should therefore look like the following:

weasel drbddisk::r0

This says that the node "weasel" should be the preferred node for this resource. The resource script is "drbddisk", found under /etc/ha.d/resource.d, and we're passing it the argument "r0" - the DRBD resource we configured in part 2. To test this out, make the DRBD resource secondary by running the following on both nodes:
# drbdadm secondary r0

And then start the cluster on both nodes:

# /etc/init.d/heartbeat start
Starting High-Availability services:
Done.

Once they've started up, check the cluster status using the cl_status tool. First, let's check which nodes Heartbeat thinks are in the cluster:

# cl_status listnodes
weasel
otter

Now, check that both nodes are up:
# cl_status nodestatus weasel
active
# cl_status nodestatus otter
active

We can also use the cl_status tool to see which cluster links are available (which should be eth1 and /dev/ttyS0), and to check the status of each:

# cl_status listhblinks otter
eth1
/dev/ttyS0
# cl_status hblinkstatus otter eth1
up
# cl_status hblinkstatus otter /dev/ttyS0
up

And we can also use it to check which resources each node holds:
[root@otter] # cl_status rscstatus
none
[root@weasel] # cl_status rscstatus
all

You should also be able to check the output of /proc/drbd on both systems and see that r0 is now primary on weasel. To fail over to otter, simply restart the Heartbeat services on weasel:
# /etc/init.d/heartbeat restart
Stopping High-Availability services:
Done.
Waiting to allow resource takeover to complete:
Done.
Starting High-Availability services:
Done.

Now check /proc/drbd again, and you should see that r0 is primary on otter. You can confirm this with cl_status:

[root@otter] # cl_status rscstatus
all
[root@weasel] # cl_status rscstatus
none

If you want to try a more dramatic approach, try yanking the power cord out of otter. You should see output similar to the following appear in /var/log/ha-log on weasel:
heartbeat: 2009/02/03_15:06:29 info: Resources being acquired from otter.
heartbeat: 2009/02/03_15:06:29 info: acquire all HA resources (standby).
heartbeat: 2009/02/03_15:06:29 info: Acquiring resource group: weasel drbddisk::r0
heartbeat: 2009/02/03_15:06:29 info: Local Resource acquisition completed.
...
heartbeat: 2009/02/03_15:06:29 info: all HA resource acquisition completed (standby).
heartbeat: 2009/02/03_15:06:29 info: Standby resource acquisition done [all].
heartbeat: 2009/02/03_15:06:29 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/02/03_15:06:29 info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
...
heartbeat: 2009/02/03_15:06:39 WARN: node otter: is dead
heartbeat: 2009/02/03_15:06:39 info: Dead node otter gave up resources.

Play around with this a few times, and make sure you're comfortable with your resources moving between systems. Once you're happy with this, we'll add our LVM volume group into the configuration. Edit the /etc/ha.d/haresources file, and modify it so that it looks like the following:

weasel drbddisk::r0 \
    LVM::storage

The backslash (\) character just tells Heartbeat that this should all be treated as one resource group - the same way a backslash indicates a line continuation in a shell script. Everything could go on a single line, but I find it easier to read when it's split up like this. Restart Heartbeat on each node in turn, and you should then see the DRBD resource and the LVM volume group move between systems together. The next part will cover setting up an iSCSI target, and adding that into the cluster configuration along with a group of managed IP addresses.
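As an aside: since the continuation behaves like the shell's, a quick way to double-check how Heartbeat will read a multi-line haresources entry is to join the continued lines yourself. A small sketch, using a scratch copy rather than the live file (the filename haresources.test is my own):

```shell
# Write a scratch copy of the two-line entry, then join
# backslash-continued lines the way Heartbeat will read them.
printf 'weasel drbddisk::r0 \\\n    LVM::storage\n' > haresources.test
sed -e ':a' -e '/\\$/N; s/[[:space:]]*\\\n[[:space:]]*/ /; ta' haresources.test
# prints: weasel drbddisk::r0 LVM::storage
```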