XCP/XenServer on RAID1: booting pitfalls

Getting XCP to boot from a RAID1 partition has been worked out in a nice article: http://blog.codeaddict.org/?p=5

I discovered a few things along the way:

  • parted does nasty things to the boot configuration;
  • what to do if the XCP install becomes unbootable;
  • extlinux is RAID-aware;
  • how to bootstrap if the BIOS won't let you select a boot disk and you don't want to swap cables.

If the XCP install will not enter extlinux or boot the kernel

It turns out that using parted for any of the partitioning work will do at least two of the following: overwrite the MBR, unset the root partition's legacy BIOS bootable flag, or otherwise foul the extlinux install.

Aside from following the cookbook, there are a few things to check to ensure the system is bootable. This setup should ensure that XCP will boot even with a root disk missing.

Be sure the MBR is installed.

This is outlined in the cookbook, but it doesn't hurt to do it again if the partition configuration has changed at all.

dd if=/usr/share/syslinux/gptmbr.bin of=/dev/sda
dd if=/usr/share/syslinux/gptmbr.bin of=/dev/sdb
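
As a quick sanity check after writing the boot code, the first 440 bytes of each disk can be compared against gptmbr.bin. This is only a sketch and not part of the cookbook: the function name mbr_matches is made up for illustration, and it assumes cmp supports the -n (byte limit) option, as GNU cmp does.

```shell
# Hypothetical helper: succeed if the first 440 bytes of a disk
# (or disk image) are identical to the given boot code file.
mbr_matches() {
    disk=$1
    bootcode=${2:-/usr/share/syslinux/gptmbr.bin}
    # Compare only the boot code area; the remaining bytes of the
    # sector are not written by the boot code install.
    cmp -s -n 440 "$bootcode" "$disk"
}
```

For example: mbr_matches /dev/sda && mbr_matches /dev/sdb && echo "boot code present on both disks"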

Be sure the boot partition's bootable flag is set

extlinux will not consider a boot partition without the bootable flag set. If any partition is modified with parted, parted will clear the bootable flags. Set the flag and check it for both partitions:

sgdisk --attributes=1:set:2 /dev/sda  # set the flag for sda
sgdisk --attributes=1:show /dev/sda  # show the flag for sda
sgdisk --attributes=1:set:2 /dev/sdb  # set the flag for sdb
sgdisk --attributes=1:show /dev/sdb  # show the flag for sdb
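
Since the same pair of commands is repeated for each disk, it can be wrapped in a small loop. A sketch only: set_bootable_flag is a hypothetical name, and the partition number and disk names passed to it are assumptions matching the layout used in this article.

```shell
# Hypothetical helper: set the legacy BIOS bootable attribute (bit 2)
# on one partition of each given disk, then print the attributes.
set_bootable_flag() {
    part=$1
    shift
    for disk in "$@"; do
        sgdisk --attributes="$part":set:2 "$disk" || return 1
        sgdisk --attributes="$part":show "$disk" || return 1
    done
}
```

For example: set_bootable_flag 1 /dev/sda /dev/sdb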

Install extlinux

It doesn't hurt to install extlinux repeatedly, so do it again after any partitioning.

The codeaddict.org cookbook doesn't mention the --raid (-r) flag to extlinux. With it, if the MBR on sda is OK but the filesystem on sda1 is corrupt (or the bootable flag was accidentally wiped!), the boot falls through to the next disk.

Be sure that at least /dev is mounted on /mnt/dev before running this.

chroot /mnt extlinux -i -r /boot
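
Because extlinux needs the device nodes inside the chroot, a guard like the following could refuse to run it when /mnt/dev is not mounted. A minimal sketch: is_mounted is a hypothetical helper, and it assumes that awk and /proc/mounts are available in the environment you are working from.

```shell
# Hypothetical helper: succeed if the given path appears as a mount
# point (second field) in the mounts table.
is_mounted() {
    target=$1
    mounts=${2:-/proc/mounts}
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts"
}
```

For example: is_mounted /mnt/dev && chroot /mnt extlinux -i -r /boot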

If extlinux boots linux, but linux cannot mount the root partition

There are a few things to check here.

Check the partition's RAID flag

If the partition's RAID flag is not set, the kernel will not automatically assemble the RAID array. Set the flag:

sgdisk --typecode=1:fd00 /dev/sda  # set the flag for sda1
sgdisk --info=1 /dev/sda  # check first line for 'Linux RAID'
sgdisk --typecode=1:fd00 /dev/sdb  # set the flag for sdb1
sgdisk --info=1 /dev/sdb  # check first line for 'Linux RAID'
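
The check on the first line of the --info output can be automated. This is a sketch only: has_raid_type is a hypothetical name, and matching on the text 'Linux RAID' assumes sgdisk prints the type name on the first line, as described above.

```shell
# Hypothetical helper: succeed if the first line of the partition
# info mentions 'Linux RAID'.
has_raid_type() {
    disk=$1
    part=${2:-1}
    sgdisk --info="$part" "$disk" | head -n 1 | grep -q 'Linux RAID'
}
```

For example: has_raid_type /dev/sda 1 || echo "sda1 is not marked as Linux RAID"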

Check /boot/extlinux.conf

Be sure that the root device is specified as root=/dev/md0 in the label xe section, on the append line, among the arguments to the /boot/vmlinuz-2.6-xen module.
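
A quick grep can confirm the setting. This is a sketch under the assumption that the kernel module and its root= argument sit on the same line of the config file; check_root_arg is a hypothetical name.

```shell
# Hypothetical helper: print every line that loads the kernel without
# root=/dev/md0, and succeed only if there is no such line.
check_root_arg() {
    conf=${1:-/boot/extlinux.conf}
    kernel=${2:-vmlinuz-2.6-xen}
    # Fail if the kernel is not referenced at all.
    grep -q -- "$kernel" "$conf" || return 1
    # Offending lines are printed; success means none were found.
    ! grep -- "$kernel" "$conf" | grep -v 'root=/dev/md0'
}
```

For example, with the XCP root mounted on /mnt: check_root_arg /mnt/boot/extlinux.conf || echo "fix the root= argument"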

Check the initrd

Build a new initrd, ensuring that the RAID1 kernel module is included.

Assuming the XCP root filesystem is mounted on /mnt:

mkinitrd -v -f --without-multipath --fstab=/mnt/etc/fstab \
/mnt/boot/initrd-`uname -r`.img `uname -r`

Be sure you see the line "Adding module raid1" near the end of the output.
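
To avoid scanning the verbose output by eye, it can be saved to a file and searched. A sketch, assuming the mkinitrd output was captured to a log file; initrd_log_has_module is a hypothetical name, and the log path used in the example is arbitrary.

```shell
# Hypothetical helper: succeed if the saved mkinitrd output reports
# that the given module (raid1 by default) was added.
initrd_log_has_module() {
    log=$1
    module=${2:-raid1}
    grep -q "Adding module $module" "$log"
}
```

For example, append 2>&1 | tee /tmp/mkinitrd.log to the mkinitrd command, then run: initrd_log_has_module /tmp/mkinitrd.log || echo "raid1 missing from initrd"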

What to do if you get stuck

The install may be rescued using the XCP install image.

Boot the XCP install image.

When presented with the keyboard selection screen, switch to the shell with <Alt>-<F2>.

Mount half of the mirror read-only and copy the mdadm tool into the miniroot.

mkdir /mnt
mount -o ro /dev/sda1 /mnt
cp /mnt/sbin/mdadm.static /sbin/mdadm
umount /mnt

Assemble the RAID array, and mount it and the supporting filesystems.

mdadm --examine --scan > /etc/mdadm.conf
mdadm -A /dev/md0
mount /dev/md0 /mnt
mount -o bind /dev /mnt/dev
mount -o bind /sys /mnt/sys
# and optionally,
chroot /mnt
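
Once the array is assembled (and before entering the chroot), it may be worth confirming whether both halves of the mirror are active. A minimal sketch that reads /proc/mdstat, where a healthy two-disk mirror shows [UU] and a missing member shows up as an underscore; md_is_degraded is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if any array in the mdstat file is
# running with a missing member.
md_is_degraded() {
    mdstat=${1:-/proc/mdstat}
    # Status strings look like [UU] (healthy) or [U_] / [_U] (degraded).
    grep -q '\[[U_]*_[U_]*]' "$mdstat"
}
```

For example: md_is_degraded && echo "warning: an md array is running degraded"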

Perform needed repairs

At this point, the system is ready to execute any of the commands listed in earlier sections.

When finished, be sure to unmount things, because XCP doesn't know how.

umount /mnt/dev
umount /mnt/sys
umount /mnt

Good luck.

Please post any problems or success stories in the comments below. Thanks for reading.

3 Comments

Simon, May 15th, 2013 at 3:19 pm

Hi,
It’s a great article. I did everything as you said and the RAID is working perfectly. But if I can ask about something…

Please let me explain: a few days ago I had to replace one of my disks (sda) because it was broken.
First of all, I did:
mdadm /dev/md0 --fail /dev/sda1
mdadm /dev/md0 --remove /dev/sda1
(both commands for every partition)

I turned off the server, replaced the sda disk and booted from the second disk, sdb, without any problem.
cat /proc/mdstat showed me that the arrays were running on only one disk, sdb.

I copied the partition table with sgdisk -R /dev/sda /dev/sdb and sgdisk -G /dev/sda, and after that I did:
cat /usr/share/syslinux/gptmbr.bin > /dev/sda - from what I have read on the internet, this command should install the bootloader on the disk, right? (sgdisk -p /dev/sda showed me the same information as on the second disk)

After that I added the sda disk to the arrays and the RAID started to recover, successfully of course. ;-)

But I’m not sure about installing the bootloader. Is what I did enough? Or do I have to run extlinux --install /boot? Or maybe I have to run cat /usr/share/syslinux/gptmbr.bin > /dev/sda again after the arrays are rebuilt?

I don’t have any guarantee that the system will come up from the sda disk when sdb goes down.

If I understand the idea of RAID-1 correctly, both disks are 1:1, so the MBR should be copied from one disk to the other. Even so, I checked the sha1sum (dd if=/dev/sda(sdb) of=some_file bs=512 count=1) and the sums are the same. They should be, right?

Thank you for your help,
Simon

jman, May 15th, 2013 at 4:16 pm

Hi Simon, thanks for the comment!

If you think you need a guarantee that the server will still boot from sda, you’re thinking smart! Simply shut down & power off, unplug sdb, and power on, booting into single-user mode. If it comes up, you’ve proven the system can boot from sda only. After shutting down & powering off, replugging sdb, and powering back on, check your mirrors one more time to be sure the mirror is still intact.

As for your questions,

About the bootloader: Yes, cat /usr/share/syslinux/gptmbr.bin > /dev/sda installs the bootloader. No need to run extlinux again, but it will not hurt if you do, as the article says.

When you run that ‘cat’ command, you’re copying only 440 bytes (the size of the gptmbr.bin file) to the very beginning of sda. This is outside of any partition, and even before the partition table. Your RAID setup is mirroring partitions only, sda1 etc. So installing the bootloader and rebuilding the disk arrays do not affect each other.

As for your sha1sum question: In our case, we’re using RAID-1 to mirror *partitions*, like /dev/sda1 and /dev/sdb1. It will mirror everything inside the partitions, but nothing else. The bootloader and partition table, and anything else, are not mirrored. This is why you have to copy the partition table (sgdisk -R) and boot block (cat gptmbr.bin > /dev/sda) separately.

You can see how the GPT partition table is laid out on disk here: http://en.wikipedia.org/wiki/GUID_Partition_Table . The first block ‘LBA 0’ is reserved for the boot block, and the GPT itself starts from the second block ‘LBA 1’. Each ‘LBA’ is 512 bytes. Since you’re copying only 440 bytes from gptmbr.bin, there are an additional 512-440 = 72 bytes at the end of LBA 0 that the ‘cat’ command never touches (on a GPT disk they hold the protective MBR rather than boot code). Therefore, no, your sha1sum comparison of the first 512 bytes of the two disks may not always work, even though it worked this time. If those 72 bytes happen to be identical on both disks, the sha1sums match, which is what you saw.

Simon, May 21st, 2013 at 9:16 pm

Hi,

Thank you for the explanation! I know everything I wanted to. ;-)
Now I just have to check that the system boots from my sda disk. ;-)

Many thanks!

Regards
Simon
