Tag Archives: dd

Migrate from Xenserver to Proxmox

I was dismayed to see Citrix’s recent announcement about Xenserver 7.3 removing several key features from the free version. Xenserver’s free features are the reason I switched over to them in the first place back in 2014. Xenserver has been rock solid; I haven’t had any complaints until now. Their removal of xenmotion and migration in the free version forced me to look elsewhere for my virtualization needs.

I’ve settled on ProxMox, which is KVM based. Their documentation is excellent and it has all the features I need – for free. I’m also in love with their web based management – no more Windows fat client!

Below are my notes on how I successfully migrated all my Xenserver VMs over to the ProxMox Virtual Environment (PVE).

  • Any changes to network interfaces, such as bringing them up, require a reboot of the host
  • If you have an existing ISO share, you can create a directory called  “template” in your ISO repository folder, then inside symlink “iso” back to your ISO folder. Proxmox looks inside template/iso for ISO images for whatever storage you configure.
  • Do not create your ProxMox host with ZFS unless you have tons of RAM. If you don’t have enough RAM you will run into huge CPU load times making the system unresponsive in cases of high disk load, such as VM copies / backups. More reading here.

Cluster of two:

ProxMox’s clustering is a bit different – better, in my opinion. No more master, slave dynamic – ever node is a master. Important reading: https://pve.proxmox.com/wiki/Proxmox_VE_4.x_Cluster

If you have two node cluster, like I do, it creates some problems, though. If one goes down, the other can’t do anything to the pool (create VM, backup) until it comes back up. In my situation I have one primary host that is up all the time and I bring the secondary host up only when I want to do maintenance on the first.

In that specific situation you can still designate a “master” of sorts by increasing the number of quorum votes it gets from 1 to 2.  That way when the secondary node is down, the primary node can still do cluster operations because the default number of votes to stay quorate is 2. See here for more reading on the subject.

On either host (they must both be up and in the cluster for this to work)

vi /etc/pve/corosync.conf

Find your primary server in the nodelist settings and change

quorum_votes: 2

Also find the quorum section and add expected_votes: 2

Make sure to increment config_version number (bottom of the file.) Now if your secondary is down you can still operate the primary.

Migrating VMs

I migrated my Xen VMs to KVM by creating VMs with identical specs in PVE, copying the VHD files from the Xen host to the new PVE host, running qemu-img to convert them to RAW format, and then using dd to copy the raw information over to corresponding empty VM  disks. Depending on the OS of the VM there was some after-copy tweaking I also had to do.

From shared storage

Grab the VHD file (quiesce any snapshots away first) of each xen VM and convert them to raw format

qemu-img convert <VHD_FILE_NAME>.vhd -O raw <RAW_FILE_NAME>.raw

Create a new VM with identical configuration, especially disk size. Go to the hardware tab and take note of the name of the disk. For example, one of mine was:

local-zfs:vm-100-disk-1,discard=on,size=40G

The interesting part is between local-zfs and discard=on, namely vm-100-disk-1. This is the name of the disk we want to overwrite with data from our Xenserver VM’s disk.

Next figure out the full path of this disk on your proxmox host

find / -name vm-100-disk-1*

The result in my case was /dev/zvol/rpool/data/vm-100-disk-1

Take the name and put it in the following command to complete the process:

dd if=<RAW_FILE_NAME>.raw of=/dev/zvol/rpool/data/vm-100-disk-1 bs=16M

Once that’s done you can delete your .vhd and .raw files.

From local / LVM storage

In case your Xen VMs are stored in LVM device format instead of a VHD file, get UUID of storage by doing xe vdi-list and finding the name of the hard disk from the VM you want. It’s helpful to rename the hard disks to something easy to spot. I chose the word migrate.

xe vdi-list|grep -B3 migrate
uuid ( RO) : a466ae1b-80c7-4ef2-91a3-5c1ba1f6fc2f
 name-label ( RW):  migrate

Once you have the UUID of the drive, you can use lvscan to find the full LVM device path of that disk:

lvscan|grep a466ae1b-80c7-4ef2-91a3-5c1ba1f6fc2f
 inactive '/dev/VG_XenStorage-1ada0a08-7e6d-a5b6-d0b4-515e251c0c75/VHD-a466ae1b-80c7-4ef2-91a3-5c1ba1f6fc2f' [10.03 GiB] inherit

Shut down the corresponding VM and reactivate its logical volume (xen deactivates LVMs if the VM is shut off:

lvchange -ay <full /dev/VG_XenStorage path discovered above>

Now that we have the full LVM path and the volume is active, we can use dd over SSH to transfer the image to our proxmox server:

sudo dd if=<full /dev/VG/Xenstorage path discovered above> | ssh <IP_OF_PROXMOX_SERVER> dd of=<LOCATION_ON_PROXMOX_THAT_HAS_ENOUGH_SPACE>/<NAME_OF_VDI_FILE>.vhd

then follow vhd -> raw -> dd to proxmox drive process described in the From Shared Storage section.

Post-Migration tweaks

For the most part Debian-based systems moved over perfectly without any needed tweaks; Some VMs changed interface names due to network device changes. eth0 turned into ens8. I had to modify /etc/network/interfaces to change eth0 to ens8 to get virtio networking working.

CentOS

All my CentOS VMs failed to boot after migration due to a lack of virtio disk drivers in the initial RAM disk. The fix is to change the disk hardware to IDE mode (they boot fine this way) and then modify the initrd of each affected host:

sudo dracut --add-drivers "virtio_pci virtio_blk virtio_scsi virtio_net virtio_ring virtio" -f -v /boot/initramfs-`uname -r`.img `uname -r`
sudo sh -c "echo 'add_drivers+=\" virtio_pci virtio_blk virtio_scsi virtio_net virtio_ring virtio \"' >> /etc/dracut.conf"
sudo shutdown -h now

Once that’s done you can detach the hard disk and re-attach it back as SCSI (virtio) mode. Don’t forget to modify the options and change the boot order from ide0 to scsi0

Arch Linux

One of my Arch VMs had UUID configured which complicated things. The root device UUID changes in KVM virtio vs IDE mode. The easiest way to fix it is to boot this VM into an Arch install CD. Mount the root partition and then run arch-chroot /mnt/sda1. Once in the chroot runpacman -Sy kernel to reinstall the kernel and generate appropriate kernel modules.

mount /dev/sda1 /mnt
arch-chroot /mnt
pacman -Sy kernel

Also make sure to modify /etc/fstab to reflect appropriate device id or UUID (xen used /dev/xvda1, kvm /dev/sda1)

Windows

Create your Windows VM using non-virtio drivers (default settings in PVE.) Obtain the latest windows virtio drivers here and extract them somewhere memorable. Switch everything but the disk over to Virtio in the VM’s hardware config and reboot the VM. Go into device manager and point to extracted driver location for each unknown device.

To get Virtio disk to work, add a new disk to the VM of any size and SCSI (virtio) type. Boot the Windows VM and install drivers for that drive. Then shut down, remove that second drive, detach the primary drive and change to virtio SCSI. It should then come up with full virtio drivers.

All hosts

KVM has a guest agent like xenserver does called qemu-agent. Turn it on in VM options and install qemu-guest-agent in your guest. This KVM a bit more insight into your host.

Determine which VMs need guest agent installed:

qm agent $id ping

If nothing is returned, it means qemu-agent is working. You can test all your VMs at once with this one-liner (change your starting and finishing VM IDs as appropriate)

for id in {100..114}; do echo $id; qm agent $id ping; done

This little one-liner will output the VM ID it’s trying to ping and will return any errors it finds. No errors means everything is working.

Disable support nag

PVE has a support model and will nag you at each login. If you don’t like this you can change it like so (the line number might be different depending on which version you’re running:

vi +850 /usr/share/pve-manager/js/pvemanagerlib.js

Modify the line if (data.status !== ‘Active’); change it to

if (false)

Troubleshooting

Remove a failed node

See here: https://pve.proxmox.com/wiki/Proxmox_VE_4.x_Cluster#Remove_a_cluster_node

systemctl stop pvestatd.service
systemctl stop pvedaemon.service
systemctl stop pve-cluster.service
rm -r /etc/corosync/*
rm -r /var/lib/pve-cluster/*
reboot

Quorum never establishes / takes forever

I had a really strange issue where I was able to establish quorum with a second node, but after a reboot quorum never happened again. I re-installed that second node and re-joined it several times but I never got past the “waiting for quorum….” stage.

After much research I came across this article which explained what was happening. Corosync uses multicast to establish cluster quorum. Many switches (including mine) have a feature called IGMP snooping, which, without an IGMP querier, essentially means multicast never happens. Sure enough, after logging into my switches and disabling IGMP snooping, quorum was instantly established. The article above says this is not recommended, but in my small home lab it hasn’t produced any ill effects. Your mileage may vary. You can also configure your cluster to use unicast instead.

USB Passthrough not working properly

With Xenserver I was able to pass through the USB controller of my host to the guest (a JMICRON USB to ATAATAPI bridge holding a 4 disk bay.) I ran into issues with PVE, though. Using the GUI to pass the USB device did not work. Manually adding PCI passthrough directives (hostpci0: 00:14.1) didn’t work. I finally found on a little nugget on the PCI Passthrough page about how you can simply pass the entire device and not the function like I had in Xenserver. So instead of doing hostpci0: 00:14.1, I simply did hostpci0: 00:14 . That  helped a little bit, but I was still unable to fully use these drives simultaneously.

My solution was eventually to abandon PCI passthrough altogether in favor of just passing individual disks to the guest as outlined here.

Find the ID of the desired disks by issuing ls -l /dev/disk/by-id. You only need to know the UUIDs of the disks, not the partitions. Then modify the KVM config of your desired host (mine was located at /etc/pve/qemu-server/101.conf) and a new line for each disk, adjusting scsi device numbers and UUIDs to match:

scsi5: /dev/disk/by-id/scsi-SATA_ST5000VN000-1H4_Z111111

With that direct disk access everything is working splendidly in my FreeNAS VM.

Create & Mount disc images in Linux

When working with hard drives it is always a good idea to back the entire thing up before proceeding. I wanted to write down the procedure so I don’t keep forgetting it.

Create disc image

dd does the trick here.

sudo dd if=/dev/<drive device file> of=image.img bs=64M

If you wish to see the progress of the above dd command you can open up a separte window and issue the kill command

kill -USR1 `pidof dd`

Mount disc image read only

You can now disconnect the drive and work with its image instead (great for forensics or dealing with a dying drive.)

In later versions of Linux you can do this with losetup and partprobe.

sudo losetup -Pr -f <path to image file>
sudo losetup #find which loop device file corresponds with your image here
sudo mount -o ro /dev/<loopdevice>p<partition number> <mountpoint>

For example, this is what I did on my system for my aunt’s laptop (I was interested in the 2nd partition on her drive, the one containing Windows files)

sudo losetup -Pr -f susan-ssd.img
sudo losetup

NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO
/dev/loop0 0 0 0 1 /home/partimag/susan-ssd.img 0

sudo mount -o ro /dev/loop0p2 mount/

When you’re done, unmount the image and delete the image mapping:

umount <path to mount directory>
sudo losetup -d <loop file obtained earlier>

Quickly create xen disk images with dd

To this point I have been creating disk images with dd – pretty standard. The command I traditionally use is this:

sudo dd if=/dev/zero of=pc-bsd-10.1.img bs=1M count=40000

This command works but it takes some time as dd copies zeroes for every single part of the image.

It turns out that there is a much faster way to create a disk image, which I found out thanks to this article. Still using dd, you can simply use the -seek parameter to very quickly create your image

sudo dd if=/dev/zero of=pc-bsd-10.1.img bs=1 count=1 seek=40G

By setting bs and count to 1 but seek to 40G I have created a 40 gigabyte empty file image without having to wait for all those pesky zeroes to be written. Awesome.

Manipulate dd output with sed, tr, and awk

While improving a backup script I came across a need to modify the information output by the dd command. FreeNAS, the system the script runs on, does not have a version of dd that has the human readable option. My quest: make the output from dd human readable.

I’ve only partially fulfilled this quest at this point, but the basic functionality is there. The first step is to redirect all output of the dd command (both stdout and stderr) to a variable (this particular syntax is for bash)

DD_OUTPUT=$(dd if=/dev/zero of=test.img bs=1000000 count=1000 2>&1)

The 2>&1 redirects all output to the variable instead of the console.

The next step is to use sed to remove unwanted information (number records in and out)

sed -r '/.*records /d'

Next, remove any parenthesis. This is due to the parenthesis from dd output messing up the math that’s going to be done later.

tr -d '()'

-d ‘()’ simply tells tr to remove any instance of the characters between the single quotes.

And now, for awk. Awk allows you to manipulate certain pieces of a text file. Each section, separated by a space, is known as a record. Awk lets us edit some parts of text while keeping others intact. My expertise with awk is that of a n00b so I’m sure there is a more efficient way of doing this; nevertheless here is my solution:

awk '{print ($1/1024)/1024 " MB" " " $3 " " $4 " " $5*60 " minutes"  " (" ($7/1024)/1024 " MB/sec)" }'

Since dd outputs its information in bytes I’m having it divide by 1024 twice so the resulting number is now in megabytes. I also have it divide the seconds by 60 to return the number of minutes dd took. Additionally I’m re-inserting the parenthesis removed by the tr command now that the math is correctly done.

The last step is to pipe all of this together:

DD_OUTPUT=$(dd if=/dev/zero of=test.img bs=1000000 count=10000 2>&1)
DD_HUMANIZED=$(echo "$DD_OUTPUT" | sed -r '/.*records /d' | tr -d '()' | awk '{print ($1/1024)/1024 " MB" " " $3 " " $4 " " $5/60 " minutes"  " (" ($7/1024)/1024 " MB/sec)" }')

After running the above here are the results:

Original output:

echo "$DD_OUTPUT"
10000+0 records in
10000+0 records out
10000000000 bytes transferred in 103.369538 secs (96740299 bytes/sec)

Humanized output:

echo "$DD_HUMANIZED"
9536.74 MB transferred in 1.72283 minutes (92.2587 MB/sec)

The printf function of awk would allow us to round the calculations but I couldn’t quite get the syntax to work and have abandoned the effort for now.

Of course all this is not necessary if you have the correct dd version. The GNU version of dd defaults to human readability; certain BSD versions have the option of passing the msgfmt=human argument per here.


Update: I discovered that the awk method above will only print the one line it finds and ignore all other lines, which is less than ideal for scripting. I updated the awk syntax to do a search and replace (sub) instead so that it will print all other lines as well:

awk '{sub(/.*bytes /, $1/1024/1024" MB "); sub(/in .* secs/, "in "$5/60" mins "); sub(/mins .*/, "mins (" $7/1024/1024" MB/sec)"); printf}')

My new all in one line is:

DD_HUMANIZED=$(echo "$DD_OUTPUT" | sed -r '/.*records /d' | tr -d '()' | awk '{sub(/.*bytes /, $1/1024/1024" MB "); sub(/in .* secs/, "in "$5/60" mins "); sub(/mins .*/, "mins (" $7/1024/1024" MB/sec)"); printf}')