Tag Archives: scripting

Xenserver SSH backup script

Citrix Xenserver is an amazing hypervisor, with pretty much every feature released to you for free. One thing it does not handle, however, is automated backups.

I have hacked together a backup script for myself that seems to work fairly well. It is my own mix of this and this script along with some logic for e-mail reporting that I came up with myself. It does not require any modification of the xenserver host at all (no need to mount anything!) with the exception of adding a public key to the xenserver’s authorized_keys file.

This script can be run on anything with BASH and the appropriate UNIX tools (even other xenservers if you want) and uses SSH to initiate and transfer the backups to a location of your choosing.

Place this script on the machine you want to initiate and store the backups on. It requires that you generate an SSH public/private key pair, which can be done with this command:

ssh-keygen

Add the resulting .pub file’s contents to your xenserver’s /root/.ssh/authorized_keys file (create it if it doesn’t exist.) Take note of the location of the private key file that was generated with that command and put that path in the script.
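
If you’d rather do that step from the command line, something like the following works, assuming the key pair was saved as ~/.ssh/backup and the xenserver lives at 192.168.1.1 (adjust for your environment):

cat ~/.ssh/backup.pub | ssh root@192.168.1.1 "mkdir -p /root/.ssh && cat >> /root/.ssh/authorized_keys"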

You can download the script here or view it below. This script has been working pretty well for me. Note it will not work with any VMs that have spaces in their names. I was too lazy to debug this so I just renamed the problem VMs to remove the spaces. Enjoy!

#!/bin/bash

# Modified August 30, 2015 by Nicholas Jeppson
# Taken from http://discussions.citrix.com/topic/345960-xenserver-automated-snapshots-script/ and modified to allow for ssh backups
# Additional insight taken from https://github.com/cepa/xen-backup

# [Usage & Config]
# This script involves two computers: a xenserver machine and a backup machine.
# Put this script on the backup server and run with any account that has privileges to the desired export directory.
#
# This script assumes you have already created a private and public key pair on the backup server
# as well as added the respective public key to the xenserver's authorized_keys file.
#
# [How it works]
# Step1: Snapshots each VM on the xen server
# Step2: Backs up the snapshots to specified location
# Step3: Deletes temporary snapshot created in step 1
# Step4: Deletes old VM backups as defined later in this file
#
# [Note]
# This script will only work with VMs that don't have spaces in their names
# Please make sure you have enough disk space for BACKUP_PATH, or backup will fail
#
# Tested on xenserver 6.5
#
# Modify the variables in the config section below to suit your particular environment's needs.
#

############### Config section ###############

#Location where you want backups to go
BACKUP_PATH="/mnt/backup/OS/"

#SSH configuration
SSH_CIPHER="arcfour128"

#Number of backups to keep
NUM_BACKUPS=2

#Xenserver ssh configuration
#This dictates the address and location of keyfiles as they reside on the xenserver
XEN_ADDRESS="192.168.1.1"
XEN_KEY_LOCATION="/home/backup/.ssh/backup"
XEN_USER="root"

#E-mail configuration
EMAIL_ADDRESS="youremail@provider.here"
EMAIL_SUBJECT="`hostname -s | awk '{print "["toupper($1)"]"}'` VM Backup Report: `date +"%A %b %d %Y"`"

########## End of Config section ###############

ret_code=0

#Replace any spaces found with backslashes because dd doesn't like them
BACKUP_PATH_ESCAPED="`echo $BACKUP_PATH | sed 's/ /\\\ /g'`"

# SSH command
remote_exec() {
chmod 0600 $XEN_KEY_LOCATION
ssh -i $XEN_KEY_LOCATION -o "StrictHostKeyChecking no" -c $SSH_CIPHER $XEN_USER@$XEN_ADDRESS $1
}

backup() {
echo "======================================================"
echo "VM Backup started: `date`"
begin="$(date +%s)"
echo "Backup location: ${BACKUP_PATH}"
echo

#add a slash to the end of the backup path if it doesn't exist
if [[ "$BACKUP_PATH" != */ ]]; then
BACKUP_PATH="$BACKUP_PATH/"
fi

#Build array of VM names
VMNAMES=$(remote_exec "xe vm-list is-control-domain=false | grep name-label | cut -d ':' -f 2 | tr -d ' '")

for VMNAME in $VMNAMES
do
echo "======================================================"
echo "$VMNAME backup started `date`"
echo
before="$(date +%s)"

# create snapshot
TIMESTAMP=`date '+%Y%m%d-%H%M%S'`
SNAPNAME="$VMNAME-$TIMESTAMP"
SNAPUUID=$(remote_exec "xe vm-snapshot vm=\"$VMNAME\" new-name-label=\"$SNAPNAME\"")

# export snapshot
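# The commented-out line below compresses the export with gzip on the xenserver and decompresses it locally; this trades extra CPU on the host for less network bandwidth.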
# remote_exec "xe snapshot-export-to-template snapshot-uuid=$SNAPUUID filename= | gzip" | gunzip | dd of="$BACKUP_PATH/$SNAPNAME.xva"
remote_exec "xe snapshot-export-to-template snapshot-uuid=$SNAPUUID filename=" | dd of="$BACKUP_PATH/$SNAPNAME.xva"

#if export was unsuccessful, return error
if [ $? -ne 0 ]; then
echo "Failed to export snapshot name = $snapshot_name$backup_ext"
ret_code=1

else
#calculate backup time, print results
after="$(date +%s)"
elapsed=`bc -l <<< "$after-$before"`
elapsedMin=`bc -l <<< "$elapsed/60"`
echo "Snapshot of $VMNAME saved to $SNAPNAME.xva"
echo "Backup completed in `echo $(printf %.2f $elapsedMin)` minutes"

# destroy snapshot
remote_exec "xe snapshot-uninstall force=true snapshot-uuid=$SNAPUUID"

#remove old backups (uses num_backups variable from top)
BACKUP_COUNT=$(find $BACKUP_PATH -name "$VMNAME*.xva" | wc -l)

if [[ "$BACKUP_COUNT" -gt "$NUM_BACKUPS" ]]; then

OLDEST_BACKUP=$(find $BACKUP_PATH -name "$VMNAME*" -print0 | xargs -0 ls -tr | head -n 1)
echo
echo "Removing oldest backup: $OLDEST_BACKUP"
rm $OLDEST_BACKUP
if [ $? -ne 0 ]; then
echo "Failed to remove $OLDEST_BACKUP"
fi
fi
echo "======================================================"
fi
done

end=$"$(date +%s)"
total_time=`bc -l <<< "$end-$begin"`
total_time_min=`bc -l <<< "$total_time/60"`
echo "Backup completed: `date`"
echo "VM Backup completed in `echo $(printf %.2f $total_time_min)` minutes"
echo
}

#Run the backup function and save all output to a variable, including stderr
BACKUP_OUTPUT=$(backup 2>&1)

#Clean up the output of the backup function
#Remove records count from dd, do some basic math to make dd's numbers more human readable
BACKUP_OUTPUT_HUMANIZED=$(echo "$BACKUP_OUTPUT" | sed -r '/.*records /d' | tr -d '()' \
| awk '{sub(/.*bytes /, $1/1024/1024/1024" GB "); sub(/in .* secs/, "in "$5/60" mins "); sub(/mins .*/, "mins (" $7/1024/1024" MB/sec)"); print}')

#Send a report e-mail with the backup results
echo "$BACKUP_OUTPUT_HUMANIZED" | mail -s "$EMAIL_SUBJECT" "$EMAIL_ADDRESS"

exit $ret_code
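
To make the backups truly automated, run the script from cron on the backup machine. A sample crontab entry that kicks it off at 2am every day (the script path is a placeholder):

0 2 * * * /home/backup/xenserver-backup.sh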

Fix Splunk lockout after exceeded quota

Recently I came across a situation with my home install of Splunk (free license) where the 500MB quota was exceeded three days in a row. I hadn’t checked Splunk for a few days so I was completely blindsided by it. The consequence of going over quota three days in a row? Losing the ability to do any searches in Splunk, which is a real downer.

The easiest, although least convenient, way to fix being locked out is to wait it out. If you go 30 days in a row without violating the license, Splunk will unlock itself. Splunk will still receive and index events during that time. The inability to search makes it really difficult to track down what the problem is, though, and I wasn’t happy waiting for 30 days before getting Splunk back.

Poking around on the Splunk forums I discovered that there is a way to get Splunk back: perform a fresh install and then migrate your database and settings over to it. This involves backing up a few things, then copying them over the fresh install’s default folders:

  • $SPLUNK_HOME/var/lib/splunk/defaultdb   #Default Splunk index, where all my data is held. If you have other indexes in here you’ll want to copy them too.
  • $SPLUNK_HOME/etc  #all your configuration files

Simply back up the above folders, install Splunk on a new machine, launch Splunk first so it will generate all the default files, then copy the files over to the new instance.

I went a step further and planned for the future. I wrote a quick and dirty script that will do all of this for you, even on the same machine (no need to copy to another machine). The script assumes you’re running a Red Hat derivative and have the correct Splunk install file in a predictable location. Update the locations of the Splunk directories and install files as needed and run as root.

#!/bin/bash

#Backup important directories
mkdir /opt/splunkbackup/
cp -al /opt/splunk/etc /opt/splunkbackup/
cp -al /opt/splunk/var/lib/splunk/defaultdb /opt/splunkbackup/

#Nuke splunk
/opt/splunk/bin/splunk stop
rm -rf /opt/splunk

#Reload from fresh start
rpm -iv --replacepkgs /home/nicholas/splunk-6.2.2-255606-linux-2.6-x86_64.rpm
/opt/splunk/bin/splunk start --accept-license

#Restore configuration files and indexes
/opt/splunk/bin/splunk stop
rm -rf /opt/splunk/etc
cp -al /opt/splunkbackup/etc /opt/splunk/
rm -rf /opt/splunk/var/lib/splunk/defaultdb
cp -al /opt/splunkbackup/defaultdb /opt/splunk/var/lib/splunk/
chown splunk:splunk -R /opt/splunk/
/opt/splunk/bin/splunk start

#Remove splunk backup
rm -rf /opt/splunkbackup

This will restore your searches, settings, and data. It won’t restore audit and other internal Splunk information, however. This script worked marvelously in getting my Splunk back.

ZFS remote replication script with reporting

In my experimentation with FreeNAS one thing I found lacking was the quality of reports it generated. I suppose one philosophy is that the smaller the e-mail the better, but my philosophy is that the e-mail should still be legible. Some of the e-mails I get from FreeNAS are simply bizarre and cryptic.

FreeNAS has an option to replicate your ZFS volumes to a remote source for backup. As far as I can tell there is no report e-mail when the replication is done, although there may be a cryptic e-mail if anything failed. I have grown used to daily status e-mails from my previous NAS solution (Debian with homegrown scripts.) I set out to do the same with FreeNAS and added a few features of my own along the way.

My script requires that you have already created an appropriate user and private/public key pair for both the source and destination machines (to allow for passwordless logins.) Instructions on how to do this are detailed below. You can download the script here.

Notes and observations

I learned quite a bit when creating this script. The end result is a script that e-mails me a beautiful report telling me anything that was added or removed since the last backup.

  • I used dd for greater speed as suggested here
  • I learned from here that the -R switch for ZFS send sends the entire snapshot tree.
  • The ZFS diff command currently has a bug where it does not always report deleted files / folders. It was opened two years ago, closed, and then recently re-opened as it is still an issue. It is the reason my script uses both ZFS diff and rsync, so I can continually see the bug in action.
  • When dealing with rsync, remember the / at the end!
  • In bash you can capture the output of a command into a variable (see the sketch after this list).
  • When echoing that variable, make sure you enclose it in quotes to preserve formatting.
  • Use sed’s -r flag to enable extended regular expressions.
  • In my testing the built-in FreeNAS replication script didn’t appear to replicate the latest snapshot. Interesting…
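
To illustrate the points above about capturing output and quoting, here is a minimal bash sketch (the dataset name is just an example):

#Capture command output into a variable, then echo it in quotes to preserve the line breaks
SNAPSHOT_LIST="$(zfs list -t snapshot -o name -s creation -r storage/Documents)"
echo "$SNAPSHOT_LIST"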

Below are the preliminary steps that are needed in order for the script to run properly.

Configure a user for replication

Create users

Either manually or through the FreeNAS UI, create a user that will run the backup script. Create that same user on the remote box (backup server.)

Generate RSA keys

Log into local host and generate RSA keys to allow for passwordless login to the system

cd .ssh
ssh-keygen

Make note of the filenames you gave it (the default is id_rsa and id_rsa.pub)

Authorize the resulting public key

Log into the remote host and add the public key of the local host to ~username/.ssh/authorized_keys, where username is the user you created above. One way to accomplish this is to copy the public key on the main server and paste it into the authorized_keys file of the backup server.

On the main server

(assuming the keyfile name is id_rsa)

cd .ssh
less id_rsa.pub

Copy the output on the screen in its entirety

On the backup server

Paste that public key into the authorized_keys file of the backup user

cd .ssh
vi authorized_keys
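
Before moving on, confirm that passwordless login actually works from the main server (the backup server’s hostname here is a placeholder):

ssh backup@backupserver hostname

If it prints the backup server’s hostname without asking for a password, the keys are set up correctly.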

Allow the new user to mount filesystems

FreeNAS requires you to specifically allow regular users to mount filesystems as described here.

  1. In the web interface under System > Sysctls > Add sysctl:
    Variable: vfs.usermount
    Value: 1
    Enabled: yes
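
For reference, the same setting can be toggled on the fly from the FreeNAS shell, although unlike the web interface entry it won’t persist across reboots:

sysctl vfs.usermount=1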

Grant ZFS permissions to the new user

In order for the dataset creation (full backup) feature to work the user we’ve created needs to have specific ZFS permissions granted as outlined here.

Run this command on both the main and backup servers:

zfs allow backup create,destroy,snapshot,rollback,clone,promote,rename,mount,send,receive,quota,reservation,hold storage

where backup is the new user and storage is the dataset name. I’m pretty sure you can make those permissions a little more fine-grained but I threw a bunch of them in there for simplicity’s sake.
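
You can verify the delegation took effect by running zfs allow with just the dataset name, which prints the currently delegated permissions:

zfs allow storage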

Configure HP iLo (optional)

My current backup server is an old HP ProLiant server equipped with HP iLO. I decided to add a section to my script that, when enabled in the variables section, uses iLO to power the machine on. If you do not have or wish to use iLO to control your backup server you can skip this section.

First, create a user in iLO and grant it Virtual Power and Reset permissions (all the rest can be denied.)

Next, copy the .pub file you created earlier to your computer so you can go into the iLO web interface and upload it. Make sure an iLO user exists and that the last part (the username) of the public key file matches exactly with the user you created in HP iLO.

When I first tried this, no matter what I did I couldn’t get passwordless login to work. After much weeping, wailing, and gnashing of teeth, I finally discovered from here that the -f and -C options of the ssh-keygen command are required for iLO to accept the key. I had to regenerate the private/public key pair with the following options, where backup is the user I created in iLO:

ssh-keygen -b 1024 -f backup -C backup
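
With the key in place, powering the backup server on from the script boils down to an SSH call against the iLO interface, something like this (the iLO address is a placeholder, and the exact power command can vary between iLO firmware versions):

ssh -i backup backup@ilo-address "power on"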

Compare two latest ZFS snapshots for differences

In my previous post about ZFS snapshots I discussed how to get the latest snapshot name. I came across a need to get the name of the second to last snapshot and then compare that with the latest. A little CLI kung-fu is required for this but nothing too scary.

The command of the day is: zfs diff.

zfs diff storage/mythTV@auto-20141007.1248-2h storage/mythTV@auto-20141007.1323-2h

If you get an error using zfs diff, you’re probably not running it as root. You will need to delegate the diff ZFS permission to the account you’re using:

zfs allow backup diff storage

where backup is the account you want to grant permissions for and storage is the dataset you want to grant permissions to.

The next step is to grab the two latest snapshots using the following commands.

Obtain latest snapshot:

zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1

Obtain the second to latest snapshot:

zfs list -t snapshot -o name -s creation -r storage/Documents | tail -2 | sort -r | tail -1

Putting it together in one line:

zfs diff `zfs list -t snapshot -o name -s creation -r storage/Documents | tail -2 | sort -r | tail -1` `zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1`

While doing some testing I came across an unfortunate bug with the ZFS diff function. Sometimes it won’t show files that have been deleted! It indicates that the folder the deleted files were in was modified, but doesn’t specify any further. This bug appears to affect all ZFS implementations per here and here. As of this writing there has been no traction on this bug. The frustrating part is that the bug is over two years old.

The workaround for this regrettable bug is to use rsync with the -n parameter to compare snapshots. -n tells rsync to do a dry run only and not actually copy anything.

To use Rsync for comparison, you have to do a little more CLI-fu to massage the output from the zfs list command so it’s acceptable to rsync as well as include the full mountpoint of both snapshots. When working with rsync, don’t forget the trailing slash.

rsync -vahn --delete /mnt/storage/Documents/.zfs/snapshot/`zfs list -t snapshot -o name -s creation -r storage/Documents | tail -2 | sort -r | tail -1 | sed 's/.*@//g'`/ /mnt/storage/Documents/.zfs/snapshot/`zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1 | sed 's/.*@//g'`/

Command breakdown:

Rsync arguments:
-v means verbose (lists files added/deleted)
-a means archive (preserve permissions)
-h means human readable numbers
-n means do a dry run only (no writing)
--delete will delete anything in the destination that’s not in the source (but not really, since we’re doing -n; it will just print what it would delete on the screen)

Sed arguments (the expression is 's/.*@//g'):
s initiates a search and replace
.*@ is a simple regex matching anything up to and including the @ sign
The text between the second and third slashes is the replacement. In this case it is empty, so the matched text is simply removed, leaving just the snapshot name.
g tells sed to keep looking for other matches (not really necessary if we know there is only one in the stream)

All these backticks are pretty ugly, so for readability’s sake, save those commands into variables instead. The following is how you would do it in bash:

FIRST_SNAPSHOT="`zfs list -t snapshot -o name -s creation -r storage/Documents | tail -2 | sort -r | tail -1 | sed 's/.*@//g'`"
SECOND_SNAPSHOT="`zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1 | sed 's/.*@//g'`"
rsync -vahn --delete /mnt/storage/Documents/.zfs/snapshot/$FIRST_SNAPSHOT/ /mnt/storage/Documents/.zfs/snapshot/$SECOND_SNAPSHOT/

I think I’ll stop for now.

Watch a zpool resilver in freeNAS

In my experiments with freeNAS and RaidZ I have come to miss some functionality I enjoyed with Linux and mdadm. One such function was being able to watch an array rebuild, or in ZFS parlance, a pool resilvering.

My inability to watch the resilvering stems from the difference between what the watch command in Linux does and what it does in FreeBSD. Watch in BSD snoops on a tty line whereas watch in Linux executes a command repeatedly.

One option is to install a watch utility for BSD that behaves like the Linux watch command; however, freeNAS is a small read-only image so installing things isn’t an option.

The way to do it in freeNAS is to use a while loop on the command line. After 20 minutes of googling I realized that there is no easy way to do this in one line like you can in bash (apparently the loop has to be spread across multiple lines in csh), so I had to settle for a quick script like the one outlined here.

My familiarity with scripts comes from BASH, but I quickly found out freeNAS doesn’t ship with BASH.

echo $shell
/bin/csh

edit: It turns out freeNAS does indeed ship with bash! It’s just not the default shell. Simply execute “bash” in the shell and use your familiar bash shell syntax to your heart’s content. The BASH equivalent of the script below is:

while [ true ]; do clear; zpool status; sleep 1; done

I’ll leave the rest in for reference’s sake.


I did some digging on how to write CSH scripts and thanks to this website was able to write a simple CSH script to execute a given command at a given interval indefinitely.

Here is my C style watch script:

#!/bin/csh

#A simple script to replace the Linux watch functionality. The first input it takes is how many seconds to refresh; the second, the command to run. If the command has arguments (spaces), it must be passed in quotes.

set INTERVAL = "$1"
set COMMAND = "$2"

while ( 1 )
        clear
        $COMMAND
        sleep $INTERVAL
end

I placed this script in the /tmp directory, made it executable with chmod +x, and then ran it with ./script.sh 1 "command"
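
For the resilver case specifically, that looks something like this (refreshing every two seconds):

./script.sh 2 "zpool status"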

Migrating a Windows 8.1 VM from Xen to Xenserver

Since Citrix recently released the entire Xenserver product to the world as free, open source software I thought I might give it a try. I have been pleased with the results and wanted to migrate my desktop VM over to it.

I’ve had a devil of a time getting my Windows 8.1 Professional virtual machine to migrate from plain Xen to Citrix Xenserver 6.2. My first mistake was not doing research before migrating hypervisor environments. While it is true that Citrix uses Xen as the underlying hypervisor, it turns out that there are still plenty of differences between the two environments.

I thought I would take the easy route by installing Citrix Xenconvert and converting my Xen Win8.1 VM to a format Xenserver likes. Although Xenconvert was designed for Physical to Virtual migration, I’ve found in the past that it works just as well for virtual to virtual migration.

After migrating to Xenserver I was greeted with the following friendly message:

[Screenshot: the Windows 8.1 INACCESSIBLE_BOOT_DEVICE blue screen. Lovely.]

As far as I can tell it was the Xen GPLPV drivers that were the culprit. This leads me to my second mistake: not having a proper backup of the VM. I didn’t keep a backup of this VM in the Xen-friendly format after I migrated it to xenserver. This was mainly due to laziness – a classic example of “one ounce of laziness now produces one ton of hard work later.”

Instead of simply booting the VM and removing the GPLPV drivers, I had to attempt to do it via the Windows PE environment on the Windows 8.1 disc. I first tried running the GPLPV uninstall script from here, modifying it to point to the c:\ drive for both files and registry settings. Alas, that didn’t appear to do anything.

I then tried to go through the registry via the Windows PE and remove any references to Xen-anything. Success! Or so I thought. It turns out that blindly plowing through the Windows registry without knowing exactly what you are doing has consequences. The VM would boot, but I could not for the life of me get the network drivers to work. As far as I can tell I corrupted something in the registry and despite my best efforts I couldn’t fix it.

At this point I had learned to back things up so I kept restoring from backups and messing with removing various registry keys. I continued this trial and error process for some time. After much weeping, wailing, and gnashing of teeth, I finally found the right combination of keys you must remove in order to boot again.

I took what I learned and updated the script from above to make it work with the WinPE environment. Download it here.

Boot into your PE environment of choice and run the script. When it’s finished, your VM will now be able to boot successfully.

The last step is to go into device manager and delete all xen-related drivers, then re-install them. After all that is said and done, your migration from xen to xenserver is complete. Repeat the exact same process to migrate from xenserver back to xen.