Compare two latest ZFS snapshots for differences

In my previous post about ZFS snapshots I discussed how to get the latest snapshot name. I came across a need to get the name of the second to last snapshot and then compare that with the latest. A little CLI kung-fu is required for this but nothing too scary.

The command of the day is: zfs diff.

zfs diff storage/mythTV@auto-20141007.1248-2h storage/mythTV@auto-20141007.1323-2h

If you get a permission error using zfs diff, you probably aren’t running as root. You will need to delegate the diff ZFS permission to the account you’re using:

zfs allow backup diff storage

where backup is the account you want to grant permissions for and storage is the dataset you want to grant permissions to.

The next step is to grab the two latest snapshots using the following commands.

Obtain latest snapshot:

zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1

Obtain the second to latest snapshot:

zfs list -t snapshot -o name -s creation -r storage/Documents | tail -2 | head -1

Putting it together in one line:

zfs diff `zfs list -t snapshot -o name -s creation -r storage/Documents | tail -2 | head -1` `zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1`
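The one-liner above runs zfs list twice. A minimal sketch of the same idea as two small shell functions follows; the function names are hypothetical, and it assumes the dataset has at least two snapshots and that -H (no header line) is available in your zfs list.

```shell
# Hypothetical helper: print the two newest snapshots of a dataset,
# oldest first, one per line.
latest_two_snapshots() {
    local dataset="$1"
    # -H drops the header row, -s creation sorts oldest to newest.
    zfs list -H -t snapshot -o name -s creation -r "$dataset" | tail -n 2
}

# Hypothetical helper: diff the second-to-last snapshot against the latest.
diff_latest_snapshots() {
    local dataset="$1" pair previous latest
    pair="$(latest_two_snapshots "$dataset")"
    previous="$(printf '%s\n' "$pair" | head -n 1)"
    latest="$(printf '%s\n' "$pair" | tail -n 1)"
    # With zero or one snapshot both names come out identical.
    if [ -z "$previous" ] || [ "$previous" = "$latest" ]; then
        echo "need at least two snapshots of $dataset" >&2
        return 1
    fi
    zfs diff "$previous" "$latest"
}

# Usage (dataset name taken from the examples above):
#   diff_latest_snapshots storage/Documents
```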

While doing some testing I came across an unfortunate bug with the ZFS diff function. Sometimes it won’t show files that have been deleted! It indicates that the folder where the deleted files were in was modified but doesn’t specify any further. This bug appears to affect all ZFS implementations per here and here. As of this writing there has been no traction on this bug. The frustrating part is the bug is over two years old.

The workaround for this regrettable bug is to use rsync with the -n parameter to compare snapshots. -n tells rsync to only do a dry run and not actually copy anything.

To use rsync for comparison, you have to do a little more CLI-fu to massage the output from the zfs list command so it’s acceptable to rsync, and you have to include the full mountpoint of both snapshots. When working with rsync, don’t forget the trailing slash.

rsync -vahn --delete /mnt/storage/Documents/.zfs/snapshot/`zfs list -t snapshot -o name -s creation -r storage/Documents | tail -2 | head -1 | sed 's/.*@//g'`/ /mnt/storage/Documents/.zfs/snapshot/`zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1 | sed 's/.*@//g'`/

Command breakdown:

Rsync arguments:
-v means verbose (lists files added/deleted)
-a means archive (recurse and preserve permissions, ownership, timestamps and symlinks)
-h means human readable numbers
-n means do a dry run only (no writing)
--delete will delete anything in the destination that’s not in the source (but not really since we’re doing -n; it will just print what it would delete on the screen)

Sed arguments:
s is the substitute (search and replace) command
/.*@ is a simple regex meaning anything up to and including the @ sign
/ What comes after this slash is the replacement for whatever was matched. In this case, we choose nothing, and move directly to the last argument
/g tells sed to keep looking for other matches on the line (not really necessary since we know there is only one in the stream)

All these backticks are pretty ugly, so for readability’s sake, save those commands into variables instead. The following is how you would do it in bash:

FIRST_SNAPSHOT="$(zfs list -t snapshot -o name -s creation -r storage/Documents | tail -2 | head -1 | sed 's/.*@//g')"
SECOND_SNAPSHOT="$(zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1 | sed 's/.*@//g')"
rsync -vahn --delete "/mnt/storage/Documents/.zfs/snapshot/$FIRST_SNAPSHOT/" "/mnt/storage/Documents/.zfs/snapshot/$SECOND_SNAPSHOT/"
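For repeated use, the same comparison could be wrapped in a function. This is only a sketch: the function name is hypothetical, and the mountpoint argument is an assumption about where the dataset is mounted (on my system that would be /mnt/storage/Documents).

```shell
# Hypothetical helper: dry-run rsync between the two newest snapshots
# of a dataset. $1 is the dataset, $2 is its mountpoint.
rsync_compare_latest() {
    local dataset="$1" mountpoint="$2" names previous latest
    # Keep only the part after the @, which is the snapshot directory name.
    names="$(zfs list -H -t snapshot -o name -s creation -r "$dataset" \
        | tail -n 2 | sed 's/.*@//')"
    previous="$(printf '%s\n' "$names" | head -n 1)"
    latest="$(printf '%s\n' "$names" | tail -n 1)"
    if [ -z "$previous" ] || [ "$previous" = "$latest" ]; then
        echo "need at least two snapshots of $dataset" >&2
        return 1
    fi
    # Trailing slashes make rsync compare the directory contents.
    rsync -vahn --delete \
        "$mountpoint/.zfs/snapshot/$previous/" \
        "$mountpoint/.zfs/snapshot/$latest/"
}

# Usage:
#   rsync_compare_latest storage/Documents /mnt/storage/Documents
```

Because the older snapshot is the source and the newer one is the destination, lines rsync marks as "deleting" are files that exist only in the newer snapshot, while files it lists for transfer were changed or removed since the older one.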

I think I’ll stop for now.

Blow away ZFS snapshots and watch the progress

For the last month I have had a testing system (FreeNAS) take ZFS snapshots of sample datasets every five minutes. As you can imagine, the snapshot count has risen quite dramatically. I am currently at over 12,000 snapshots.

In testing a backup script I’m working on I’ve discovered that replicating 12,000 snapshots takes a while. The initial data transfer completes in a reasonable time frame but copying each subsequent snapshot takes more time than the original data. Consequently, I decided to blow away all my snapshots. It took a while! I devised this fun little way to watch the progress.

Open two terminal windows. In terminal #1, enter the following:

bash
while [ true ]; do zfs list -H -t snapshot | wc -l; sleep 6; done

The above loads bash and then runs a simple loop to count the total number of snapshots on the system. The sleep command is only there because it takes a few seconds to return the results when you have more than 10,000 snapshots.

Alternatively you could make the output a little prettier by entering the following:

while [ true ]; do REMAINING="`zfs list -H -t snapshot | wc -l`"; echo "Snapshots remaining: $REMAINING" ; sleep 6; done

In terminal #2, enter the following (taken from here):

bash
for snapshot in `zfs list -H -t snapshot -o name`
do
    zfs destroy "$snapshot"
done

You can now hide terminal #2 and observe terminal #1. It will show you how many snapshots are left, refreshing the number every 6 seconds. Neat.
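If you would rather use a single terminal, the counting and the destroying can be folded into one loop. The following is a sketch only: the function name and the DRY_RUN variable are my own inventions, and it defaults to printing what it would destroy so nothing is lost by accident.

```shell
# Hypothetical helper: destroy every snapshot returned by `zfs list`,
# printing a countdown as it goes. Set DRY_RUN=0 to really destroy;
# by default it only prints what it would do.
destroy_all_snapshots() {
    local snapshots total remaining snapshot
    snapshots="$(zfs list -H -t snapshot -o name)"
    if [ -z "$snapshots" ]; then
        echo "No snapshots found"
        return 0
    fi
    total="$(printf '%s\n' "$snapshots" | wc -l | tr -d ' ')"
    remaining="$total"
    printf '%s\n' "$snapshots" | while read -r snapshot; do
        if [ "${DRY_RUN:-1}" = "0" ]; then
            zfs destroy "$snapshot"
        else
            echo "would destroy $snapshot"
        fi
        remaining=$((remaining - 1))
        echo "Snapshots remaining: $remaining of $total"
    done
}

# Usage:
#   destroy_all_snapshots              # dry run, prints only
#   DRY_RUN=0 destroy_all_snapshots    # really destroys every snapshot
```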

Fix Apache “Could not reliably determine name” error

For too many years now I have been too lazy to investigate the Apache error message I get whenever I restart the service:

 ... waiting apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName

I finally decided to investigate it today and found this post which describes a simple fix: create /etc/apache2/conf.d/name and add the ServerName directive to it.

sudo vim /etc/apache2/conf.d/name
ServerName jeppson.org

Change ServerName to be whatever you would like, and you’re good to go.
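The same change can be scripted instead of made in an editor. A minimal sketch, assuming the conf.d layout described above; the function name is hypothetical.

```shell
# Hypothetical helper: write a one-line ServerName config file.
# $1 is the file to create, $2 is the name to use.
write_servername() {
    local conf_file="$1" server_name="$2"
    printf 'ServerName %s\n' "$server_name" > "$conf_file"
}

# Usage, run as root (path and name taken from the post):
#   write_servername /etc/apache2/conf.d/name jeppson.org
#   apache2ctl configtest && service apache2 restart
```

Running apache2ctl configtest before the restart catches a typo in the new file before it can take the server down.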

Fix subsonic after 5.0 upgrade

Subsonic is a great media streaming program that I’ve used for a few years now. It was originally designed for streaming your private music collection but has since moved to allowing you to stream your video collection as well. It’s great for those of us who can’t bring their entire audio/visual library with them but would still like access to said library wherever they are.

I run subsonic behind an apache reverse proxy configuration similar to this one to allow it to run on the same server as other websites over port 80 and allow for HTTPS. (When I set up my subsonic server years ago it had no native support for HTTPS. The only way to have HTTPS was through another web server such as apache.)

After downloading and installing the Subsonic 5.0 upgrade I ran into a couple of issues, detailed below.

Issue #1

This is one I have experienced several times over the years: upgrading causes /etc/default/subsonic to be replaced with a default, clean version. This is a problem if you have a few customizations to your subsonic setup, in my case context-path and port. (My experience is with Debian. I don’t know if other distros behave in a similar manner or not.)

Resolution

Before you upgrade subsonic, make a backup copy of /etc/default/subsonic, then restore that copy after the upgrade. If you forgot to make a backup first, edit the new /etc/default/subsonic file and check the following:

  • Make sure the --port and --https-port arguments are correct
  • Re-add --context-path if you had it configured before. In my setup, I have configured --context-path=/subsonic to make my apache rewrite rules easier to manage.
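The backup step is easy to forget, so it may be worth scripting. A small sketch follows; the function name and the .pre-upgrade suffix are hypothetical choices of mine.

```shell
# Hypothetical helper: copy a config file aside before an upgrade and
# print the path of the backup. -p keeps the mode and timestamps.
backup_config() {
    local file="$1" backup
    backup="$file.pre-upgrade.$(date +%Y%m%d%H%M%S)"
    cp -p "$file" "$backup" && echo "$backup"
}

# Usage around an upgrade (run as root):
#   saved="$(backup_config /etc/default/subsonic)"
#   ... upgrade Subsonic ...
#   diff -u "$saved" /etc/default/subsonic   # see what the upgrade reset
```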

Issue #2

The video streaming function broke entirely. This was due to the fact that it was trying to reference a local IP address to stream the videos, despite my apache proxypass rule. This problem will only surface itself if you are running Subsonic behind a reverse proxy.

Resolution

After a few days of searching I finally came across this helpful post. To get video to work, simply add

 ProxyPreserveHost on

to the apache configuration file you used for your reverse proxy, then restart apache. This will fix the video streaming function but you will notice your HTTPS icon change (if you configured HTTPS), notifying you that some content on the page is not encrypted. This is due to subsonic streaming the video in plain HTTP instead of HTTPS.

Unfortunately the fix to that appears to require at least Apache 2.4.5. Since I have an earlier version, I was greeted with this lovely message:

Syntax error on line 15 of /etc/apache2/sites-enabled/subsonic:
Invalid command 'SSLProxyCheckPeerName', perhaps misspelled or defined by a module not included in the server configuration

Since I did not want to upgrade my version of apache, I simply decided to accept the risk of my video streams possibly being intercepted.
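To find out ahead of time whether a server is new enough, the installed version can be compared against 2.4.5. This is a sketch with hypothetical function names; it assumes GNU sort (for -V) and that apache2ctl -v prints a "Server version: Apache/x.y.z" line, as it does on Debian.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Extract "2.2.22" from a line such as
# "Server version: Apache/2.2.22 (Debian)".
apache_version() {
    apache2ctl -v | sed -n 's|^Server version: Apache/\([0-9.]*\).*|\1|p'
}

# Usage:
#   if version_at_least "$(apache_version)" 2.4.5; then
#       echo "SSLProxyCheckPeerName should be available"
#   fi
```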

Success.

WordPress wp-admin links are incorrect after site move

I’ve been scouring the internet for months for a fix to this particular issue. It must not be very common. Ever since I moved my site from one source (local IP address) to another (web facing URL) I have had issues with bad links (things pointing to the old address instead of the new one).

I have mostly resolved them (using methods from this post) but one vexing issue remained: links in wp-admin remained bad. Specifically, the column headers and pagination links in the All Posts section of managing the site all still pointed to the backend IP address instead of the domain name of the site.

I found a few bug reports mentioning this but no clear resolution. After investigating ticket 18944 I was put on the right track. One link from that ticket pointed me in the right direction, but the comment that really drove me to the resolution was the last one:

Any proxy configuration is “supported” by WordPress, you just need to remap the server vars based on whatever that particular proxy configuration is using.

This is proxy 101.

That made me realize that when I changed from a local IP to a public facing IP, I also went from direct access to the blog to being behind a reverse proxy. The issue I’ve been having is a proxy issue, not a site move issue. Thanks to the comment above, I learned I need to add a single line to wp-config.php:

$_SERVER[ 'HTTP_HOST' ]   = "jeppson.org";

Replace jeppson.org with the base URL of your site. That’s it! All links are correct now. Brilliant.

WordPress directing to old URL after upgrade to 4.0

I encountered an odd issue after upgrading one of my wordpress sites to version 4.0: the login page suddenly kept trying to redirect to its old address (I had changed addresses some time ago.)

I still don’t know how or why this happened, but after some googling the way to fix it was to follow instructions as outlined here.

wp-login.php can be used to (re-)set the URIs. Find this line:

require( dirname(__FILE__) . '/wp-load.php' );

and insert the following lines below:

//FIXME: do comment/remove these hack lines. (once the database is updated)
update_option('siteurl', 'http://your.domain.name/the/path' );
update_option('home', 'http://your.domain.name/the/path' );

You’re done. Test your site to make sure that it works right. If the change involves a new address for your site, make sure you let people know the new address, and consider adding some redirection instructions in your .htaccess file to guide visitors to the new location.

This worked for me – I could now log in with the correct URL.

 

Get the latest ZFS snapshot name

In my experiments with FreeNAS and ZFS I came across a need to obtain the name of the latest snapshot of a given dataset. For some odd reason this information is not readily available (that I could find, anyway.) After much googling I finally constructed an answer to my own question, “How do I get the name of the latest ZFS snapshot?”

The answer is via the zfs list command, using the -t, -o, -s, and -r options, and then piping the output to tail to grab the last result.

zfs list -t snapshot -o name -s creation -r storage/Documents | tail -1

Argument breakdown:

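  • -t type of ZFS item you want information for
  • -o list of properties of the type above you want to return
  • -s property to sort by (ascending, so the newest snapshot comes last)
  • -r recursively list everything under the named dataset (its snapshots, and those of any child datasets)
  • -1 (from tail): only return one line (the last one)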

The example above returns the name of the latest snapshot taken from my Documents dataset, which is on my storage volume.
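For scripting, the command can be wrapped in a small function. A minimal sketch; the function name is hypothetical, and -H (no header line) is added on the assumption that your zfs list supports it.

```shell
# Hypothetical helper: print the name of the newest snapshot of a dataset.
latest_snapshot() {
    local dataset="$1"
    # -H suppresses the header line, which is friendlier for scripts.
    zfs list -H -t snapshot -o name -s creation -r "$dataset" | tail -n 1
}

# Usage:
#   latest_snapshot storage/Documents
```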

FreeNAS on Xenserver with PVHVM support

In my current home setup I have a single server performing many functions thanks to Citrix Xenserver 6.2 and PCI Passthrough. This single box is my firewall, webservers, and NAS. My primary motivation for this is power savings – I didn’t want to have more than one box up 24/7 but still wanted all those separate services, some of which are software appliances that aren’t very customizable.

My current NAS setup is a simple Debian Wheezy virtual machine with the on-board SATA controller from the motherboard passed through to it. The VM runs a six drive software RAID 6 using mdadm and LVM volume management on top of it. Lately, though, I have become concerned with data integrity and my use of commodity drives. It prompted me to investigate ZFS as a replacement for my current setup. ZFS has many features, but the one I’m most interested in is its ability to detect corrupted files / blocks through checksums and repair them from redundant copies. This will put my mind at ease when it comes to the thousands of files that I have which are accessed infrequently.

I decided to try out FreeNAS, a NAS appliance which utilizes ZFS. After searching on forums it quickly became clear that the people at FreeNAS are not too keen on virtualizing their software. There is very little help to be had there in getting it to work in virtual environments. In the case of Xenserver, FreeNAS does work out of the box but it is considerably slower than bare metal due to its lack of support of Xen HVM drivers.

Fortunately, a friendly FreeNAS user posted a link to his blog outlining how he compiled FreeNAS to work with Xen. Since Xenserver uses Xen (it’s in the name, after all) I was able to use his re-compiled ISO (I was too lazy to compile my own) to test in Xenserver.

There are some bugs to get around to get this to work, though. Wired dad’s xenified FreeNAS doesn’t appear to like to boot in Xenserver, at least out of the box. It begins to boot but then hangs indefinitely on the following error:

run_interrupt_drive_hooks: still waiting after 60 seconds for xenbusb_nop_confighook_cb

This is the result of a bug in the version of qemu Xenserver uses. The bug causes BSD kernels to really not like the DVD virtual device in the VM and refuse to boot. The solution is to remove the virtual DVD drive. How, then, do you install FreeNAS without a DVD drive?

It turns out that all the FreeNAS installer does is extract an image file to your target drive. That file is an .xz file inside the ISO. To get wired dad’s FreeNAS Xen image to work in Xenserver, one must extract that .xz file from the ISO, expand it to an .img file, and then apply that .img file to the Xenserver virtual machine’s hard disk. The following commands can be run on the Xenserver host machine to accomplish this.

  1. Create a virtual machine with a 2GB hard drive.
  2. Mount the FreeNAS-xen ISO in loopback mode to get at the necessary file
    mkdir temp
    mount -o loop FreeNAS-9.2.1.5-RELEASE-xen-x64.iso temp/
  3. Extract the IMG file from the freeNAS ISO
    xzcat ~/temp/FreeNAS-x64.img.xz | dd of=FreeNAS_x64.img bs=64k
    

    Note that the IMG file is 2GB in size, which is larger than can sit in the root drive of a default install Xenserver. Make sure you extract this file somewhere that has enough space.

  4. Import that IMG file into the virtual disk you created with your VM in step 1.
    cd ..
    xe vdi-import uuid=<UUID of the 2GB disk created in step 1> filename=FreeNAS_x64.img
    

    This results in an error:

    The server failed to handle your request, due to an internal error.  The given message may give details useful for debugging the problem.
    message: Caught exception: VDI_IO_ERROR: [ Device I/O errors ]
    

    This error can be safely ignored – it did indeed copy the necessary files.
    Note: To obtain the UUID of the 2GB disk you created in step 1, run the “xe vdi-list” command and look for the name of the disk.

  5. Remove the DVD drive from the virtual machine. From Xencenter:
    Shut down the VM
    Mount xs-tools.iso
    Run this command in a command prompt:

    xe vm-cd-remove uuid=<UUID of VM> cd-name=xs-tools.iso
  6. Profit!
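Step 3 warns that the 2GB image will not fit on the root drive of a default Xenserver install. A quick way to check a candidate directory before extracting is sketched below; the function name is hypothetical and it assumes a df that supports the POSIX -P and -k options.

```shell
# Hypothetical helper: succeed if the filesystem holding directory $1
# has at least $2 kilobytes available.
has_free_kb() {
    local dir="$1" needed_kb="$2" available_kb
    # df -P prints one portable-format line per filesystem; column 4 is
    # the available space, in 1K blocks because of -k.
    available_kb="$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')"
    [ "$available_kb" -ge "$needed_kb" ]
}

# Usage: the image is about 2GB, so ask for 2 * 1024 * 1024 KB
# in the current directory.
#   has_free_kb . $((2 * 1024 * 1024)) || echo "not enough space here"
```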

There is one aspect I haven’t gotten to work yet, and that is Xenserver Tools integration. The important bit – paravirtualized networking – has been achieved so once I get more time I will investigate xenserver tools further.

Flashing updates to HP Proliant DL380 G5

A little while ago I bought an old HP Proliant DL380 G5 from ksl classifieds. I have used it off and on as a backup server but noticed that the drive performance was pretty abysmal. In an effort to fix this I decided to try and upgrade the ROM on the RAID controller it came with – an HP Smartarray P400 SAS/SATA controller.

It turned out to be more difficult than I expected. I first tried booting into Ubuntu server per this guide but I ran into problems getting it to work. I tried a 32-bit version of Ubuntu but I couldn’t even get that version to boot, maybe because of the 6GB of RAM this unit has.

In experimenting with this I learned a little bit about HP iLO syntax. This server comes with HP iLO 2, which has a web as well as an SSH interface. I encountered a need to hard reset the server (it was locked up) but the web admin “power off” button did nothing. I had to SSH into the iLO IP address and issue the following command:

power reset

I eventually abandoned my Ubuntu attempts and went with Arch Linux. Its live CD worked like a charm the first time – no fuss. I simply loaded the live CD, copied the update package to my current directory, marked it executable, and ran it.

scp nas:/storage/CP017698.scexe .
chmod +x CP017698.scexe
./CP017698.scexe

It took about five minutes.

I then set out to flash the BIOS, which hadn’t been updated since 2007. It was easier than the RAID controller because HP created .ISO images for this task. I obtained the BIOS from here. It was a Windows executable for some reason. The EXE extracted the various image files and had a handy how-to guide.

I took the iLO network CD installation route. I brought up the virtual media manager, loaded the ISO provided, and booted the machine. It brought up a simple flashing screen which updated the BIOS in about 5 minutes.

My proliant is feeling very hip and up to date now.

Configuring rsync between two machines

rsync is a powerful backup tool. I have used it over SSH before but never with its own internal daemon. Following this guide I configured the rsync daemon with a share and host-based access control. I then configured an rsync task in FreeNAS to pull pictures from the rsync server using the rsync daemon protocol rather than SSH (for speed). In this example my server is running Debian Wheezy and the client is running FreeNAS.

  1. On the server, create /etc/rsyncd.conf and add the following:
    max connections = 1
    log file = /var/log/rsync.log
    timeout = 300
    [Pictures]
    comment = All our pictures
    path = /storage/Pictures
    read only = yes
    list = yes
    uid = nobody
    gid = nogroup
    #auth users = mongrel
    hosts allow = 127.0.0.0/8 192.168.0.0/16
    #secrets file = /etc/rsyncd.secrets

    Note the only access control here is via source IP address. You can also have username/password access controls which I commented out.

  2. (Still on the server) start the rsync daemon
    rsync --daemon
  3. Configure the client. I used the FreeNAS GUI, which generated the following cron job
    rsync -r -t -z --delete  192.168.54.10::Pictures '/mnt/storage/Pictures/'

    Putting that to the test on the command line with an additional -P parameter to see progress, I saw that the command synchronized successfully. Excellent.
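Before letting the cron job loose with --delete, it can be reassuring to preview what a sync would do. The sketch below wraps the same rsync call with -n (dry run) and -v added; the function name is hypothetical.

```shell
# Hypothetical helper: preview a pull from an rsync daemon module.
# -n makes it a dry run, so --delete only reports what would be removed.
preview_daemon_sync() {
    local host="$1" module="$2" dest="$3"
    # host::module (double colon) talks to the rsync daemon, not SSH.
    rsync -r -t -z -n -v --delete "${host}::${module}" "$dest"
}

# Usage (values from the example above):
#   rsync 192.168.54.10::          # list the modules the daemon offers
#   preview_daemon_sync 192.168.54.10 Pictures /mnt/storage/Pictures/
```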

I tested transfer speeds using both the rsync daemon and the SSH method. There was a noticeable (8 MB/s) difference in transfer speeds. The rsync daemon is definitely faster.