Tag Archives: linux

Verify backup integrity with rsync, sed, cat, and tee

Recently it became apparent that a large data transfer I did might have had some errors. I wanted to find an easy way to compare the source and destination to make sure that they were identical. My solution: rsync, sed, cat and tee

I have used rsync quite a bit but did not know about the –checksum flag until recently. When you run rsync with –checksum, it takes much longer, but it effectively does something similar to what a ZFS scrub does – it runs a checksum of every source file and compares it with the checksum of each destination file.  If there is a mismatch, rsync will overwrite the destination file with the source file to correct it.

In my situation I performed a large data migration from my old mdadam-based RAID array to my brand new ZFS array. During the transfer the disks were acting very strange, and at one point one of the disks even popped out of the array. The culprit turned out to be a faulty SATA controller. I bought a cheap 4 port SATA controller from Amazon for my new ZFS array. Do not do this! Spring the cash out for a better controller. The cheap ones, this one at least, only caused headache. I removed it and used the on-board SATA ports on my motherboard and the issues went  away.

All of those shennanigans made me wonder if there was corrupt data on my new ZFS array. A ZFS scrub repaired 15.5G of data! While I’m sure that fixed a lot of the issues, I realized there probably was still some corruption. This is how I verified it

rsync -Pahn --checksum /path/to/source /path/to/destination | tee migration.txt

-P shows progress, -a means archive, -h is for human readable measurements, and -n means dry run (don’t actually copy anything)

Tee is a cool utility that allows you to redirect output of a command both to a file and to standard output. This is useful if you want to see the verification take place in real time but also want to analyze it later.

After the comparison (which took a while!) I wanted to see the discrepancies. the -P flag lists each directory rsync checks as well as which files it detected. You can use sed in conjunction with cat to weed out the unwanted lines (directory listings) so that only the files with discrepancies are left.

 cat pictures.txt | sed '/\/$/d' | tee pictures-truncated.txt

The sed regex simply looks for any line ending in a / (directory listing) and removes that line. What is left is the files in question. You can combine the entire thing into one line like so

rsync -Pahn --checksum /path/to/source /path/to/destination | sed '/\/$/d' | tee migration.txt

In my case I wanted to compare discrepencies with rsync and make decisions on if I wanted to actually fix the issues. If you are 100% sure the source is OK to remove the destination completely, you can simply run

rsync -Pah --checksum --delete /path/to/source /path/to/destination

Xenserver – The uploaded patch file is invalid

It has been six months since I’ve applied any patches to my Citrix Xenserver hypervisor. Shame on me for not checking for updates. The thing has been humming along without any issues so it was easy to forget about.

In trying to install xenserver patches today I kept getting this error message no matter what I tried:

The uploaded patch file is invalid

After deleting everything I could (including files hanging out in /var/patch) I realized that I was simply Doing It Wrong™. D’oh!

When applying xenserver updates, the expected file extension is .xsupdate. I had been trying to xe patch-upload the downloaded zip file, whereas I was supposed to have extracted those zips before trying to upload them.  This quick little line unzipped all my patch ZIP files for me in one swoop:

find *.zip -exec unzip {} \;

Once everything was unzipped I was able to upload and apply the resulting .xsupdate files without issue.

Find out free disk space from the command line

The du command can be used in any Unix shell to determine how much space a folder is taking. It’s a quick way to determine which directory is using the most space.

If you run du on the root drive “/” then you will get an idea of how much space the drive is using. One unfortunate side effect of that command is if you have any mounted drives or other filesystems, it will search the disk usage of those folders as well. Aside from taking a long time that method provides data we don’t necessarily want.

Fortunately, there are a few switches you can use to fix this problem. The -x switch tells du to ignore other filesystems. Perfect.

There are a few other switches that prove useful. Below is a list of my favorites:

  • -x Ignore other filesystems
  • -h Use human readable numbers (kilobytes, megabytes, and gigabytes) instead of raw bytes
  • -s Provide a summary of the folder instead of listing the size of each file inside the folder
  • * (not a switch, but rather an argument to put after your directory.) This summarizes subfolders in that directory, as opposed to simply returning the size of that directory.

For troubleshooting low disk space errors, the following command will give you a good place to start:

du -hsx /*

You can then dig further by altering the above command to reflect a directory instead of /, or simply do * if you want a breakdown of the directory you are currently in.

FreeNAS on Xenserver with PVHVM support

In my current home setup I have a single server performing many functions thanks to Citrix Xenserver 6.2 and PCI Passthrough. This single box is my firewall, webservers, and NAS. My primary motivation for this is power savings – I didn’t want to have more than one box up 24/7 but still wanted all those separate services, some of which are software appliances that aren’t very customizable.

My current NAS setup is a simple Debian Wheezy virtual machine with the on-board SATA controller from the motherboard passed through to it. The VM runs a six drive software RAID 6 using mdadm and LVM volume management on top of it. Lately, though, I have become concerned with data integrity and my use of commodity drives. It prompted me to investigate ZFS as a replacement for my current setup. ZFS has many features, but the one I’m most interested in is its ability to detect and correct any and all corrupted files / blocks. This will put my mind at ease when it comes to the thousands of files that I have which are accessed infrequently.

I decided to try out FreeNAS, a NAS appliance which utilizes ZFS. After searching on forums it quickly became clear that the people at FreeNAS are not too keen on virtualizing their software. There is very little help to be had there in getting it to work in virtual environments. In the case of Xenserver, FreeNAS does work out of the box but it is considerably slower than bare metal due to its lack of support of Xen HVM drivers.

Fortunately, a friendly FreeNAS user posted a link to his blog outlining how he compiled FreeNAS to work with Xen. Since Xenserver uses Xen (it’s in the name, after all) I was able to use his re-compiled ISO (I was too lazy to compile my own) to test in Xenserver.

There are some bugs to get around to get this to work, though. Wired dad’s xenified FreeNAS doesn’t appear to like to boot in Xenserver, at least out of the box. It begins to boot but then hangs indefinitely on the following error:

run_interrupt_drive_hooks: still waiting after 60 seconds for xenbusb_nop_confighook_cb

This is the result of a bug in the version of qemu Xenserver uses. The bug causes BSD kernels to really not like the DVD virtual device in the VM and refuse to boot. The solution is to remove the virtual DVD drive. How, then, do you install FreeNAS without a DVD drive?

It turns out that all the FreeNAS installer does is extract an image file to your target drive. That file is an .xz file inside the ISO. To get wired dad’s FreeNAS Xen image to work in Xenserver, one must extract that .xz file from the ISO, expand it to an .img file, and then apply that .img file to the Xenserver virtual machine’s hard disk. The following commands can be run on the Xenserver host machine to accomplish this.

  1. Create a virtual machine with a 2GB hard drive.
  2. Mount the FreeNAS-xen ISO in loopback mode to get at the necessary file
    mkdir temp
    mount -o loop FreeNAS-9.2.1.5-RELEASE-xen-x64.iso temp/
  3. Extract the IMG file from the freeNAS ISO
    xzcat ~/temp/FreeNAS-x64.img.xz | dd of=FreeNAS_x64.img bs=64k
    

    Note that the IMG file is 2GB in size, which is larger than can sit in the root drive of a default install Xenserver. Make sure you extract this file somewhere that has enough space.

  4. Import that IMG file into the virtual disk you created with your VM in step 1.
    cd ..
    xe vdi-import uuid=<UUID of the 2GB disk created in step 1> filename=FreeNAS_x64.img
    

    This results in an error:

    The server failed to handle your request, due to an internal error.  The given message may give details useful for debugging the problem.
    message: Caught exception: VDI_IO_ERROR: [ Device I/O errors ]
    

    This error can be safely ignored – it did indeed copy the necessary files.
    Note: To obtain the UUID of the 2GB disk you created in step 1, run the “xe vdi-list” command and look for the name of the disk.

  5. Remove the DVD drive from the virtual machine. From Xencenter:
    Shutdown the VM
    Mount xs-toos.iso
    Run this command in a command prompt:

    xe vm-cd-remove uuid=<UUID of VM> cd-name=xs-tools.iso
  6. Profit!

There is one aspect I haven’t gotten to work yet, and that is Xenserver Tools integration. The important bit – paravirtualized networking – has been achieved so once I get more time I will investigate xenserver tools further.

Configuring rsync between two machines

rsync is a powerful backup tool. I have used it over SSH before but never with its own internal daemon. Following this guide I configured the rsync daemon with a share and host based access control. I then configured an rsync task in freeNAS to sync pictures between itself and the rsync server via rsync, not SSH (for speed). In this example my server is running Debian Wheezy and the client is running FreeNAS.

  1. On the server, create /etc/rsyncd.conf and add the following:
    max connections = 1
    log file = /var/log/rsync.log
    timeout = 300
    [Pictures]
    comment = All our pictures
    path = /storage/Pictures
    read only = yes
    list = yes
    uid = nobody
    gid = nogroup
    #auth users = mongrel
    list = yes
    hosts allow = 127.0.0.0/8 192.168.0.0/16
    #secrets file = /etc/rsyncd.secrets

    Note the only access control here is via source IP address. You can also have username/password access controls which I commented out.

  2. (Still on the server) start the rsync daemon
    rsync --daemon
  3. Configure the client. I used the freeNAS GUI which generated the following cron job
    rsync -r -t -z --delete  192.168.54.10::Pictures '/mnt/storage/Pictures/'

    Putting that to the test in the command line with an additonal -P parameter to see progress, I saw that the command synchronized successfully. Excellent.

I tested transfer speeds using both the rsync daemon and ssh method. There was a noticeable (8 MB/s) difference in transfer speeds. The rsync way is definitely faster.

Watch a zpool resilver in freeNAS

In my experiments with freeNAS and RaidZ I have come to miss some functionality I enjoyed with Linux and mdadm. One such function was being able to watch an array rebuild, or in ZFS parlance, a pool resilvering.

My inability to watch the resilvering stems from the difference between what the watch command in Linux does and what it does in FreeBSD. Watch in BSD snoops on a tty line whereas watch in Linux executes a command repeatedly.

One option is to install a watch utility for BSD that behaves as the Linux watch command; however, freeNAS is a small read only image so installing things isn’t an option.

The way to do it in freeNAS is to use a while loop in the command line. After 20 minutes of googling I realized that there is no easy way to do this in one line like you can in bash (something about things requiring to be on a new line), so I had to settle for a quick script like one outlined here.

My familiarity with scripts comes from BASH, but I quickly found out freeNAS doesn’t ship with BASH.

echo $shell
/bin/csh

edit: It turns out freeNAS does indeed ship with bash! It’s just not the default shell. Simply execute “bash” in the shell and use your familiar bash shell syntax to your heart’s content. The BASH equivalent of the script below is:

while [ true ]; do clear; zpool status; sleep 1; done

I’ll leave the rest in for reference sake.


I did some digging on how to write CSH scripts and thanks to this website was able to write a simple CSH script to execute a given command at a given interval indefinitely.

Here is my C style watch script:

#!/bin/csh

#A simple script to replace the Linux watch functionality. The first input it takes is how many seconds to refresh; the second, the command to run. If the command has arguments (spaces), it must be passed in quotes.

set INTERVAL = "$1"
set COMMAND = "$2"

while ( 1 )
        clear
        $COMMAND
        sleep $INTERVAL
end

I placed this script in the /tmp directory, made it executable by running chmod +x, and then executing it by running ,/script.sh 1 “command”

Check hard drives for bad sectors in Linux/BSD

It turns out that when hard drives fail, they don’t all fail completely. In fact, most fail silently, getting worse and worse as time moves on, causing bitrot and other issues.

I had a suspicion that one of my drives was failing so I thought I would test it. The tool for the job: badblocks.

badblocks writes data to the drive and then reads it back to ensure it gets the expected result. I have learned a lot about hard drive failure lately and now subscribe to running badblocks on every new hard drive I receive to ensure it is a good drive. The command I use is:

badblocks -wsv <device>

This is a destructive write test – it will wipe the disk. You can also run a non-destructive test, but for new disks you can go ahead and wipe them. I also use badblocks to ensure old disks can still be trusted with data. It’s great for “burn in” testing to ensure a drive won’t fail.


Update 3/1/19: If you encounter the following error:

badblocks: Value too large for defined data type invalid end block (5860522584): must be 32-bit value

It means your drive is too big for badblocks to recognize using the default sector size. Fix this by specifying a 4k sector size:

badblocks -b 4096 -wsv <device>

Thanks to Ubuntu Forums for the info.

Manually install Sophos UTM update

In the event that you want to install a soft released update to your Sophos UTM appliance before it has been picked up by auto update, you must download and install the patch manually. There is no way to do this in the GUI (yet.) Procedure taken from this helpful post (thanks, heartbleed!)

  1. Shell into the firewall and navigate to /var/up2date/sys
    cd /var/up2date/sys
  2. wget the patch file (.tgz.gpg extension)
    wget ftp://ftp.astaro.com/UTM/v9/up2date/u2d-sys-9.205012-206035.tgz.gpg
  3. Invoke auisys.plx with the –showdesc paramater
    auisys.plx --showdesc
  4. Install the update.
    cc system_up2date system_update

    Alternatively you can go into the web interface and schedule the install from there.

Easy peasy.

Configure SSMTP to use SSL/TLS connections

SSMTP is a very simple SMTP mail program which is used to send e-mails to a target server. It’s not a fully feature e-mail server but simply passes e-mails on. I first became acquainted with it because it’s the only mail server you can install on Citrix Xenserver. I now use it with all my servers because it’s very easy to configure.

Simply install it via command line:

sudo apt-get install ssmtp

There is only one config file to worry about: /etc/ssmtp/ssmtp.conf. To configure it to use an SSL connection (for gmail or if, like me, your ISP blocks port 25), add the following options, changing the brackets with your mailserver, username, and password.

mailhub=<mailserver>:587
UseSTARTTLS=YES
AuthUser=<username>
AuthPass=<password>
AuthMethod=DIGEST-MD5

If you just pasted the above config into your ssmtp.conf be sure to check the resulting config file for duplicate entries.

It’s as simple as that. All outgoing mail will be sent to the server specified above.