All posts by nicholas

proxmox suspend & resume scripts

Update 12/17/2019: Added logic to wait for VM to be suspended before suspending the shypervisor

Update 12/8/2019: After switching VMs I needed to tweak the pair of scripts. I modified it to make all the magic happen on the hypervisor; the VM simply needs to SSH into the hypervisor and call the script. The hypervisor now also needs access to SSH via public key to the VM to tell it to suspend.

#!/bin/sh
#ProxMox suspend script part 1 of 2
#To be run on the VM 
#All this does is call the suspend script on the hypervisor
#This could also just be a bash alias

####### Variables #########
HYPERVISOR=        #Name / IP of the hypervisor
SSH_USER=          #User to SSH into hypervisor as
HYPERVISOR_SCRIPT= #Path to part 2 of the script on the hypervisor

####### End Variables ######

#Execute server suspend script
ssh $SSH_USER@$HYPERVISOR "$HYPERVISOR_SCRIPT" &
#!/bin/bash
#ProxMox suspend script part 2 of 2
#Script to run on the hypervisor, it waits for VM to suspend and then suspends itself
#It relies on passwordless sudo configured on the VM as well as SSH keys to allow passwordless SSH access to the VM from the hypervisor
#It resumes the VM after it resumes itself
#Called from the VM

########### Variables ###############

VM=             #Name/IP of VM to SSH into
VM_SSH_USER=    #User to ssh into the vm with
VMID=           #VMID of VM you wish to suspend

########### End Variables############

#Tell guest VM to suspend
ssh $VM_SSH_USER@$VM "sudo systemctl suspend"

#Wait until guest VM is suspended, wait 5 seconds between attempts
while [ "$(qm status $VMID)" != "status: suspended" ]
do 
    echo "Waiting for VM to suspend"
    sleep 5 
done

#Suspend hypervisor
systemctl suspend

#Resume after shutdown
qm resume $VMID

I have a desktop running ProxMox. My GUI is handled via a virtual machine with physical hardware passed through it. The challenge with this setup is getting suspend & resume to work properly. I got it to work by suspending the VM first, then the host; on resume, I power up the host first, then resume the VM. Doing anything else would cause hardware passthrough problems that would force me to reboot the VM.

I automated the suspend process by using two scripts: one for the VM, and one for the hypervisor. The first script is run on the VM. It makes an SSH command to the hypervisor (thanks to this post) to instruct it to run the second half of the script; then initiates a suspend of the VM.

The second half of the script waits a few seconds to allow the VM to suspend itself, then instructs the hypervisor to also go into suspend. I had to split these into two scripts because once the VM is suspended, it can’t issue any more commands. Suspending the hypervisor must happen after the VM itself is suspended.

Here is script #1 (to be run on the VM) It assumes you have already set up a private/public key pair to allow for passwordless login into the hypervisor from the VM.

#!/bin/sh
#ProxMox suspend script part 1 of 2
#Tto be run on the VM so it suspends before the hypervisor does

####### Variables #########
HYPERVISOR=HYPERVISOR_NAME_OR_IP
SSH_USER=SSH_USER_ON_HYPERVISOR
HYPERVISOR_SCRIPT_LOCATION=NAME_AND_LOCATION_OF_PART2_OF_SCRIPT

####### End Variables ######

#Execute server suspend script, then suspend VM
ssh $SSH_USER@$HYPERVISOR  $HYPERVISOR_SCRIPT_LOCATION &

#Suspend
systemctl suspend

Here is script #2 (which script #1 calls), to be run on the hypervisor

#!/bin/bash
#ProxMox suspend script part 2 of 2
#Script to run on the hypervisor, it waits for VM to suspend and then suspends itself
#It resumes the VM after it resumes itself

########### Variables ###############

#Specify VMid you wish to suspend
VMID=VMID_OF_VM_YOU_WANT_TO_SUSPEND

########### End Variables############

#Wait 5 seconds before doing anything to allow for VM to suspend
sleep 5

#Suspend hypervisor
systemctl suspend

#Resume after shutdown
qm resume $VMID

It works on my machine 🙂

Primary VGA passthrough in ProxMox

I recently decided to amplify my VFIO experience by experimenting with passing my primary display adapter to a VM in proxmox. Previously I had just run tasksel on the proxmox host itself to install a GUI. I wanted better separation from the server side of proxmox and the client side. I also wanted to be able to distro-hop while maintaining the proxmox backend.

Initially I tried following my guide for passing through a secondary graphics card but ran into a snag. It did not work with my primary card and kept outputting these errors:

device vfio-pci,host=09:00.0,id=hostdev0,bus=pci.4,addr=0x0: Failed to mmap 0000:09:00.0 BAR 1. Performance may be slow

After much digging I finally found this post which explained I needed to unbind a few things for it to work properly:

echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind

After more searching I found this post on reddit which had a nifty script for automating this when VM startup is desired. I tweaked it a bit to suit my needs.

Find your IDs for GPU by doing lspci and looking for your adapter. Find the IDs by running lspci -n -s <GPU location discovered with lspci>. Lastly VMID is the promxox ID for the VM you wish to start.

#!/bin/sh
#Script to launch Linux desktop
#Adapted from from https://www.reddit.com/r/VFIO/comments/abfjs8/cant_seem_to_get_vfio_working_with_qemu/?utm_medium=android_app&utm_source=share

GPU=09:00
GPU_ID="10de 1c82"
GPU_AUDIO="10de 0fb9"
VMID=116

# Remove the framebuffer and console
echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind

# Unload the Kernel Modules that use the GPU
modprobe -r nvidia_drm
modprobe -r nvidia_modeset
modprobe -r nvidia
modprobe -r snd_hda_intel

# Load the vfio kernel module
modprobe vfio
modprobe vfio_iommu_type1
modprobe vfio-pci

#Assign card to vfio-pci
echo -n "${GPU_ID}" > /sys/bus/pci/drivers/vfio-pci/new_id
echo -n "${GPU_AUDIO}" > /sys/bus/pci/drivers/vfio-pci/new_id

#Start desktop
sudo qm start $VMID

#Wait here until the VM is turned off
while [ "$(qm status $VMID)" != "status: stopped" ] 
do
 sleep 5
done

#Reassign primary graphics card back to host
echo -n "0000:${GPU}.0" > /sys/bus/pci/drivers/vfio-pci/unbind
echo -n "0000:${GPU}.1" > /sys/bus/pci/drivers/vfio-pci/unbind
echo -n "${GPU_ID}" > /sys/bus/pci/drivers/vfio-pci/remove_id
echo -n "${GPU_AUDIO}" > /sys/bus/pci/drivers/vfio-pci/remove_id
rmmod vfio-pci
modprobe nvidia
modprobe nvidia_drm
modprobe nvidia_modeset
modprobe snd_hda_intel
sleep 1
echo -n "0000:${GPU}.0" > /sys/bus/pci/drivers/nvidia/bind
echo -n "0000:${GPU}.1" > /sys/bus/pci/drivers/snd_hda_intel/bind
sleep 1
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/bind
echo 1 > /sys/class/vtconsole/vtcon0/bind
echo 1 > /sys/class/vtconsole/vtcon1/bind

With my primary adapter passed through I realized I also want other things passed through, mainly USB. I tried Proxmox’s USB device passthrough options but it doesn’t work well with USB audio (stutters and choppy.) I wanted to pass through my whole USB controller to the VM.

This didn’t work as well as I had planned due to IOMMU groups. A great explanation of IOMMU groups can be found here. I had to figure out which of my USB controllers were in which IOMMU group to see if I could pass the whole thing through or not (some of them were in the same IOMMU group as SATA & network controllers, which I did not want to pass through to the VM.)

Fortunately I was able to discover which USB controllers I could safely pass through first by running lspci to see the device ID, then running find to see which IOMMU group it was in, then checking against lspci to see what other devices were in that group. The whole group comes over together when you pass through to a VM.

First determine the IDs of your USB controllers

lspci | grep USB

01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ba (rev 02)
08:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03)
0a:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 145c
43:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 145c

Next get which IOMMU group these devices belong to

find /sys/kernel/iommu_groups/ -type l|sort -h|grep '01:00.0\|08:00.0\|0a:00.3\|43:00.3'

/sys/kernel/iommu_groups/14/devices/0000:01:00.0
/sys/kernel/iommu_groups/15/devices/0000:08:00.0
/sys/kernel/iommu_groups/19/devices/0000:0a:00.3
/sys/kernel/iommu_groups/37/devices/0000:43:00.3

Then see what other devices use the same IOMMU group (the group is the number after /sys/kernel/iommu_groups/)

find /sys/kernel/iommu_groups/ -type l|sort -h | grep '/14\|/15\|/19\|/37'

/sys/kernel/iommu_groups/14/devices/0000:01:00.0
/sys/kernel/iommu_groups/14/devices/0000:01:00.1
/sys/kernel/iommu_groups/14/devices/0000:01:00.2
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/14/devices/0000:02:04.0
/sys/kernel/iommu_groups/14/devices/0000:02:05.0
/sys/kernel/iommu_groups/14/devices/0000:02:06.0
/sys/kernel/iommu_groups/14/devices/0000:02:07.0
/sys/kernel/iommu_groups/14/devices/0000:04:00.0
/sys/kernel/iommu_groups/14/devices/0000:05:00.0
/sys/kernel/iommu_groups/14/devices/0000:06:00.0
/sys/kernel/iommu_groups/15/devices/0000:08:00.0
/sys/kernel/iommu_groups/19/devices/0000:0a:00.3
/sys/kernel/iommu_groups/37/devices/0000:43:00.3

As you can see one of my USB controllers (01:00.0) has a whole bunch of stuff in its IOMMU group, so I don’t want to use it lest I bring all those other things into the VM with it. The other three, though, are isolated in their groups and thus are perfect for passthrough.

In my case I passed through 0a:00.3 & 43:00.3 as 08:00.0 is a PCI card I want passed through to my Windows VM. This passed through about 2/3 of the USB ports on my system to my guest VM.

Migrating from OPNSense to Ubiquiti Unifi Secure Gateway

I love the Ubiquiti Unifi interface. The only thing missing in my environment was the gateway. I had no complaints with my OPNSense firewall, but that missing section on the Unifi controller homepage haunted me, so I took the plunge and got a Unifi Secure Gateway Pro 4.

Basic Configuration

Initial setup

Official documentation is pretty detailed. Before you install your USG you will want to go into your controller and define your current network by going to Settings / Networks / LAN. This is where you specify DHCP scope and settings. I did not do this and struggled to get DHCP running properly as a result. Be sure to also set NTP settings, as these will also be applied to your USG.

To configure your USG for adoption, hop on the 192.168.1.0/24 network and sign into 192.168.1.1 via a web browser. Username and password are both ubnt. On this screen you can specify WAN and LAN settings. Configure your USG to match the network and gateway settings you’ve defined in your controller and hit apply. Now you can go into your controller and adopt the firewall into your environment.

Firewall

Basic port forwarding rules, static routes, and firewall rules can all be handled in the controller GUI via settings / Routing & Firewall. The GUI assumes your gateway only has one public IP address going to it. If you have multiple public IPs then you will need to configure them in config.gateway.json (see the Advanced Configuration section below.)

DHCP

As stated in the Initial Setup section, this is handled by the controller. You can specify a DHCP scope in the USG’s limited web interface but any settings there are quickly overwritten by the controller pushing out its configuration.

DHCP reservations are handled in the controller via the clients tab (on the left.) Open the client you want to make a reservation for, click the settings cog (top right), click Network, then click “Use Fixed IP Address” and specify the IP you want that device to use.

You can also specify advanced DHCP settings under Settings / Services / DHCP.

Seeing active DHCP leases requires dropping to the CLI on the USG. SSH into the USG and run:

show dhcp leases

Traffic limiting

You can create User Groups in the Unifi interface which define maximum bandwidth usage. You can then assign that User group to a specific client in the Unifi interface.

NAT

The Unifi GUI only supports Destination NAT (DNAT) and only supports the gateway’s WAN IP. You can configure this via settings / Routing & Firewall / Port Forwarding. For more advanced configuration, see below.

Advanced Configuration

A major downside of the USG is that the Unifi interface, while awesome, is extremely limited when it comes to Firewall functions. Thus, most configuration has to be done in the command line to get it to compete with OPNSense.

The core concept with the Unifi ecosystem is that devices are controlled by the Unifi Network Management controller. Thus, with the USG, any changes made to the firewall itself are overwritten by the controller on next provision.

In order to persist any command line changes you make, you must create a config.gateway.json file as outlined here, then copy it to your controller, which will then push the config to your USG on each provision. You will run into problems if you get this json file wrong (reboot loops) so you want to be very sure everything is correct in that file. I recommend a json validator (or an IDE like VS Code.)

One good shortcut I’ve found when googling how to do things is to simply use “edgerouter” instead of “USG” for the search term. The syntax to configure the edgerouter is identical (they both run EdgeOS.)

The most foolproof way to get a config.gateway.json that works is to run the configure commands manually on your USG, then when everything is how you want it, run this command to generate the running config in json format:

mca-ctrl -t dump-cfg > config.txt

You can then read config.txt and look for the specific settings you configured and save them into your config.gateway.json. The JSON syntax follows the CLI commands, with each part of the command broken into different brackets and quotes. An example config.gateway.json looks like this:

{
  "service": {
    "nat": {
      "rule": {
        "4500": {
          "description": "port_forward_WAN2",
          "destination": {
            "address": "100.64.100.100",
            "port": "22"
          },
          "inbound-interface": "eth3",
          "inside-address": {
            "address": "192.168.1.100"
          },
          "protocol": "tcp",
          "type": "destination"
        }
      }
    }
  },
  "vpn": {
    "ipsec": {
      "site-to-site": {
        "peer": {
          "yyyy.ignorelist.com": {
            "authentication": {
              "id": "xxxx.ignorelist.com"
            },
            "local-address": "xxxx.ignorelist.com"
          }
        }
      }
    }
  }
}

DNS

Use the static-host-mapping parameter to specify static DNS entries. Make sure the fqdn is listed in your config, otherwise they may or may not work. Example snippet:

{
...
  "system": {
    "static-host-mapping": {
      "host-name": {
        "firewall": {
          "alias":[
            "firewall.jeppsonlocal"
          ],
          "inet": [
            "192.168.1.1"
          ]
        }
      }
    }
  }
...
}

Live traffic graphs

Sadly there is no live / realtime graphs in the UniFi interface. It’s still possible to get that information if you drop to CLI; however the utilities to see this are not installed by default – you will need to install them (iftop & bmon in my case.) Thanks to this helpful reddit post that got me going.

As of this writing the USG PRO 4 is based in Debian Wheezy, so you will need to add those repositories to the device in order to use apt-get to install iftop & bmon.

Be sure not to get the wrong Debian version. Also be sure not to issue apt-get upgrade – bad things will happen in both cases and you will need to hard reset your device to fix them.

You can add the repositories using the firewall configure command. These can be translated into a config.gateway.json if desired, but I decided not to since this is a pretty low level change that you might not want to happen on future devices. Also note that you will have to re-install these tools after a firmware upgrade.

configure
#Main wheezy archive
set system package repository wheezy components 'main contrib non-free'
set system package repository wheezy distribution wheezy
set system package repository wheezy url 'http://archive.debian.org/debian/'
commit
save
exit

sudo apt-get update
sudo apt-get install iftop bmon

If you want to undo the above changes, substitute set with delete:

#to remove:
configure
delete system package repository wheezy
commit

1:1 NAT

For 1:1 NAT you need 3 NAT rules (Destination NAT, Source NAT, and Hairpin NAT) and a corresponding firewall rule. Example:

{
    "service": {
        "nat": {
            "rule": {
                "1000": {
                    "description": "Mail 1:1 DNAT",
                    "destination": {
                        "address": "1.1.1.1",
                        "port": "25,80,443,465,587,993,995"
                    },
                    "inbound-interface": "pppoe0",
                    "inside-address": {
                        "address": "192.168.1.1"
                    },
                    "protocol": "tcp",
                    "type": "destination"
                },
                "3000": {
                    "description": "Mail 1:1 Hairpin NAT",
                    "destination": {
                        "address": "1.1.1.25",
                        "port": "25,80,443,465,587,993,995"
                    },
                    "inbound-interface": "eth0",
                    "inside-address": {
                        "address": "192.168.1.25"
                    },
                    "protocol": "tcp",
                    "type": "destination"
                },
                "5000": {
                    "description": "Mail 1:1 SNAT",
                    "type": "source",
                    "source": {
                        "address": "192.168.1.25"
                    }
                }
            }
        },
        "firewall": {
            "name": {
                "WAN_IN": {
                    "rule": {
                        "1000": {
                            "action": "accept",
                            "description": "Mail 1:1 DNAT",
                            "destination": {
                                "address": "192.168.1.25",
                                "port": "25,80,443,465,587,993,995"
                            },
                            "protocol": "tcp",
                            "log": "enable"
                        }
                    }
                }
            }
        }
    }
}

OpenVPN Site to Site

My OPNSense router had a site-to-site OpenVPN going with an OpenWRT router. Details on how to configure this are in a separate blog post here.


That covers the basics of what my OPNSense firewall was doing. It’s a bit of a learning curve but once I got past that it’s been working really well.

sideload Gears of War 5 on Windows 10

Sideloading Gears 5 is similar to sideloading Gears 4. You need to grab the URL the store is using to download the game with a proxy tool like fiddler, then download that URL with a download manager.

Gears 5 is an msixvc file instead of an eappx file. You can still install this via the add-appxpackage command.

I ran into issues trying to run add-appxpackage from a network drive. It worked after copying to a local drive and running the command again.

Why go through the trouble? Because the Microsoft store’s DRM is so bad it requires complete re-installs when anything goes wrong. This is very annoying for those of us on less than gigabit internet connections trying to reinstall a 60+ GB game.

Transfer linode VM over ssh

I love Linode for their straightforward pricing. I can use them for temporary infrastructure and not have to worry about getting overcharged. When it comes time to transfer infrastructure back, the process is fairly straightforward. In my case I wanted to keep a disk image of my Linode VM for future use.

The linode documentation is very good. I used their copy an image over ssh article combined with their rescue and rebuild article sprinkled with a bit of gzip compression and use of pv to grab my linode image locally, complete with a progress bar.

First, boot your linode into recovery mode via dashboard / Linodes / <name of your linode>, then click on Rescue tab, map your drives as needed.

Launch console (top right) to get into the recovery shell. In my case I wanted to SSH into my linode to grab the image, so I set a password and started the ssh service:

passwd
/etc/init.d/ssh start

Then on your end, pipe ssh , gzip, pv and dd together to grab the compressed disk image with progress monitoring:

ssh root@ "dd if=/dev/sda | gzip -1 -" | pv | dd of=linode-image.gz

Success.

ProxMox VMs reboot when switch is rebooted

I came across an interesting situation where if I rebooted my Ubiquiti UniFi Switch 24 for a firmware upgrade, all my VM hosts would reboot themselves. It turned out to be due to my having enabled HA in ProxMox. The hosts would temporarily lose connectivity to each other and begin to fence themselves off from the cluster. This caused HA to kill the VMs on those hosts. Then once connectivity was restored everything would eventually come back up.

The proper way to fix this would be to have multiple paths for each host to talk to each other, so if one switch goes down the cluster is still able to communicate. In my case, where I only have one switch, the “poor man’s fix” was to simply disable HA altogether during the switch reboot, as outlined here. Then, once the switch is back up, re-enable HA.

On each node, stop the pve-ha-lrm service. Once it’s stopped on all hosts, stop the pve-ha-crm service. Then reboot your switch.

After the switch is back up, start pve-ha-lrm on each node first, then pve-ha-crm (if it doesn’t auto start itself) to re-enable HA.

Using ProxMox as a NAS

Lately I’ve been very unhappy with latest FreeBSD causing reboots randomly during disk resilvering. I simply cannot tolerate random reboots of my fileserver. This fact combined with the migration of OpenZFS to the ZFS on Linux code base means it’s time for me to move from a FreeBSD based ZFS NAS to a Linux-based one.

Sadly there aren’t many options in this space yet. I wanted something where basic tasks were taken care of, like what FreeNAS does, but also supports ZFS. The solution I settled on was ProxMox, which is a hypervisor, but it also has ZFS support.

The biggest drawback of ProxMox vs FreeNAS is the GUI. There are some disk-related GUI options in ProxMox, but mostly it’s VM focused. Thus, I had to configure my required services via CLI.

Following are the settings I used when I configured my NAS to run ProxMox.

Repo setup

If you don’t want to pay for a proxmox license, change the PVE enterprise repository to the free version by modifying /etc/apt/sources.list.d/pve-enterprise.list to the following:

deb http://download.proxmox.com/debian/pve buster pve-no-subscription

Then run at apt update & apt upgrade.

Email alerts

Postfix configuration

Edit /etc/postfix/main.cf and tweak your mail server config as needed (relayhost). Restart postfix after editing:

systemctl restart postfix

Forward mail for root to your own email

Edit /etc/aliases and add an alias for root to forward to your desired e-mail address. Add this line:

root: YOUR_EMAIL_ADDRESS

Afterward run:

newaliases

ZFS configuration

Pool Import

Import the pool using the zpool import -f command (-f to force import despite having been active in a different system)

zpool import -f  

By default they’re imported into the main root directory (/). If you want to have them go to /mnt, use the zfs set mountpoint command:

zfs set mountpoint=/mnt/ 

Monitoring

Install and configure zfs-zed

apt install zfs-zed

Modify /etc/zfs/zed.d/zed.rc and uncomment ZED_EMAIL_ADDR, ZED_EMAIL_PROG, and ZED_EMAIL_OPTS. Edit them to suit your needs (default values work fine, they just need to be uncommented.) Optionally uncomment ZED_NOTIFY_VERBOSE and change to 1 if you want more verbose notices like what FreeNAS does (scrub notifications, for example.)

After modifying /etc/zfs/zed.d/zed.rc, restart zed:

systemctl restart zfs-zed

Scrubbing

By default ProxMox scrubs each of your datasets on the second Sunday of every month. This cron job is located in /etc/cron.d/zfsutils-linux. Modify to your liking.

Snapshot & Replication

There are many different snapshot & replication scripts out there. I landed on Sanoid. Thanks to SvennD for helping me grasp how to get it working.

Install sanoid :

#Install necessary packages
apt install debhelper libcapture-tiny-perl libconfig-inifiles-perl pv lzop mbuffer git
# Clone repo, build deb, install
git clone https://github.com/jimsalterjrs/sanoid.git cd sanoid
ln -s packages/debian . 
dpkg-buildpackage -uc -us 
apt install ../sanoid_*_all.deb 

Snapshots

Edit /etc/sanoid/sanoid.conf with a backup and retention schedule for each of your datasets. Example taken from sanoid documentation:

[data/home]
	use_template = production
[data/images]
	use_template = production
	recursive = yes
	process_children_only = yes
[data/images/win7]
	hourly = 4

#############################
# templates below this line #
#############################

[template_production]
        frequently = 0
        hourly = 36
        daily = 30
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

Once sanoid.conf is to your liking, create a cron job to launch sanoid every hour (sanoid determines whether any action is needed when executed.)

crontab -e
#Add this line, save and exit
0 * * * * /usr/sbin/sanoid --cron

Replication

syncoid (part of sanoid) easily replicates snapshots. The syntax is pretty straightforward:

syncoid <source> <destination> -r 
#-r means recursive and is optional

For remote locations specify a username@ before the ip/hostname, then a colon and the dataset name, for example:

syncoid root@10.0.0.1:sourceDataset localDataset -r

You can even have a remote source go to a different remote destination, which is pretty neat.

Other syncoid options of interest:

--debug  #for seeing everything happening, useful for logging
--exclude #Regular expression to exclude certain datasets
--src-bwlimit #Set an upload limit so you don't saturate your bandwidth
--quiet #don't output anything unless it's an error

Automate synchronization by placing the same syncoid command into a cronjob:

0 */4 * * * /usr/sbin/syncoid --exclude=bigdataset1 --source-bwlimit=1M --recursive pool/data root@192.168.1.100:pool/data
#if you don't want status emails when the cron job runs, add --quiet

NFS

Install the nfs-kernel-server package and specify your NFS exports in /etc/exports.

apt install nfs-kernel-server portmap

Example /etc/exports :

/mnt/example/DIR1 192.168.0.0/16(rw,sync,all_squash,anonuid=0,anongid=0)

Restart nfs-server after modifying your exports:

systemctl restart nfs-server

Samba

Install samba, configure /etc/samba/smb.conf, and add users.

apt install samba
systemctl enable smbd

/etc/samba/smb.conf syntax is fairly straightforward. See the samba documentation for more information. Example share configuration:

[exampleshare]
comment = Example share
path = /mnt/example
valid users = user1 user2
writable = yes

Add users to the system itself with the adduser command:

adduser user1

Add those same users to samba with the smbpasswd -a command. Example:

smbpasswd -a user1

Restart samba after making changes:

systemctl restart smbd

SMART monitoring

Taken from https://pve.proxmox.com/wiki/Disk_Health_Monitoring:

By default, smartmontools daemon smartd is active and enabled, and scans the disks under /dev/sdX and /dev/hdX every 30 minutes for errors and warnings, and sends an e-mail to root if it detects a problem. 

Edit the file /etc/smartd.conf to suit your needs. You can specify/exclude devices, smart attributes, etc there. See here for more information. Restart the smartd service after modifying.

UPS monitoring

apc-upsd was easiest for me to configure, so I went with it. Thanks to this blog for giving me the information to get started.

First, install apcupsd:

apt install apcupsd apcupsd-doc

As soon as it was installed my console kept getting spammed about IRQ issues. To stop these errors I stopped the apcupsd daemon:

 systemctl stop apcupsd

Now modify /etc/apcupsd/apcupssd.conf to suit your needs. The section I added for my CyberPower OR2200LCDRT2U was simply:

UPSTYPE usb
DEVICE

Then modify /etc/default/apcupsd to specify it’s configured:

#/etc/default/apcupsd
ISCONFIGURED=yes

After configuring, you can restart the apcupsd service

systemctl start apcupsd

To check the status of your UPS, you can run the apcaccess status command:

/sbin/apcaccess status

Log monitoring

Install Logwatch to monitor system events. Here is a good primer on all of Logwatch’s options.

apt install logwatch

Modify /usr/share/logwatch/default.conf/logwatch.conf to suit your needs. By default it runs daily (defined in /etc/cron.daily/00logwatch). I added the following lines for my config to filter out unwanted information:

Service = "-zz-disk_space"
Service = "-postfix"
Service = "vsmartd"
Service = "-zz-lm_sensors"

Manually run logwatch to get a preview of what you’ll see:

logwatch --range today --mailto YOUR_EMAIL_ADDRESS

UPDATE 8/29/2020
I discovered additional tweaking to logwatch to get it exactly how I like it (thanks to this post and this one at serverfault.)

Defaults for monitored services are located in /usr/share/logwatch/default.conf/services/

You can copy this default file to /etc/logwatch/conf/services/<filename.conf> and then modify the service as needed. In my case I wanted to ignore logins for a particular user from a particular machine. This can be done by copying & editing sshd.conf and adding the following:

# Ignore these hosts
*Remove = 192.168.100.1
*Remove = X.Y.123.123
# Ignore these usernames
*Remove = testuser
# Ignore other noise. Note that we need to escape the ()
*Remove = "pam_succeed_if\(sshd:auth\): error retrieving information about user netscan.*

Troubleshooting

ZFS-ZED not sending email

If ZED isn’t sending emails it’s likely due to an error in the config. For some reason default values still need to be uncommented for zed to work, even if left unaltered. Thanks to this post for the info.

Samba share access denied

If you get access denied when trying to write to a SMB share, double check the file permissions on the server level. Execute chmod / chown as appropriate. Example:

chown user1 -R /mnt/example/user1

mountpoint check script

I have a few NFS mounts that I want to be working at all times. If there is a power outage, sometimes NFS clients come up before the NFS server does, and thus the mounts are not there. I wrote a quick little bash script to fix this utilizing the mountpoint command.

Behold (Change the mountpoint(s) to the one you want to monitor.)

#!/bin/bash
#Simple bash script to check mount points and re-mount them if they're not mounted
#Authored by Nick Jepspon 8/11/2019

### Variables ###
# Changes these to suit your needs

MOUNTPOINTS=(/mnt/1 /mnt/2 /mnt/3)  #space separated list of mountpoints to monitor

### End Variables ###


for mount in $MOUNTPOINTS 
do
    if  ! mountpoint -q $mount
    then
        echo "$mount is not mounted, attempting to mount."
        mount $mount
    fi
    #otherwise do nothing
done

I have this set as a cronjob running every 5 minutes

*/5 * * * * /mountcheck.sh

Now the system will continually try to mount the specified folder if it isn’t already mounted.

OPENVPN site to site vpn between USG and openwrt

A new firewall means a new site to site VPN configuration. My current iteration of this is a USG Pro 4 serving as an OpenVPN server and a Netgear Nighthawk R8000 serving as a VPN client joining their two networks together.

First, I had to wrap my head around some concepts. To set this up you need three sets of certificates and a DH file:

  • CA: To generate and validate certificates
  • Server: To encrypt/decrypt traffic for the Server
  • Client: To encrypt/decrypt traffic from the Client
  • DH: Not a certificate but still needed by the server for encryption

The server and client will also need openvpn configurations containing matching encryption/hashing methods, CA public key, and protocol/port settings.

Generate certificates

If you already have PKI infrastructure in place you simply need to generate two sets of keys and a DH file for the server/client to use. If you don’t, the easy-rsa project comes to the rescue. This tutorial uses easy-rsa version 3.

I didn’t want to generate the certificates on my firewall so I picked a Debian system to do the certificate generation. First, install easy-rsa:

sudo apt install easy-rsa

In Debian easy-rsa is installed to /usr/share/easy-rsa/

Optional: Set desired variables by moving /usr/share/easy-rsa/vars.example to /usr/share/easy-rsa/vars and un-commenting / editing to suit your needs (in my case I like to extend the life of my certificates beyond two years.)

Next, create your PKI and generate CA certificates:

/usr/share/easy-rsa/easyrsa init-pki
/usr/share/easy-rsa/easyrsa build-ca

Now create your DH file. Grab a cup of coffee for this one, it can take up to ten minutes to complete:

/usr/share/easy-rsa/easyrsa gen-dh

Then create your server & client certificates. For this guide we are calling the server ovpn-server and the client ovpn-client

#For the server
/usr/share/easy-rsa/easyrsa gen-req ovpn-server nopass 
/usr/share/easy-rsa/easyrsa sign-req server ovpn-server

#For the client
/usr/share/easy-rsa/easyrsa gen-req ovpn-client nopass
/usr/share/easy-rsa/easyrsa sign-req client ovpn-client

You will be asked for a common name. Remember what you put here, you will need it later. If you just hit enter and accept the default the common name will match what was passed in the above commands (ovpn-server for the server certificate and ovpn-client for the client certificate.)

Lastly, copy these files to their respective hosts:

USG Server: CA, Server key & cert, and DH file. (substitute with IP of your device)

scp pki/dh.pem pki/ca.crt pki/private/ovpn-server.key  pki/issued/ovpn-server.crt admin@IP_OF_YOUR_USG:/config/auth/

OpenWRT Client: Client key & cert, and CA cert:

scp pki/private/ovpn-client.key pki/issued/ovpn-client.crt pki/ca.crt root@IP_OF_YOUR_OPENWRT:/etc/config/

USG: VPN Server

Documentation for the EdgeRouter is much easier to find than for the USG. Since they use the same operating system I based this off of this guide from Logan Marchione for the EdgeRouter. SSH into your USG and issue the following, substituting the $variables with the values you desire for your network.

Explanation of variables:

VPN_SUBNET: Used for VPN communication. Must be different from both server and client subnets.
SERVER_SUBNET: Subnet on server side you wish to pass to client network
VPN_PORT: Change this to desired listening port for the OpenVPN server
REMOTE_SUBNET: Subnet on client side you wish to pass to server network
REMOTE_NETMASK: Netmask of client subnet
REMOTE_VPN_IP: Static IP you wish to give the client on the VPN subnet.
REMOTE_CERT_NAME: Common name given to client certificate generated previously.

Replace $variables below before pasting into USG terminal:

configure
#OpenVPN config
set interfaces openvpn vtun0
set interfaces openvpn vtun0 description "OpenVPN Site to Site"
set interfaces openvpn vtun0 mode server
set interfaces openvpn vtun0 encryption aes256
set interfaces openvpn vtun0 hash sha256
set interfaces openvpn vtun0 server subnet $VPN_SUBNET
set interfaces openvpn vtun0 server push-route $SERVER_SUBNET
set interfaces openvpn vtun0 tls ca-cert-file /config/auth/ca.crt
set interfaces openvpn vtun0 tls cert-file /config/auth/ovpn-client.crt
set interfaces openvpn vtun0 tls key-file /config/auth/ovpn-client.key
set interfaces openvpn vtun0 tls dh-file /config/auth/dh.pem
set interfaces openvpn vtun0 openvpn-option "--port $VPN_PORT"
set interfaces openvpn vtun0 openvpn-option --tls-server
set interfaces openvpn vtun0 openvpn-option "--comp-lzo yes"
set interfaces openvpn vtun0 openvpn-option --persist-key
set interfaces openvpn vtun0 openvpn-option --persist-tun
set interfaces openvpn vtun0 openvpn-option "--keepalive 10 120"
set interfaces openvpn vtun0 openvpn-option "--user nobody"
set interfaces openvpn vtun0 openvpn-option "--group nogroup"
set interfaces openvpn vtun0 openvpn-option "--route $REMOTE_SUBNET $REMOTE_NETMASK $REMOTE_VPN_IP"
set interfaces openvpn vtun0 server client $REMOTE_CERT_NAME ip $REMOTE_VPN_IP
set interfaces openvpn vtun0 server client $REMOTE_CERT_NAME subnet $REMOTE_SUBNET $REMOTE_NETMASK

#Firewall config
set firewall name WAN_LOCAL rule 50 action accept
set firewall name WAN_LOCAL rule 50 description "OpenVPN Site to Site"
set firewall name WAN_LOCAL rule 50 destination port $VPN_PORT
set firewall name WAN_LOCAL rule 50 log enable
set firewall name WAN_LOCAL rule 50 protocol udp
commit

If the code above commits successfully, the next step is to add the config to config.gateway.json. The USG’s config is managed by its Unifi controller, so for any of the changes made above to stick we must copy them to /usr/lib/unifi/data/sites/default/config.gateway.json on the controller (create the file if it doesn’t already exist.)

A quick shortcut is to run the mca-ctrl -t dump-cfg command, then parse out the parts you want to go into config.gateway.json as outlined in the UniFi documentation. For the lazy, here is the config.gateway.json generated from the above commands (be sure to modify $variables to suit your needs.)

{
  "firewall": {
    "WAN_LOCAL": {
      "rule": {
        "50": {
          "action": "accept",
          "description": "OpenVPN Site to Site",
          "destination": {
            "port": "$VPN_PORT"
          },
          "log": "enable",
          "protocol": "udp"
        }
      }
    }
  },
  "interfaces": {
    "openvpn": {
      "vtun0": {
        "description": "OpenVPN Site to Site",
        "encryption": "aes256",
        "hash": "sha256",
        "mode": "server",
        "openvpn-option": [
          "--port $VPN_PORT",
          "--tls-server",
          "--comp-lzo yes",
          "--persist-key",
          "--persist-tun",
          "--keepalive 10 120",
          "--user nobody",
          "--group nogroup",
          "--route $REMOTE_SUBNET $REMOTE_NETMASK $REMOTE_VPN_IP"
        ],
        "server": {
          "client": {
            "$REMOTE_CERT_NAME": {
              "ip": "$REMOTE_VPN_IP",
              "subnet": [
                "$REMOTE_SUBNET $REMOTE_NETMASK"
              ]
            }
          },
          "push-route": [
            "$SERVER_SUBNET"
          ],
          "subnet": "$VPN_SUBNET"
        },
        "tls": {
          "ca-cert-file": "/config/auth/ca.crt",
          "cert-file": "/config/auth/ovpn-client.crt",
          "dh-file": "/config/auth/dh.pem",
          "key-file": "/config/auth/ovpn-client.key"
        }
      }
    }
  }
}

OpenWRT: VPN client

Configuration is doable from the GUI but I found much easier with the command line. I got a lot of the configuration from this gist from braian87b

Install openvpn and the luci-app-openvpn packages:

opkg update
opkg install openvpn luci-app-openvpn

OpenVPN config files are located in /etc/config. In addition to the certificates we copied there earlier, we will also want to copy the openvpn client configuration to that directory.

Here is the config file matching the configuration generated above. Again, remember to replace $variables with your config matching what was generated above. Save it to /etc/config/site2site.conf

#/etc/config/site2site.conf
client
dev tun
proto udp
remote $DNS_OR_IP_OF_USG_OPENVPN_SERVER $VPN_PORT
cipher AES-256-CBC
auth SHA256
resolv-retry infinite
nobind
comp-lzo yes
persist-key
persist-tun
verb 3
ca /etc/config/ca.crt
cert /etc/config/ovpn-client.crt
key /etc/config/ovpn-client.key
remote-cert-tls server

With the openvpn config file, client certificate & key, and CA certificate we are ready to configure firewall rules and instruct the router to initiate the VPN connection.

# a new OpenVPN instance:
uci set openvpn.site2site=openvpn
uci set openvpn.site2site.enabled='1'
uci set openvpn.site2site.config='/etc/config/site2site.conf'

# a new network interface for tun:
uci set network.site2sitevpn=interface
uci set network.site2sitevpn.proto='none' #dhcp #none
uci set network.site2sitevpn.ifname='tun0'

# a new firewall zone (for VPN):
uci add firewall zone
uci set firewall.@zone[-1].name='vpn'
uci set firewall.@zone[-1].input='ACCEPT'
uci set firewall.@zone[-1].output='ACCEPT'
uci set firewall.@zone[-1].forward='ACCEPT'
uci set firewall.@zone[-1].masq='1'
uci set firewall.@zone[-1].mtu_fix='1'
uci add_list firewall.@zone[-1].network='site2sitevpn'

# enable forwarding from LAN to VPN:
uci add firewall forwarding
uci set firewall.@forwarding[-1].src='lan'
uci set firewall.@forwarding[-1].dest='vpn'

# Finally, you should commit UCI changes:
uci commit

Monitor VPN connection progress by using logread. If all goes well you will see the successful connection established message. If not, you’ll be able to get an idea of what’s wrong.

logread -f

If all goes well you’ll now have a bidirectional VPN between your two sites; however, traffic from the server’s subnet going directly to the client router itself (the OpenWRT device’s IP) will be considered as coming from the WAN interface and will be blocked. If you need to access the OpenWRT device directly from the USG’s subnet, you’ll need to add a firewall rule allowing it to do so:

uci add firewall rule
uci set firewall.@rule[-1]=rule
uci set firewall.@rule[-1].enabled='1'
uci set firewall.@rule[-1].target='ACCEPT'
uci set firewall.@rule[-1].src='wan'
uci set firewall.@rule[-1].name='Allow VPN to access router'
uci set firewall.@rule[-1].src_ip='$SERVER_SUBNET'
uci set firewall.@rule[-1].dest_ip='$INTERNALL_IP_OF_OPENWRT_ROUTER'
uci commit

Troubleshooting

One-sided VPN

I fought for some time with the fact that the VPN was established, but only traffic going from the Client network to the Server network would work. Traffic from the OpenVPN server subnet to the OpenVPN client subnet would simply hang and not work.

I finally found on the ubiquiti forums that this is due to default OpenVPN behavior of restricting traffic from the server subnet to the client subnet (see the OpenVPN how-to for more information.) The solution is to add lines in the server config informing it of the client network and to allow traffic to it. Below is an example USG config allowing informing it of remote subnet 192.168.230/24 and assigning the Client an IP of 10.0.76.253:

set interfaces openvpn vtun5 server client client1 ip 10.0.76.253
set interfaces openvpn vtun5 server client client1 subnet 192.168.230.0/24

VPN status stays “stopped” in OpenWRT

The best way to troubleshoot is to look at the logs in realtime. SSH to the OpenWRT router and run the command “logread -f” then try to initiate the connection again. The errors there will point you to the problem.