Tag Archives: linux

proxmox openvswitch bond

Recently I had to switch my Proxmox server which was using Linux bonds to using openvswitch. These are my notes:

Install openvswitch:

apt install openvswitch-switch

Configure openvswitch to bond interfaces and use VLANs using https://pve.proxmox.com/wiki/Open_vSwitch as an example:

allow-vmbr0 bond0
iface bond0 inet manual
	ovs_bonds enp4s0f0 eno1
	ovs_type OVSBond
	ovs_bridge vmbr0
	ovs_options bond_mode=active-backup

auto lo
iface lo inet loopback

iface eno1 inet manual

iface enp4s0f0 inet manual

allow-ovs vmbr0
iface vmbr0 inet manual
	ovs_type OVSBridge
	ovs_ports bond0 vlan50 vlan10

#Proxmox communication
allow-vmbr0 vlan50
iface vlan50 inet static
  ovs_type OVSIntPort
  ovs_bridge vmbr0
  ovs_options tag=50
  ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
  address 10.0.50.2
  netmask 255.255.255.0
  gateway 10.0.50.1

#Storage network
allow-vmbr0 vlan10
iface vlan10 inet static
  ovs_type OVSIntPort
  ovs_bridge vmbr0
  ovs_options tag=10
  ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
  address 192.168.10.2
  netmask 255.255.255.0

List active interface:

ovs-appctl bond/show bond0

Update 3/14/2020

I realized that openvswitch won’t fail back over to the original slave once it comes back online. I couldn’t for the life of me find the equivalent of bond-primary syntax for openvswitch; however I did find this command:

ovs-appctl list-commands

which reveals this command:

bond/set-active-slave port slave

So you can manually fallback using this command:

ovs-appctl bond/set-active-slave bond0 enp4s0f1

chroot into encrypted drive

I foolishly went browsing in my EFI partition on my Ubuntu (Elementary OS) laptop and decided to delete the Ubuntu folder. This made my laptop unbootable. This was my procedure to bring it back to life:

Boot into Ubuntu Live CD / USB environment

Decrypt LUKS encrypted drive (https://blog.sleeplessbeastie.eu/2015/11/16/how-to-mount-encrypted-lvm-logical-volume/)

sudo fdisk -l
#Determine encrypted partition is /dev/nvme0n1p3 because it's the largest
sudo cryptsetup luksOpen /dev/nvme0n1p3 encrypted_device
sudo vgchange -ay

Mount encrypted drive & chroot (https://askubuntu.com/questions/831216/how-can-i-reinstall-grub-to-the-efi-partition)

sudo mount /dev/elementary-vg/root /mnt
sudo mount /dev/nvme0n1p2 /mnt/boot/
sudo mount /dev/nvme0n1p1 /mnt/boot/efi
for i in /dev /dev/pts /proc /sys /run; do sudo mount -B $i /mnt$i; done
sudo chroot /mnt
sudo grub-install
update-grub  

#end chroot & unmount
exit
cd
for i in /mnt/dev/pts /mnt/dev  /mnt/proc /mnt/sys /mnt/run /mnt/boot/efi /mnt/boot /mnt; do sudo umount $i;  done

Add static route in CentOS7

I recently began a project of segmenting my LAN into various VLANs. One issue that cropped up had me banging my head against the wall for days. I had a particular VM that would use OpenVPN to a private VPN provider. I had that same system sending things to a file share via transmission-daemon.

Pre-subnet move everything worked, but once I moved my file server to a different subnet suddenly this VM could not access it while on the VPN. Transmission would hang for some time before finally saying

transmission-daemon.service: Failed with result 'timeout'.

The problem was since my file server was on a different subnet, it was trying to route traffic to it via the default gateway, which in this case was the VPN provider. I had to add a specific route to tell the server to use my LAN network instead of the VPN network in order to restore connectivity to the file server (thanks to this site for the primer.)

I had to create a file /etc/sysconfig/network-scripts/route-eth0 and give it the following line:

192.168.2.0/24 via 192.168.1.1 dev eth0

This instructed my VM to get to the 192.168.2 network via the 192.168.1.1 gateway on eth0. Restart the network service (or reboot) and success!

Java IDRAC 6 Ubuntu 18.04 setup

I recently acquired a Dell PowerEdge R610 and had a hard time getting its iDRAC to work properly on my ElementaryOS setup (Ubuntu 18.04 derivative.) I had two problems: Connection failed error and keyboard not working.

Connection Failed

After much searching I finally found this post:

The post explains the problem is with the security settings of Java 8+ preventing the connection. I didn’t know where my security file was so I first ran a quick find command to find it:

sudo find / -name java.security

In my case it was located in /etc/java-11-openjdk/security/java.security

The last step was to remove RC4 from the list of blacklisted ciphers, as this is the cause of the problem.

sudo vim /etc/java-11-openjdk/security/java.security

#change jdk.tls.disabledAlgorithms=SSLv3, RC4, DES, MD5withRSA, DH keySize < 1024, \ to be:
jdk.tls.disabledAlgorithms=SSLv3, DES, MD5withRSA, DH keySize < 1024, \

Save and exit, and iDRAC will now load!

Except now…

Keyboard doesn’t work

Update 2022-04-13 I recently had an issue where the keyboard didn’t work despite having Java 8. I fixed it by going to Tools and checking “Pass all keystrokes to server” within the jvm window.

My system was defaulting to using JRE 11, which apparently causes the keyboard to not function at all. I found on this reddit post that you really need an older version of Java. To do so on Ubuntu 18.04 you need to install it along with the icedtea plugin and run update-alternatives

sudo apt install openjdk-8-jre icedtea-8-plugin

Edit /etc/java-8-openjdk/security/java.security and remove the restriction on the RC4 algorhythm. Then configure the system to run java 8:

sudo update-alternatives --config java
#select java 8

Lastly, configure the icedtea plugin to run Java 8 instead of 11, because for some reason this plugin ignores the system java settings. Launch the IcedTea Web Control panel (find it in your system menu) and then Navigate to JVM settings. Enter /usr/ in the section “Set JVM for IcedTea-Web – working best with OpenJDK” section. Then hit Apply / OK

Phew. FINALLY you should be able to use iDRAC 6 on your modern Ubuntu system.

Find local accounts with shell login

I spent some time building this query so I thought I’d write it down. To find out which accounts in /etc/passwd with a UID greater than 500 have anything other than nologin or false set as their login shell, combine grep & awk to print them out:

grep -v 'nologin\|false' /etc/passwd | awk -F: '{if($3>500)print}'

grep -v: exclude results
awk -F: use colon as a field separator
{if($3>500)print: if the third column using the above colon separator is greater than 500, print the line

If running this command via salt cmd.run, you must use double quotes and escape the $ to get it to work properly:

salt <hostname> cmd.run "grep -v 'nologin\|false' /etc/passwd | awk -F: '{if(\$3>500)print}'"

Primary VGA passthrough in ProxMox

I recently decided to amplify my VFIO experience by experimenting with passing my primary display adapter to a VM in proxmox. Previously I had just run tasksel on the proxmox host itself to install a GUI. I wanted better separation from the server side of proxmox and the client side. I also wanted to be able to distro-hop while maintaining the proxmox backend.

Initially I tried following my guide for passing through a secondary graphics card but ran into a snag. It did not work with my primary card and kept outputting these errors:

device vfio-pci,host=09:00.0,id=hostdev0,bus=pci.4,addr=0x0: Failed to mmap 0000:09:00.0 BAR 1. Performance may be slow

After much digging I finally found this post which explained I needed to unbind a few things for it to work properly:

echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind

After more searching I found this post on reddit which had a nifty script for automating this when VM startup is desired. I tweaked it a bit to suit my needs.

Find your IDs for GPU by doing lspci and looking for your adapter. Find the IDs by running lspci -n -s <GPU location discovered with lspci>. Lastly VMID is the promxox ID for the VM you wish to start.

#!/bin/sh
#Script to launch Linux desktop
#Adapted from from https://www.reddit.com/r/VFIO/comments/abfjs8/cant_seem_to_get_vfio_working_with_qemu/?utm_medium=android_app&utm_source=share

GPU=09:00
GPU_ID="10de 1c82"
GPU_AUDIO="10de 0fb9"
VMID=116

# Remove the framebuffer and console
echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind

# Unload the Kernel Modules that use the GPU
modprobe -r nvidia_drm
modprobe -r nvidia_modeset
modprobe -r nvidia
modprobe -r snd_hda_intel

# Load the vfio kernel module
modprobe vfio
modprobe vfio_iommu_type1
modprobe vfio-pci

#Assign card to vfio-pci
echo -n "${GPU_ID}" > /sys/bus/pci/drivers/vfio-pci/new_id
echo -n "${GPU_AUDIO}" > /sys/bus/pci/drivers/vfio-pci/new_id

#Start desktop
sudo qm start $VMID

#Wait here until the VM is turned off
while [ "$(qm status $VMID)" != "status: stopped" ] 
do
 sleep 5
done

#Reassign primary graphics card back to host
echo -n "0000:${GPU}.0" > /sys/bus/pci/drivers/vfio-pci/unbind
echo -n "0000:${GPU}.1" > /sys/bus/pci/drivers/vfio-pci/unbind
echo -n "${GPU_ID}" > /sys/bus/pci/drivers/vfio-pci/remove_id
echo -n "${GPU_AUDIO}" > /sys/bus/pci/drivers/vfio-pci/remove_id
rmmod vfio-pci
modprobe nvidia
modprobe nvidia_drm
modprobe nvidia_modeset
modprobe snd_hda_intel
sleep 1
echo -n "0000:${GPU}.0" > /sys/bus/pci/drivers/nvidia/bind
echo -n "0000:${GPU}.1" > /sys/bus/pci/drivers/snd_hda_intel/bind
sleep 1
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/bind
echo 1 > /sys/class/vtconsole/vtcon0/bind
echo 1 > /sys/class/vtconsole/vtcon1/bind

With my primary adapter passed through I realized I also want other things passed through, mainly USB. I tried Proxmox’s USB device passthrough options but it doesn’t work well with USB audio (stutters and choppy.) I wanted to pass through my whole USB controller to the VM.

This didn’t work as well as I had planned due to IOMMU groups. A great explanation of IOMMU groups can be found here. I had to figure out which of my USB controllers were in which IOMMU group to see if I could pass the whole thing through or not (some of them were in the same IOMMU group as SATA & network controllers, which I did not want to pass through to the VM.)

Fortunately I was able to discover which USB controllers I could safely pass through first by running lspci to see the device ID, then running find to see which IOMMU group it was in, then checking against lspci to see what other devices were in that group. The whole group comes over together when you pass through to a VM.

First determine the IDs of your USB controllers

lspci | grep USB

01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ba (rev 02)
08:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03)
0a:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 145c
43:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 145c

Next get which IOMMU group these devices belong to

find /sys/kernel/iommu_groups/ -type l|sort -h|grep '01:00.0\|08:00.0\|0a:00.3\|43:00.3'

/sys/kernel/iommu_groups/14/devices/0000:01:00.0
/sys/kernel/iommu_groups/15/devices/0000:08:00.0
/sys/kernel/iommu_groups/19/devices/0000:0a:00.3
/sys/kernel/iommu_groups/37/devices/0000:43:00.3

Then see what other devices use the same IOMMU group (the group is the number after /sys/kernel/iommu_groups/)

find /sys/kernel/iommu_groups/ -type l|sort -h | grep '/14\|/15\|/19\|/37'

/sys/kernel/iommu_groups/14/devices/0000:01:00.0
/sys/kernel/iommu_groups/14/devices/0000:01:00.1
/sys/kernel/iommu_groups/14/devices/0000:01:00.2
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/14/devices/0000:02:04.0
/sys/kernel/iommu_groups/14/devices/0000:02:05.0
/sys/kernel/iommu_groups/14/devices/0000:02:06.0
/sys/kernel/iommu_groups/14/devices/0000:02:07.0
/sys/kernel/iommu_groups/14/devices/0000:04:00.0
/sys/kernel/iommu_groups/14/devices/0000:05:00.0
/sys/kernel/iommu_groups/14/devices/0000:06:00.0
/sys/kernel/iommu_groups/15/devices/0000:08:00.0
/sys/kernel/iommu_groups/19/devices/0000:0a:00.3
/sys/kernel/iommu_groups/37/devices/0000:43:00.3

As you can see one of my USB controllers (01:00.0) has a whole bunch of stuff in its IOMMU group, so I don’t want to use it lest I bring all those other things into the VM with it. The other three, though, are isolated in their groups and thus are perfect for passthrough.

In my case I passed through 0a:00.3 & 43:00.3 as 08:00.0 is a PCI card I want passed through to my Windows VM. This passed through about 2/3 of the USB ports on my system to my guest VM.

Transfer linode VM over ssh

I love Linode for their straightforward pricing. I can use them for temporary infrastructure and not have to worry about getting overcharged. When it comes time to transfer infrastructure back, the process is fairly straightforward. In my case I wanted to keep a disk image of my Linode VM for future use.

The linode documentation is very good. I used their copy an image over ssh article combined with their rescue and rebuild article sprinkled with a bit of gzip compression and use of pv to grab my linode image locally, complete with a progress bar.

First, boot your linode into recovery mode via dashboard / Linodes / <name of your linode>, then click on Rescue tab, map your drives as needed.

Launch console (top right) to get into the recovery shell. In my case I wanted to SSH into my linode to grab the image, so I set a password and started the ssh service:

passwd
/etc/init.d/ssh start

Then on your end, pipe ssh , gzip, pv and dd together to grab the compressed disk image with progress monitoring:

ssh root@ "dd if=/dev/sda | gzip -1 -" | pv | dd of=linode-image.gz

Success.

Using ProxMox as a NAS

Lately I’ve been very unhappy with latest FreeBSD causing reboots randomly during disk resilvering. I simply cannot tolerate random reboots of my fileserver. This fact combined with the migration of OpenZFS to the ZFS on Linux code base means it’s time for me to move from a FreeBSD based ZFS NAS to a Linux-based one.

Sadly there aren’t many options in this space yet. I wanted something where basic tasks were taken care of, like what FreeNAS does, but also supports ZFS. The solution I settled on was ProxMox, which is a hypervisor, but it also has ZFS support.

The biggest drawback of ProxMox vs FreeNAS is the GUI. There are some disk-related GUI options in ProxMox, but mostly it’s VM focused. Thus, I had to configure my required services via CLI.

Following are the settings I used when I configured my NAS to run ProxMox.

Repo setup

If you don’t want to pay for a proxmox license, change the PVE enterprise repository to the free version by modifying /etc/apt/sources.list.d/pve-enterprise.list to the following:

deb http://download.proxmox.com/debian/pve buster pve-no-subscription

Then run at apt update & apt upgrade.

Email alerts

Postfix configuration

Edit /etc/postfix/main.cf and tweak your mail server config as needed (relayhost). Restart postfix after editing:

systemctl restart postfix

Forward mail for root to your own email

Edit /etc/aliases and add an alias for root to forward to your desired e-mail address. Add this line:

root: YOUR_EMAIL_ADDRESS

Afterward run:

newaliases

ZFS configuration

Pool Import

Import the pool using the zpool import -f command (-f to force import despite having been active in a different system)

zpool import -f  

By default they’re imported into the main root directory (/). If you want to have them go to /mnt, use the zfs set mountpoint command:

zfs set mountpoint=/mnt/ 

Monitoring

Install and configure zfs-zed

apt install zfs-zed

Modify /etc/zfs/zed.d/zed.rc and uncomment ZED_EMAIL_ADDR, ZED_EMAIL_PROG, and ZED_EMAIL_OPTS. Edit them to suit your needs (default values work fine, they just need to be uncommented.) Optionally uncomment ZED_NOTIFY_VERBOSE and change to 1 if you want more verbose notices like what FreeNAS does (scrub notifications, for example.)

After modifying /etc/zfs/zed.d/zed.rc, restart zed:

systemctl restart zfs-zed

Scrubbing

By default ProxMox scrubs each of your datasets on the second Sunday of every month. This cron job is located in /etc/cron.d/zfsutils-linux. Modify to your liking.

Snapshot & Replication

There are many different snapshot & replication scripts out there. I landed on Sanoid. Thanks to SvennD for helping me grasp how to get it working.

Install sanoid :

#Install necessary packages
apt install debhelper libcapture-tiny-perl libconfig-inifiles-perl pv lzop mbuffer git
# Clone repo, build deb, install
git clone https://github.com/jimsalterjrs/sanoid.git cd sanoid
ln -s packages/debian . 
dpkg-buildpackage -uc -us 
apt install ../sanoid_*_all.deb 

Snapshots

Edit /etc/sanoid/sanoid.conf with a backup and retention schedule for each of your datasets. Example taken from sanoid documentation:

[data/home]
	use_template = production
[data/images]
	use_template = production
	recursive = yes
	process_children_only = yes
[data/images/win7]
	hourly = 4

#############################
# templates below this line #
#############################

[template_production]
        frequently = 0
        hourly = 36
        daily = 30
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

Once sanoid.conf is to your liking, create a cron job to launch sanoid every hour (sanoid determines whether any action is needed when executed.)

crontab -e
#Add this line, save and exit
0 * * * * /usr/sbin/sanoid --cron

Replication

syncoid (part of sanoid) easily replicates snapshots. The syntax is pretty straightforward:

syncoid <source> <destination> -r 
#-r means recursive and is optional

For remote locations specify a username@ before the ip/hostname, then a colon and the dataset name, for example:

syncoid root@10.0.0.1:sourceDataset localDataset -r

You can even have a remote source go to a different remote destination, which is pretty neat.

Other syncoid options of interest:

--debug  #for seeing everything happening, useful for logging
--exclude #Regular expression to exclude certain datasets
--src-bwlimit #Set an upload limit so you don't saturate your bandwidth
--quiet #don't output anything unless it's an error

Automate synchronization by placing the same syncoid command into a cronjob:

0 */4 * * * /usr/sbin/syncoid --exclude=bigdataset1 --source-bwlimit=1M --recursive pool/data root@192.168.1.100:pool/data
#if you don't want status emails when the cron job runs, add --quiet

NFS

Install the nfs-kernel-server package and specify your NFS exports in /etc/exports.

apt install nfs-kernel-server portmap

Example /etc/exports :

/mnt/example/DIR1 192.168.0.0/16(rw,sync,all_squash,anonuid=0,anongid=0)

Restart nfs-server after modifying your exports:

systemctl restart nfs-server

Samba

Install samba, configure /etc/samba/smb.conf, and add users.

apt install samba
systemctl enable smbd

/etc/samba/smb.conf syntax is fairly straightforward. See the samba documentation for more information. Example share configuration:

[exampleshare]
comment = Example share
path = /mnt/example
valid users = user1 user2
writable = yes

Add users to the system itself with the adduser command:

adduser user1

Add those same users to samba with the smbpasswd -a command. Example:

smbpasswd -a user1

Restart samba after making changes:

systemctl restart smbd

SMART monitoring

Taken from https://pve.proxmox.com/wiki/Disk_Health_Monitoring:

By default, smartmontools daemon smartd is active and enabled, and scans the disks under /dev/sdX and /dev/hdX every 30 minutes for errors and warnings, and sends an e-mail to root if it detects a problem. 

Edit the file /etc/smartd.conf to suit your needs. You can specify/exclude devices, smart attributes, etc there. See here for more information. Restart the smartd service after modifying.

UPS monitoring

apc-upsd was easiest for me to configure, so I went with it. Thanks to this blog for giving me the information to get started.

First, install apcupsd:

apt install apcupsd apcupsd-doc

As soon as it was installed my console kept getting spammed about IRQ issues. To stop these errors I stopped the apcupsd daemon:

 systemctl stop apcupsd

Now modify /etc/apcupsd/apcupssd.conf to suit your needs. The section I added for my CyberPower OR2200LCDRT2U was simply:

UPSTYPE usb
DEVICE

Then modify /etc/default/apcupsd to specify it’s configured:

#/etc/default/apcupsd
ISCONFIGURED=yes

After configuring, you can restart the apcupsd service

systemctl start apcupsd

To check the status of your UPS, you can run the apcaccess status command:

/sbin/apcaccess status

Log monitoring

Install Logwatch to monitor system events. Here is a good primer on all of Logwatch’s options.

apt install logwatch

Modify /usr/share/logwatch/default.conf/logwatch.conf to suit your needs. By default it runs daily (defined in /etc/cron.daily/00logwatch). I added the following lines for my config to filter out unwanted information:

Service = "-zz-disk_space"
Service = "-postfix"
Service = "vsmartd"
Service = "-zz-lm_sensors"

Manually run logwatch to get a preview of what you’ll see:

logwatch --range today --mailto YOUR_EMAIL_ADDRESS

UPDATE 8/29/2020
I discovered additional tweaking to logwatch to get it exactly how I like it (thanks to this post and this one at serverfault.)

Defaults for monitored services are located in /usr/share/logwatch/default.conf/services/

You can copy this default file to /etc/logwatch/conf/services/<filename.conf> and then modify the service as needed. In my case I wanted to ignore logins for a particular user from a particular machine. This can be done by copying & editing sshd.conf and adding the following:

# Ignore these hosts
*Remove = 192.168.100.1
*Remove = X.Y.123.123
# Ignore these usernames
*Remove = testuser
# Ignore other noise. Note that we need to escape the ()
*Remove = "pam_succeed_if\(sshd:auth\): error retrieving information about user netscan.*

Troubleshooting

ZFS-ZED not sending email

If ZED isn’t sending emails it’s likely due to an error in the config. For some reason default values still need to be uncommented for zed to work, even if left unaltered. Thanks to this post for the info.

Samba share access denied

If you get access denied when trying to write to a SMB share, double check the file permissions on the server level. Execute chmod / chown as appropriate. Example:

chown user1 -R /mnt/example/user1

zfs drive removal ‘part of active pool’ fix

Occasionally I will manually offline a disk in my ZFS pool for one reason or another. Annoyingly I will sometimes get this error when I try to online that same disk back into the pool:

cannot online /dev/sda: cannot relabel '/dev/sda': unable to read disk capacity

The fix, thankfully, is fairly simple. Simply run the following command (make double sure you’re doing it on the correct device!)

sudo wipefs -a <DEVICE>

After I ran that command ZFS automatically picked the disk back and resilvered it into the pool.

Thanks to this superuser.com discussion for the advice!