Category Archives: OS

Add NVIDIA GPU to LXC container

April 8, 2025 nicholas Leave a comment

I followed this guide to get NVIDIA drivers working on my Proxmox machine. However when I tried to get them working in my container I couldn’t see how to get nvidia-smi installed. Thankfully this blog had what I needed.

The step I missed was copying & installing the NVIDIA drivers into the container with this flag:

--no-kernel-module

That got me one step closer but I could not spin up open-webui in a container. I kept getting the error

Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]

The fix was to install the NVDIA Container Toolkit:

Configure the production repository:

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Update the packages list from the repository:

sudo apt-get update

Install the NVIDIA Container Toolkit packages:

sudo apt-get install -y nvidia-container-toolkit

An additional hurtle I encountered was this error:

Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: failed to add device rules: unable to find any existing device filters attached to the cgroup: bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation not permitted: unknown

I found here that the fix is to change a line in /etc/nvidia-container-runtime/config.toml. Uncomment and change no-cgroups to true.

no-cgroups = true

Success.

Not working after reboot

I had a working config until I rebooted the host. It turns out that two services need to run on the host:

nvidia-persistenced
nvidia-smi

Configured cron tab to run these on reboot:

/etc/cron.d/nvidia:
@reboot root /usr/bin/nvidia-smi
@reboot root /usr/bin/nvidia-persistenced

Update 2025-05-06

I encountered an error when trying to set up alltalk tts:


nvidia-container-cli: mount error: stat failed: /dev/nvidia-modeset: no such file or directory: unknown

It turns out I needed to expose /dev/nvidia-modeset to the container as well. Thanks to this reddit post for the answer. The complete container passthrough config is now this:

lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 243:* rwm
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file

CLI, OS

Change ceph network

January 10, 2025 nicholas Leave a comment

My notes on changing which network your Proxmox CEPH cluster lives in. In my case I wanted to switch from a 10 gig network to a 40gig network in a different subnet. Source: https://forum.proxmox.com/threads/ceph-changing-public-network.119116

Change network configuration in “ceph.conf”
- Be sure to edit both cluster network and public network
Destroy and recreate monitors (one by one);
Destroy and recreate managers (one by one, leaving the active one for last);
Destroy and recreate metadata servers (one by one, leaving the active one for last;
Restart OSDs (one by one – or more, depending how many OSDs you have in the cluster – so you avoid restarting the hosts);

Get CEPH running on new Proxmox node

pveceph install –repository no-subscription

Move OSDs to new host

Source: https://forum.proxmox.com/threads/move-osd-to-another-node.33965/page-2

Follow a similar procedure above of downing each OSD one by one on the old host. Remove the drives and place them in the new host. Then run the following:

pvscan
ceph-volume lvm activate --all

Troubleshooting

Unable to remove monitor with unknown status

https://forum.proxmox.com/threads/ceph-cant-remove-monitor-with-unknown-status.63613

rm -r /var/lib/ceph/mon/ceph-pve2/

Remove failed host

I had to edit /etc/pve/ceph.conf manually, remove host when it failed. It wouldn’t work in the Proxmox GUI.

Hardware, OS

Docker for old Java-based IPMI systems

January 10, 2025 nicholas Leave a comment

As time goes on it’s getting harder to access the remote console on one of my older systems. It requires an old version of Java that just doesn’t work right on modern operating systems. Fortunately, Docker has come to my rescue.

ipmi-kvm-docker is a simple docker setup that will spin up an old version of firefox with the old version of Java that’s required. It wraps it all nicely in a graphical environment inside the docker container and uses novnc to serve this environment up as a web page. You can configure which port it uses, and then you simply connect to your docker host on that port. You can also specify a volume to mount if you have ISOs or other files you’d like your IPMI environment to see. Brilliant.

I simply ran these commands on my docker system to get it up and running:

docker run -p 8080:8080 -v /your/local/folder:/root/images solarkennedy/ipmi-kvm-docker

Then all I had to do was connect to my docker system on port 8080 in a browser. It worked great. My 10+year old server is still going strong, and easy to reinstall now thanks to this tool.

CLI, OS

Add multiple search domains in CentOS 7

June 8, 2022 nicholas Leave a comment

I needed to add multiple domains to search DNS with on my Cent7 box. It turns out there are two ways to do it. Cent7 uses networkmanager, so you can use the cli tool to add what you want, or you can edit the file directly.

Using nmcli:

sudo nmcli con mod eth0 ipv4.dns-search "domain1.org,domain2.org,domain3.org"

This causes nmcli to add this line to your network interface config file (/etc/sysconfig/network-scripts/ifcfg-eth0 in my case)

DOMAIN="domain1.org domain2.org domain3.org"

After either using nmcli or manually editing your file, simply restart the network service and your search domains now work!

CLI, OS

Restore files from remote borg repository disk image

May 9, 2021 nicholas Leave a comment

My off-site backup involves sending borgbackup archives of VM images to a remote synology server. I recently needed to restore a single file from one of the VM images stored within this borg backup repository on the remote server. My connection to this server is not very fast so I didn’t want to wait to download the entire image file to mount it locally.

My solution was to mount the remote borgbackup repository on my local machine over SSH so I could poke around for and copy the specific file I wanted. This requires the borgbackup binary to be present on the remote machine. Since it’s a synology, I simply copied the standalone binary over.

The restore process was complicated by the fact that the VM disk image is owned by root, so in order to access the file I needed to mount the remote repository as root.

This is the process:

Set BORG_REMOTE_PATH
1. export BORG_REMOTE_PATH=<PATH_TO_BORG_BINARY_ON_REMOTE_SYSTEM>
(Arch Linux): install python-llfuse
Mount repository over SSH:
1. borg mount <USER>@<REMOTE_SYSTEM>:<PATH_TO_REMOTE_BORGBACKUP_REPOSITORY>::<BACKUP_NAME> <MOUNT_FOLDER>
Follow disk image mounting process
1. losetup -Pr -f <PATH_TO_MOUNTED_BORGBACKUP>/<FILENAME_OF_VM_IMAGE>
2. mount -o ro /dev/loop0p2 /mnt/loop0/
Follow reverse to unmount when done:
1. umount /mnt/loop0
2. losetup -d /dev/loop0
3. borg umount <MOUNT_FOLDER>

Success! I was able to restore an individual file within a raw VM image backup on a remote Borgbackup repository using this method.

CLI, Networking, OS

send test syslog messages with nc

October 1, 2020 nicholas Leave a comment

I needed to send some test packets over UDP to make sure connectivity was working. I found this site which outlined how to do it really well

nc -u <IP/hostname> <port>

Then on the next line you can send test messages, then hit CTRL+D when done. In my case I wanted to test sending syslog data, so I did nc -u <hostname> 514, then wrote test messages. the -u specifies UDP and 514 is the syslog port. I was then able to confirm on the other end the message was received. Handy.

Hardware, Networking, OS

KVM with vga passthrough in arch linux

June 13, 2020 nicholas 2 Comments

I’ve once again switched from Proxmox to Arch Linux for my desktop machine. Both use KVM so it’s really just a matter of using the different VM manager syntax (virt-manager vs qm.) I used my notes from my previous stint with Arch, my article on GPU Passthrough in Proxmox as well as a thorough reading of the Arch wiki’s PCI Passthrough article.

Enable IOMMU

Configure GRUB to load the necessary iommu modules at boot. Append amd_iommu=on iommu=pt to the end of GRUB_CMDLINE_LINUX_DEFAULT (change accordingly if you have Intel instead of AMD)

sudo vim /etc/default/grub
...
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 amd_iommu=on iommu=pt"

Run update-grub

sudo update-grub

Reserve GPU for VFIO

Reserve the GPU you wish to pass through to a VM for use with the vfio kernel driver (so the host OS doesn’t interfere with it)

Determine the PCI address of your GPU
1. Run lspci -v and look for your card. Mine was 01:00.0 & 01:00.1. You can omit the part after the decimal to include them both in one go – so in that case it would be 01:00
2. Run lspci -n -s <PCI address from above> to obtain vendor IDs.
  Example :
  lspci -n -s 01:00
  01:00.0 0300: 10de:1b81 (rev a1) 01:00.1 0403: 10de:10f0 (rev a1)
Assign your GPU to vfio driver using the IDs obtained above.
Example using above IDs:
echo "options vfio-pci ids=10de:1b81,10de:10f0" >> /etc/modprobe.d/vfio.conf

Reboot the host to put the kernel / drivers into effect.

Configure virt-manager

Install virt-manager, dnsmasq & libvirtd:

pacman -Sy libvirtd virt-manager dnsmasq
sudo systemctl enable libvirtd
sudo systemctl start libvirtd

Configure Networking

Assuming you’re using network manager for your connections, create a bridge (thanks to ciberciti.biz & the arch wiki for information on how to do so.) Replace interface names with ones corresponding to your machine:

sudo nmcli connection add type bridge ifname br0 stp no
sudo nmcli connection add type bridge-slave ifname enp4s0 master br0 
sudo nmcli connection show
#Make note of the active connection name
sudo nmcli connection down "Wired connection 2" #from above
sudo nmcli connection up bridge-br0

Create a second bridge bound to lo0 for host-only communication. Change IP as desired:

sudo nmcli connection add type bridge ifname br99 stp no ip4 192.168.2.1/24
sudo nmcli connection add type bridge-slave ifname lo master br99
sudo nmcli connection up bridge-br99

Configure VM

Initial configuration

When creating the passthrough VM, make sure chipset is Q35.

Set the CPU model to host-passthrough (type it in, there is no dropdown for it.)

When adding disks / other devices, set the device model to virtio

Add your GPU by going to Add Hardware and finding it under PCI Host Device.

Windows 10 specific tweaks

If your passthrough VM is going to be windows based, some tweaks are required to get the GPU to work properly within the VM.

Ignore MSRs (blue screen fix)

Later versions of Windows 10 instantly bluescreen with kmode_exception_not_handled unless you pass an option to ignore MSRs. Add the kvm ignore_msrs=1 option in /etc/modprobe.d/kvm.conf to do so. Optionally add the report_ignored_msrs=0 option to squelch massive amounts of kernel messages every time an MSR was ignored.

echo "options kvm ignore_msrs=1" >> /etc/modprobe.d/kvm.conf
#Optional - ignore kernel messages from ignored MSRs
echo "options kvm report_ignored_msrs=0" >> /etc/modprobe.d/kvm.conf

Reboot to make those changes take effect.

NVIDIA Code 43 workaround

Use the virsh edit command to make some tweaks to the VM configuration. We need to hide the fact that this is a VM otherwise the GPU drivers will not load and will throw Error 43. We need to add a vendor_id in the hyperv section, and create a kvm section enabling hidden state, which hides certain CPU flags that the drivers use to detect if they’re in a VM or not.

sudo virsh edit <VM_NAME>

<features>
	<hyperv>
		...
		<vendor_id state='on' value='1234567890ab'/>
		...
	</hyperv>
	...
	<kvm>
	<hidden state='on'/>
	</kvm>
</features>

Optimize CPU

Determine architecture

If you operate on a multi-core system such as my AMD Ryzen Threadripper the you will want to optimize your CPU core configuration in the VM per the CPU Pinning section in the Arch Wiki

Determine your CPU topology by running lscpu -e and lstopo The important things to look for are the CPU number and core number. On my box, it looks like this:

CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ
0 0 0 0 0:0:0:0 yes 3400.0000 2200.0000
1 0 0 1 1:1:1:0 yes 3400.0000 2200.0000
2 0 0 2 2:2:2:0 yes 3400.0000 2200.0000
3 0 0 3 3:3:3:0 yes 3400.0000 2200.0000
4 0 0 4 4:4:4:1 yes 3400.0000 2200.0000
5 0 0 5 5:5:5:1 yes 3400.0000 2200.0000
6 0 0 6 6:6:6:1 yes 3400.0000 2200.0000
7 0 0 7 7:7:7:1 yes 3400.0000 2200.0000
8 0 0 8 8:8:8:2 yes 3400.0000 2200.0000
9 0 0 9 9:9:9:2 yes 3400.0000 2200.0000
10 0 0 10 10:10:10:2 yes 3400.0000 2200.0000
11 0 0 11 11:11:11:2 yes 3400.0000 2200.0000
12 0 0 12 12:12:12:3 yes 3400.0000 2200.0000
13 0 0 13 13:13:13:3 yes 3400.0000 2200.0000
14 0 0 14 14:14:14:3 yes 3400.0000 2200.0000
15 0 0 15 15:15:15:3 yes 3400.0000 2200.0000
16 0 0 0 0:0:0:0 yes 3400.0000 2200.0000
17 0 0 1 1:1:1:0 yes 3400.0000 2200.0000
18 0 0 2 2:2:2:0 yes 3400.0000 2200.0000
19 0 0 3 3:3:3:0 yes 3400.0000 2200.0000
20 0 0 4 4:4:4:1 yes 3400.0000 2200.0000
21 0 0 5 5:5:5:1 yes 3400.0000 2200.0000
22 0 0 6 6:6:6:1 yes 3400.0000 2200.0000
23 0 0 7 7:7:7:1 yes 3400.0000 2200.0000
24 0 0 8 8:8:8:2 yes 3400.0000 2200.0000
25 0 0 9 9:9:9:2 yes 3400.0000 2200.0000
26 0 0 10 10:10:10:2 yes 3400.0000 2200.0000
27 0 0 11 11:11:11:2 yes 3400.0000 2200.0000
28 0 0 12 12:12:12:3 yes 3400.0000 2200.0000
29 0 0 13 13:13:13:3 yes 3400.0000 2200.0000
30 0 0 14 14:14:14:3 yes 3400.0000 2200.0000
31 0 0 15 15:15:15:3 yes 3400.0000 2200.0000

From the above output I see my CPU core 0 is shared by CPUs 0 & 16, meaning CPU 0 and CPU 16 (as seen by the Linux kernel) are hyperthreaded to the same physical CPU core.

Especially for gaming, you want to keep all threads on the same CPU cores (for multithreading) and the same CPU die (on my threadripper, CPUs 0-7 reside on one physical die, and CPUs 8-15 reside on the other, within the same socket.)

In my case I want to dedicate one CPU die to my VM with its accompanying hyperthreads (CPUs 0-7 & hyperthreads 16-23) You can accomplish this using the virsh edit command and creating a cputune section (make sure you have a matching vcpu count for the number of cores you’re configuring.) Also edit CPU mode with the proper topology of 1 socket, 1 die, 8 cores with 2 threads. Lastly, configure memory to only be from the proper NUMA node the CPU cores your VM is using (Read here for more info.)

sudo virsh edit <VM_NAME>

<domain type='kvm'>
  ...
  <vcpu placement='static' cpuset='0-7,16-23'>16</vcpu> 
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='16'/>
    <vcpupin vcpu='2' cpuset='1'/>
    <vcpupin vcpu='3' cpuset='17'/>
    <vcpupin vcpu='4' cpuset='2'/>
    <vcpupin vcpu='5' cpuset='18'/>
    <vcpupin vcpu='6' cpuset='3'/>
    <vcpupin vcpu='7' cpuset='19'/>
    <vcpupin vcpu='8' cpuset='4'/>
    <vcpupin vcpu='9' cpuset='20'/>
    <vcpupin vcpu='10' cpuset='5'/>
    <vcpupin vcpu='11' cpuset='21'/>
    <vcpupin vcpu='12' cpuset='6'/>
    <vcpupin vcpu='13' cpuset='22'/>
    <vcpupin vcpu='14' cpuset='7'/>
    <vcpupin vcpu='15' cpuset='23'/>
    <emulatorpin cpuset='0-7','26-23'/>
  </cputune>
  ...
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' dies='1' cores='8' threads='2'/>
    <feature policy='require' name='topoext'/>
    <numa>
      <cell id='0' cpus='0-15' memory='16777216' unit='KiB'/>
    </numa>
  </cpu>
  ...
</domain>

Configure NUMA

Non-uniform memory access is essential for 1st and 2nd gen Ryzen chips. It turns out that by default my motherboard hid the real NUMA configuration from the operating system. Remedy this by changing the BIOS setting to set Memory Interleaving = Channel (for my ASRock X399 motherboard it’s in CBS / DF options.) See here: https://www.reddit.com/r/Amd/comments/6vrcq0/psa_threadripper_umanuma_setting_in_bios/

After changing BIOS setting, lstopo now shows proper configuration:

CPU frequency

Change CPU frequency setting to use performance mode:

sudo pacman -S cpupower
sudo cpupower frequency-set -g performance

Enable Hugepages

Append default_hugepagesz=1G hugepagesz=1G hugepages=16 to the kernel line in /etc/default/grub and re-run sudo grub-mkconfig -o /boot/grub/grub.cfg

Configure FIFO CPU scheduling

The Arch Wiki mentions to run qemu-system-x86_64 with taskset and chrt but doesn’t mention how to do so if you’re using virt-manager. Fortunately this reddit thread outlined how to accomplish it: libvirt hooks. Create the following script and place it in /etc/libvirt/hooks/qemu , change the VM variable to match the name of your VM, mark that new file as executable (chmod +x /etc/libvirt/hooks/qemu ) and restart libvirtd

#!/bin/bash
#Hook to change VM to FIFO scheduling to decrease latency
#Place this file in /etc/libvirt/hooks/qemu and mark it executable

#Change the VM variable to match the name of your VM
VM="win10"

if [ "$1" == "$VM" ] && [ "$2" == "started" ]; then
  if pid=$(pidof qemu-system-x86_64); then
     chrt -f -p 1 $pid
    echo $(date) changing CPU scheduling to FIFO for VM $1 pid $pid >> /var/log/libvirthook.log
  else
    echo $(date) Unable to acquire PID of $1 >> /var/log/libvirthook.log
  fi
fi
#Additional debug
#echo $(date) libvirt hook arg1=$1 arg2=$2 arg3=$3 arg4=$4 pid=$pid >> /var/log/libvirthook.log

Isolate CPUs

Update 7/28/20: I no longer do this in favor of the qemu hook script above, which prioritizes to p1 the qemu process for the cores it needs. I’m leaving this section here for historical/additional tweaking purposes.

Update 6/28/20: Additional tuning since I was having some stuttering and framerate issues. Also read here about the emulatorpin option

Dedicate CPUs to the VM (host will not use them) – append isolcups, nohz_full & rcu_nocbs kernel parameters into /etc/default/grub

...
GRUB_CMDLINE_LINUX_DEFAULT=... isolcpus=0-7,16-23 nohz_full=0-7,16-23 rcu_nocbs=0-7,16-23
...

Update grub:

sudo grub-mkconfig -o /boot/grub/grub.cfg

Reboot, then check if it worked:

cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux root=/dev/mapper/arch-root rw loglevel=3 amd_iommu=on iommu=pt isolcpus=0-7,16-23 nohz_full=0-7,16-23 rcu_nocbs=0-7,16-23

taskset -cp 1
pid 1's current affinity list: 8-15,24-31

You can still tell programs to use the CPUs the VM has manually with the taskset command:

chrt -r 1 taskset -c <cores to use> <name of program/process>

Low Latency Audio

Upbate 7/8/2020: I found this article and this reddit thread (and this one) on how to use pulseaudio for your guest VM to get low latency guest VM audio piped to the host machine.

Update qemu config

edit /etc/libvirt/qemu.conf: uncomment the line #user = "root" and replace “root” with your username

Update pulseaudio config

Edit /etc/pulse/daemon.conf and uncomment the following lines (remove semicolon)

;default-sample-rate = 44100
;alternate-sample-rate = 48000

Note: Change VM audio settings to match 44100 sample rate

Edit /etc/pulse/default.pa and append auth-anonymous=1 to load-module module-native-protocol-unix

load-module module-native-protocol-unix auth-anonymous=1

The restart pulseaudio:

pulseaudio -k

Update VM XML

remove all audio devices from the virtual hardware details bar (left side in VM info view).

Edit XML via virsh edit <VM_NAME>

Make sure top line reads

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

Add the following after </devices> (bottom of file)

<qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='ich9-intel-hda,bus=pcie.0,addr=0x1b'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='hda-micro,audiodev=hda'/>
    <qemu:arg value='-audiodev'/>
    <qemu:arg value='pa,id=hda,server=unix:/run/user/1000/pulse/native'/>
</qemu:commandline>

Replace /user/1000 with the UID of your user (output of id command)

Final Win10 XML tweaks for 1950x threadripper

<domain type='kvm' id='1' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
 ...
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='16'/>
    <vcpupin vcpu='2' cpuset='1'/>
    <vcpupin vcpu='3' cpuset='17'/>
    <vcpupin vcpu='4' cpuset='2'/>
    <vcpupin vcpu='5' cpuset='18'/>
    <vcpupin vcpu='6' cpuset='3'/>
    <vcpupin vcpu='7' cpuset='19'/>
    <vcpupin vcpu='8' cpuset='4'/>
    <vcpupin vcpu='9' cpuset='20'/>
    <vcpupin vcpu='10' cpuset='5'/>
    <vcpupin vcpu='11' cpuset='21'/>
    <vcpupin vcpu='12' cpuset='6'/>
    <vcpupin vcpu='13' cpuset='22'/>
    <vcpupin vcpu='14' cpuset='7'/>
    <vcpupin vcpu='15' cpuset='23'/>
    <emulatorpin cpuset='8-15,24-31'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  ...
  <features>
     ...
    <hyperv>
     ...
      <vendor_id state='on' value='1234567890ab'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
    ...
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' dies='1' cores='8' threads='2'/>
    <feature policy='require' name='topoext'/>
    <numa>
      <cell id='0' cpus='0-15' memory='16777216' unit='KiB'/>
    </numa>
  </cpu>
   ...
  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='ich9-intel-hda,bus=pcie.0,addr=0x1b'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='hda-micro,audiodev=hda'/>
    <qemu:arg value='-audiodev'/>
    <qemu:arg value='pa,id=hda,server=unix:/run/user/1000/pulse/native'/>
  </qemu:commandline>
</domain>

Profit

I’m very pleased with my current setup. It works well!

arch install notes 2020-06

June 2, 2020 nicholas Leave a comment

My install notes to get Arch Linux set up just the way I like it, June 2020 edition. Reference: https://wiki.archlinux.org/index.php/Installation_guide

Change to dvorak layout:
loadkeys dvorak

Sync NTP time:
timedatectl set-ntp true

Configure disk:
fdisk
#create separate efi partition, LVM root & swap
pvcreate <dev> vgcreate arch <dev> lvcreate -L+2G arch -n swap lvcreate -l100%FREE -n root arch

Initialize swap:
mkswap /dev/arch/swap swapon /dev/arch/swap

Format & Mount root:
mkfs.ext4 /dev/arch/root mount /dev/arch/root /mnt

Create EFI partition
mkdosfs -F32 <partition 1> mkdir /mnt/efi mount <partition 1> /mnt/efi

Make mirrorlist use only xmission
sed -i 's/^Server/#Server/g;s/#Server\(.*xmission.*\)/Server\1/g' /etc/pacman.d/mirrorlist

Install base system plus extra packages:
pacstrap /mnt base linux linux-firmware lvm2 efibootmgr samba vim htop networkmanager inetutils man-db man-pages texinfo openssh grub

Generate fstab
genfstab -U /mnt >> /mnt/etc/fstab

Enter new environment chroot
arch-chroot /mnt

Set timezone
ln -sf /usr/share/zoneinfo/America/Boise /etc/localtime

Configure en_US locales
sed -i 's/^#en_US\(.*\)/en_US\1/g' /etc/locale.gen locale-gen

Make dvorak layout permanent
echo "KEYMAP=dvorak" > /etc/vconsole.conf

Set hostname
echo "_HOSTNAME_" > /etc/hostname
echo "127.0.1.1 _HOSTNAME_._DOMAIN_ _HOSTNAME_" >> /etc/hosts

Enable lvm2 hook for initial ramdisk (boot)
sed -i 's/HOOKS=(.*\<block\>/& lvm2/' /etc/mkinitcpio.conf

Generate initial ramdisk
mkinitcpio -P

Set password for root user:
passwd

Install Grub (EFI)
grub-install --target=x86_64-efi --efi-directory=/efi --bootloader-id=GRUB
grub-mkconfig -o /boot/grub/grub.cfg

Enable networking & SSH on bootup:
systemctl enable NetworkManager sshd

Configure NTP
yum -y install ntp #modify /etc/ntp.conf for timeservers as desired systemctl enable ntpd

Exit chroot & reboot
exit reboot

CLI, Hardware, OS, Virtualization

Threadripper / Epyc processor core optimization

May 10, 2020 nicholas 1 Comment

I had a pet project (folding@home) where I wanted to maximize computing power. I became frustrated with default CPU scheduling of my folding@home threads. Ideal performance would keep similar threads on the same CPU, but the threads were jumping all over the place, which was impacting performance.

Step one was to figure out which threads belonged to which physical cores. I found on this site that you can use cat to find out what your “sibling threads” are:

cat /sys/devices/system/cpu/cpu{0..15}/topology/thread_siblings_list

The above command is for my Threadripper & Epyc systems, which each have 16 cores hyperthreaded to 32 cores. Adjust the {0..15} number to match your number of cores (core 0 being the fist core.) This was my output:

cat /sys/devices/system/cpu/cpu{0..15}/topology/thread_siblings_list

0,16
1,17
2,18
3,19
4,20
5,21
6,22
7,23
8,24
9,25
10,26
11,27
12,28
13,29
14,30
15,31

Now that I know the sibling threads are offset by 16, I can use this information to optimize my folding@home VMs. I modified my CPU pinning script to take this into consideration. The script ensures that each VM is pinned to only use sibling threads (ensuring they all stay on the same physical CPU.)

This script should be used with caution. It pins processes to specific CPUs, which limits the kernel scheduler’s ability to move things around if needed. If configured badly this can cause the machine to lock up or VMs to be terminated.

I saw some impressive results spinning up four separate 8 core VMs and pinning them to sibling cores using this script. It almost doubled the rate at which I completed folding@home work units.

And now, the script:

#!/bin/bash
#Properly assign CPU cores to their respective die for EPYC/Threadripper systems
#Based on how hyperthreads are done in these systems
#cat /sys/devices/system/cpu/cpu{0..15}/topology/thread_siblings_list

#The script takes two arguments - the ID of the Proxmox VM to modify, and the core to begin the VM on
#If running this against multiple VMs, make sure to increment this second number by half of the cores of the previous VM
#For example, if I have one 8 core VM and I run this script specifying 0 for the offset, if I spin up a second VM, the second argument would be 4
#this would ensure the second VM starts on core 4 (the 5th core) and assigns sibling cores to match

set -eo pipefail

#take First argument as which VMID to pin CPU cores to, the second argument is which core to start pinning to
VMID=$1
OFFSET=$2

#Determine offset for sibling threads
SIBLING_THREAD_OFFSET=$(cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list| sed 's/,/ /g' | awk '{print $2}')

#Function to determine number of CPU cores a VM has
cpu_tasks() {
	expect <<EOF | sed -n 's/^.* CPU .*thread_id=\(.*\)$/\1/p' | tr -d '\r' || true
spawn qm monitor $VMID
expect ">"
send "info cpus\r"
expect ">"
EOF
}

#Only act if VMID & OFFSET are set
if [[ -z $VMID  || -z $OFFSET ]]
then
	echo "Usage: cpupin.sh <VMID> <OFFSET>"
	exit 1
else
	#Get PIDs of each CPU core for VM, count number of VM cores, and get even/odd PIDs for assignment
	VCPUS=($(cpu_tasks))
	VCPU_COUNT="${#VCPUS[@]}"
	VCPU_EVEN_THREADS=($(for EVEN_THREAD in "${VCPUS[@]}"; do echo $EVEN_THREAD; done | awk '!(NR%2)'))
	VCPU_ODD_THREADS=($(for ODD_THREAD in "${VCPUS[@]}"; do echo $ODD_THREAD; done | awk '(NR%2)'))

	if [[ $VCPU_COUNT -eq 0 ]]; then
		echo "* No VCPUS for VM$VMID"
		exit 1
	fi

	echo "* Detected ${#VCPUS[@]} assigned to VM$VMID..."
	echo "* Resetting cpu shield..."

	#Start at offset CPU number, assign odd numbered PIDs to their own CPU thread, then increment CPU core number
	#0-3 if offset is 0, 4-7 if offset is 4, etc
	ODD_CPU_INDEX=$OFFSET
	for PID in "${VCPU_ODD_THREADS[@]}"
	do
		echo "* Assigning ODD thread $ODD_CPU_INDEX to $PID..."
		taskset -pc "$ODD_CPU_INDEX" "$PID"
		((ODD_CPU_INDEX+=1))
	done

	#Start at offset + CPU count, assign even number PIDs to their own CPU thread, then increment CPU core number
	#16-19 if offset is 0,	20-23 if offset is 4, etc
	EVEN_CPU_INDEX=$(($OFFSET + $SIBLING_THREAD_OFFSET))
	for PID in "${VCPU_EVEN_THREADS[@]}"
	do
		echo "* Assigning EVEN thread $EVEN_CPU_INDEX to $PID..."
		taskset -pc "$EVEN_CPU_INDEX" "$PID"
		((EVEN_CPU_INDEX+=1))
	done
fi

CLI, Networking, OS

UBUNTU 20.04 cloned VM same DHCP IP fix

May 7, 2020 nicholas 2 Comments

I cloned an Ubuntu 20.04 VM and was frustrated to see both boxes kept getting the same DHCP IP address despite having different network MAC addresses. I finally found on this helpful post which states Ubuntu 20.04 uses systemd-networkd for DHCP leases which behaves differently than dhclient. As wickedchicken states,

systemd-networkd uses a different method to generate the DUID than dhclient. dhclient by default uses the link-layer address while systemd-networkd uses the contents of /etc/machine-id. Since the VMs were cloned, they have the same machine-id and the DHCP server returns the same IP for both.

To fix, replace the contents of one or both of /etc/machine-id. This can be anything, but deleting the file and running systemd-machine-id-setup will create a random machine-id in the same way done on machine setup.

So my fix was to run the following on the cloned machine:

sudo rm /etc/machine-id
sudo systemd-machine-id-setup
sudo reboot

That did the trick!

For the systems that registered their hostnames under the wrong IPs, I had to take the following action for my Ubuntu 20.04 desktop as well as my Ubiquiti USG-Pro 4

Ubiquiti: Clear DHCP lease

clear dhcp lease ip <ip_address>

Ubuntu desktop: Flush DNS

sudo systemd-resolve --flush-caches

Not working after reboot

Get CEPH running on new Proxmox node

Move OSDs to new host

Troubleshooting

Unable to remove monitor with unknown status

Remove failed host

Enable IOMMU

Reserve GPU for VFIO

Configure virt-manager

Install virt-manager, dnsmasq & libvirtd:

Configure Networking

Configure VM

Initial configuration

Windows 10 specific tweaks

Ignore MSRs (blue screen fix)

NVIDIA Code 43 workaround

Optimize CPU

Determine architecture

Configure NUMA

CPU frequency

Enable Hugepages

Configure FIFO CPU scheduling

Isolate CPUs

Low Latency Audio

Update qemu config

Update pulseaudio config

Update VM XML

Final Win10 XML tweaks for 1950x threadripper

Profit

Nick's technical musings