Tag Archives: systemd

Configure Zimbra live replication

I’ve recently configured live active replication from my Zimbra e-mail server to a backup server. This is really slick – in the event of primary server failure, I can bring up my secondary in a matter of minutes with no data loss. I used the Zimbra live sync scripts on Gitlab to accomplish this.

These are my notes on things I needed to do in addition to the readme to get things to work properly on my Zimbra 8.8.15 Open Source Edition installs on CentOS 7 boxes.

Install at (the package that provides the atd daemon):
sudo yum install at
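
The readme just says to install it, but it’s probably worth making sure the daemon is actually enabled and running as well (my addition, not from the readme):

sudo systemctl enable atd
sudo systemctl start atd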

Make sure the backup server has the same firewall rules as the primary: https://wiki.zimbra.com/wiki/Ports

On the backup server, configure DNS so the mail server’s hostname resolves to the backup server’s IP address (hostname mail.server.dns -> mirror mail server’s IP).

Disable DNS forwarding for the primary mail server’s domain if configured (to ensure mail goes to the backup server in the event of a switchover).

Clone over the prod mail server, spin it up, and change its network settings:

  • keep hostname (important)
  • change IP, DNS, and hosts files to use the new IP address/network (example sketch below)

/etc/sysconfig/network-scripts/ifcfg-eth0
/etc/hosts
/etc/resolv.conf
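
As a rough sketch, the edits on the clone look something like this (the addresses below are made-up examples; the hostname stays identical to the primary):

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- new IP on the backup network
IPADDR=192.168.2.50
PREFIX=24
GATEWAY=192.168.2.1
DNS1=192.168.2.1

# /etc/hosts -- same hostname as the primary, backup server's IP
192.168.2.50   mail.server.dns mail

# /etc/resolv.conf
nameserver 192.168.2.1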

Ensure proper VLAN settings in the backup VM (they may be different from the primary’s).

Systemd service changes:

  • add Environment=PATH=/opt/zimbra/bin:/opt/zimbra/common/lib/jvm/java/bin:/opt/zimbra/common/bin:/opt/zimbra/common/sbin:/usr/sbin:/sbin:/bin:/usr/sbin:/usr/bin
  • add WorkingDirectory=/opt/zimbra
  • remove the start argument from ExecStart, leaving: ExecStart=/opt/zimbra/live_sync/live_syncd

This is the complete systemd unit for live sync:

[Unit]
Description=Zimbra live sync - to be run on the mirror server
After=network.target

[Service]
ExecStart=/opt/zimbra/live_sync/live_syncd
ExecStop=/opt/zimbra/live_sync/live_syncd kill
User=zimbra
Environment=PATH=/opt/zimbra/bin:/opt/zimbra/common/lib/jvm/java/bin:/opt/zimbra/common/bin:/opt/zimbra/common/sbin:/usr/sbin:/sbin:/bin:/usr/sbin:/usr/bin
WorkingDirectory=/opt/zimbra

[Install]
WantedBy=multi-user.target
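
With the unit in place (I’m assuming here that it’s saved as /etc/systemd/system/zimbra-live-sync.service; use whatever filename the live sync scripts’ readme calls for), reload systemd and enable it on the mirror:

sudo systemctl daemon-reload
sudo systemctl enable zimbra-live-sync.service
sudo systemctl start zimbra-live-sync.service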

Time limit

It looks like there’s a time limit on how long Zimbra keeps redo logs, which means you’ll end up with lost mail if you try to bring your primary server back up after it’s been offline for too long (more than a few weeks). If you’ve been failed over to your secondary mail server for more than two weeks, you’ll want to do the reverse procedure: clone the backup to the primary, edit IP addresses, then run the Zimbra live sync. Log into the restored server to make sure mail older than two weeks is all there.

piKVM pushover startup script

I’ve had an issue where I wasn’t sure if my dynamic DNS provider registered properly. I then realized that I have a piKVM attached to one of my servers that boots on powerup, even if the server does not. I could utilize this piKVM to help me out.

Thanks to inspiration from Chris Dzombak, I was able to whip up a little script that runs on startup. It waits 5 minutes to allow my firewall and modem to boot, then sends a Pushover notification to let me know the piKVM is online and what its external IP address is.

To get it working on the piKVM I had to enter RW mode, write and save the script, make it executable, and then configure a systemd service to run it at startup.
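
Concretely, that boils down to something like this on the piKVM console (rw and ro are piKVM’s helpers for flipping the root filesystem between read-write and read-only):

rw                                  # remount the root filesystem read-write
# create /root/boot-pushover.sh with the contents below, then:
chmod +x /root/boot-pushover.sh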

Here is the script, saved under /root/boot-pushover.sh

#!/usr/bin/env bash
set -eu

#Wait 5 minutes to allow router bootup
sleep 300

TOKEN="PUSHOVER_APPLICATION_TOKEN"
USER="PUSHOVER_USER_TOKEN"
EXTERNAL_IP="$(curl -s ifconfig.me)"  # look up the current external IP
MESSAGE="$(hostname) is online. External IP: $EXTERNAL_IP"

#Send pushover command to alert it's up and send its external IP
curl -s \
  --form-string "token=$TOKEN" \
  --form-string "user=$USER" \
  --form-string "message=$MESSAGE" \
  https://api.pushover.net/1/messages.json

Set executable: chmod +x /root/boot-pushover.sh

Here is the systemd service, saved under /etc/systemd/system/boot-pushover-notification.service

[Unit]
Wants=network.target
After=network.target nss-lookup.target

[Service]
Type=oneshot
ExecStart=/root/boot-pushover.sh
RemainAfterExit=yes
User=root
Group=root
RestartSec=15
Restart=on-failure

[Install]
WantedBy=multi-user.target

Reload daemons & enable startup:

systemctl daemon-reload
systemctl enable boot-pushover-notification.service

Test by exiting rw mode and rebooting the piKVM:

ro
reboot

It works really well!

create podman services with podman-compose

Podman is Red Hat’s daemonless alternative to Docker (not quite a fork, but largely command-compatible). I really liked docker-compose functionality; fortunately there is a podman-compose project which provides more or less the same thing.

I now have a setup where each podman container is controlled by a systemd service, set to run on startup, with version controlled podman-compose files.

First, I installed podman-compose:

sudo curl -o /usr/local/bin/podman-compose https://raw.githubusercontent.com/containers/podman-compose/devel/podman_compose.py
chmod +x /usr/local/bin/podman-compose

I then created podman-compose files (the syntax is identical to docker-compose) for each container. Here is one example (jackett.yml):

---
version: "2"
services:
  jackett:
    image: linuxserver/jackett
    container_name: jackett
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Boise
    volumes:
      - /mnt/storage/Docker/Jackett/config:/config
      - /mnt/storage/Docker/Jackett/downloads:/downloads
    ports:
      - 9117:9117
    restart: unless-stopped
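
Before handing it over to systemd, it’s worth a quick manual test with the same commands the unit file below will run:

podman-compose -f /home/nicholas/podman/jackett.yml up
# confirm Jackett answers on port 9117, then Ctrl+C and clean up:
podman-compose -f /home/nicholas/podman/jackett.yml down -v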

I then created a corresponding systemd unit file for each container:

#/etc/systemd/system/jackett.service
[Unit]
Description=Jackett
After=network.target

[Service]
Restart=always

# Compose up
ExecStart=/usr/local/bin/podman-compose -f /home/nicholas/podman/jackett.yml up

# Compose down, remove containers and volumes
ExecStop=/usr/local/bin/podman-compose -f /home/nicholas/podman/jackett.yml down -v

[Install]
WantedBy=multi-user.target

I then do a systemctl daemon-reload and enable the service for startup:

sudo systemctl daemon-reload
sudo systemctl enable jackett
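
Then start it and watch the output to confirm the container is happy:

sudo systemctl start jackett
sudo journalctl -f -u jackett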

Success.

Why not create a single podman-compose file for all my services instead of an individual service for each container? I wanted to be able to clearly see the log output for each container with journalctl -f -u <service name>. If you lump all your services into a single compose file, the output from every container gets jumbled together in that single service’s log. Separating each container into its own service is cleaner.

Track and log unclean shutdowns in CentOS 7

I needed to find a way to track whether my CentOS 7 systems reboot unexpectedly. I was surprised that this isn’t something the OS does by default. I found this article from Red Hat which outlines that you basically have to write a couple of systemd units yourself if you want this functionality. So, I did.

I ended up with three separate systemd services that accomplish what I want:

  • set_graceful_shutdown: Runs just before shutdown. Creates a file, /root/graceful_shutdown
  • log_ungraceful_shutdown: Runs on startup. Checks whether /root/graceful_shutdown is missing, and if it is, logs that fact to a file (/var/log/shutdown.log)
  • reset_shutdown_flag: Runs after log_ungraceful_shutdown. Checks for the presence of the flag file and, if it exists, removes it.

I placed these three files into /etc/systemd/system and then ran systemctl daemon-reload and systemctl enable for each one.
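
For reference, that amounts to:

sudo systemctl daemon-reload
sudo systemctl enable set_graceful_shutdown.service log_ungraceful_shutdown.service reset_shutdown_flag.service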

set_graceful_shutdown.service

[Unit]
Description=Set flag for graceful shutdown
DefaultDependencies=no
RefuseManualStart=true
Before=shutdown.target

[Service]
Type=oneshot
ExecStart=/bin/touch /root/graceful_shutdown

[Install]
WantedBy=shutdown.target

log_ungraceful_shutdown.service

[Unit]
Description=Log ungraceful shutdown
ConditionPathExists=!/root/graceful_shutdown
RefuseManualStart=true
RefuseManualStop=true
Before=reset_shutdown_flag.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c "echo $$(date): Improper shutdown detected >> /var/log/shutdown.log"

[Install]
WantedBy=multi-user.target

reset_shutdown_flag.service

[Unit]
Description=Check if previous system shutdown was graceful
ConditionPathExists=/root/graceful_shutdown
RefuseManualStart=true
RefuseManualStop=true

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/rm /root/graceful_shutdown

[Install]
WantedBy=multi-user.target
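
One way to test the whole chain (my own addition, using the kernel’s magic SysRq interface rather than anything from the Red Hat article) is to force an immediate reboot that skips the normal shutdown targets, then check the log once the box is back:

# as root: force an instant reboot with no clean shutdown
echo 1 > /proc/sys/kernel/sysrq
echo b > /proc/sysrq-trigger
# after it comes back up:
cat /var/log/shutdown.log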

It feels like a kludge but it works pretty well. The result is I get an entry in a log file if the system wasn’t shut down properly.

Mount encfs folder on startup with systemd

A quick note on how to encrypt a folder with encfs and then mount it on boot via a systemd startup script. In my case the folder is located on a network drive and I wanted it to happen whether I was logged in or not.

Create encfs folder:

encfs <path to encrypted folder> <path to mount decrypted folder>

Follow the prompts to create the folder and set a password.

Next, create a file which will contain your decryption password:

echo "YOUR_PASSWORD" > /home/user/super_secret_password_location
chmod 700 /home/user/super_secret_password_location

Create a simple script to be called by systemd on startup, using cat to pass your password to encfs:

#!/bin/bash
cat super_secret_password_location | encfs -S path_to_encrypted_folder path_to_mount_decrypted_folder
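
Since systemd will execute this script directly, it also needs the execute bit (substitute wherever you saved it, i.e. the same path used for PATH_TO_SCRIPT below):

chmod +x PATH_TO_SCRIPT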

Finally create a systemd unit to run your script on startup:

vim /etc/systemd/system/mount-encrypted.service
[Unit] 
Description=Mount encrypted folder 
After=network.target 

[Service] 
User=<YOUR USER> 
Type=oneshot 
ExecStartPre=/bin/sleep 20 
ExecStart=PATH_TO_SCRIPT
TimeoutStopSec=30 
KillMode=process 

[Install] 
WantedBy=multi-user.target

Then enable the unit:

sudo systemctl daemon-reload
sudo systemctl enable mount-encrypted.service
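
To check it without rebooting, start the unit by hand and confirm the FUSE mount shows up (the mount point is whatever you chose above):

sudo systemctl start mount-encrypted.service
mount | grep encfs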

VGA Passthrough with Threadripper

An unfortunate bug exists for the AMD Threadripper family of CPUs which causes VGA passthrough not to work properly. Fortunately some very clever people have implemented a workaround to allow proper VGA passthrough until a proper Linux kernel patch can be accepted and implemented. See here for the whole story.

Right now my Threadripper 1950X successfully has GPU passthrough thanks to HyenaCheeseHeads’ “java hack” applet. I went this route because I really didn’t want to try and recompile my Proxmox kernel to get passthrough to work. Per the description: “It is a small program that runs as any user with read/write access to sysfs (this small guide assumes “root”). The program monitors any PCIe device that is connected to VFIO-PCI when the program starts, if the device disconnects due to the issues described in this post then the program tries to re-connect the device by rewriting the bridge configuration.” Instructions taken from the above Reddit post.

  • Go to https://pastebin.com/iYg3Dngs and hit “Download” (the MD5 sum is supposed to be 91914b021b890d778f4055bcc5f41002)
  • Rename the downloaded file to “ZenBridgeBaconRecovery.java” and put it in a new folder somewhere
  • Go to the folder in a terminal and type “javac ZenBridgeBaconRecovery.java”, this should take a short while and then complete with no errors. You may need to install the Java 8 JDK to get the javac command (use your distribution’s software manager)
  • In the same folder type “sudo java ZenBridgeBaconRecovery”
  • Make sure that the PCIe device that you intend to passthru is listed as monitored with a bridge
  • Now start your VM

In my case (Debian Stretch, Proxmox) I needed to install openjdk-8-jdk-headless:

sudo apt install openjdk-8-jdk-headless
javac ZenBridgeBaconRecovery.java

Next, I have a little script that runs on startup to spawn this as root in a detached tmux session so I don’t have to remember to run it (if you try to start your VM before running this, it will hose passthrough on your system until you reboot). Be sure to change the script to point to wherever you compiled ZenBridgeBaconRecovery.

#!/bin/bash
cd /home/nicholas  #change me to suit your needs
sudo java ZenBridgeBaconRecovery

And here is the command I use to run on startup:

tmux new -d '/home/nicholas/passthrough.sh'

Again, be sure to modify the above to point to the path of wherever you saved the above script.

So far this works pretty well for me. I hate having to run a java process as sudo, but it’s better than recompiling my kernel.


Update 6/27/2018: I’ve created a systemd service for the ZenBridgeBaconRecovery program so it runs at boot. Here is my file, placed in /etc/systemd/system/zenbridge.service (change WorkingDirectory to match the location of your compiled ZenBridgeBaconRecovery java file, and don’t forget to do a systemctl daemon-reload):

[Unit] 
Description=Zen Bridge Bacon Recovery 
After=network.target 

[Service] 
Type=simple 
User=root 
WorkingDirectory=/home/nicholas 
ExecStart=/usr/bin/java ZenBridgeBaconRecovery 
# Restart= could also be set to always, on-abort, etc.
Restart=on-failure

[Install] 
WantedBy=multi-user.target 
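
Then the usual reload/enable so it comes up at boot (the unit name matches the file above):

sudo systemctl daemon-reload
sudo systemctl enable zenbridge.service
sudo systemctl start zenbridge.service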

Update 8/18/2018: Finally solved for everyone!

Per an update on the Reddit thread, motherboard manufacturers have finally put out BIOS updates that resolve the PCI passthrough problems. I updated my X399 Taichi to the latest version of its UEFI BIOS (3.20) and indeed PCI passthrough worked without any more wonky workarounds!

XAPI won’t start in Xenserver 7

I came home yesterday to discover that every last one of my VMs was unresponsive. It was most distressing. I couldn’t even SSH into my xenserver; it was unresponsive too. Its physical console had dropped into an emergency shell. A reboot allowed me to get a physical console again, but my networking and VMs would not start.

In trying to pick up the pieces and put everything back together I ran

systemctl --failed

which revealed several key services not running, namely openvswitch and xapi (very important services). Manually starting them did nothing; they would silently fail and immediately quit.

After banging my head against a wall for a bit (I really didn’t want to restore from backup) I stumbled across this post. It states in essence that xapi won’t start if the disk is full. I checked disk usage and it said I had a few gigs free, but thought I’d try the steps in the post anyway.

ls /var/log

revealed quite a lot of log files. I then decided to just delete all the .gz archived logs:

rm /var/log/*.gz

After doing this, xapi started. I restarted the hypervisor for good measure and everything came up – all back to normal as if nothing had happened.

It’s incredibly frustrating that Xenserver is designed to be a ticking time bomb in its default configuration. If you don’t take care to manually delete old logs, or alternatively send logs to a remote log server, it will crash and burn. This is stupid. That being said, I was impressed that it recovered so gracefully once I freed up some disk space.

If you’re running xenserver, make sure you’re logging somewhere else, or set up a cron job to delete old log files!
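
If you go the cron route, something along these lines in root’s crontab would do it (a sketch on my part, not something XenServer ships with; adjust the retention to taste):

# delete archived logs older than 30 days, every night at 3am
0 3 * * * find /var/log -name '*.gz' -mtime +30 -delete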