Proxmox

Introduction

Proxmox is the default operating system for Campus Stations.

Versions

Minimum Supported: Proxmox VE 9.1

LXD to Proxmox Migration

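Ubuntu LXD and Proxmox LXC are both popular compute hosts for trust assets. Migrating a container from LXD to Proxmox takes three steps: archive the container's root filesystem, copy the archive to the new host, and create a Proxmox container from it. The example below migrates a container named haproxyct.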

Original host:

lxc exec haproxyct -- bash -c "cd / && tar --exclude=run --exclude=dev --exclude=sys --exclude=proc -czf /haproxyct.tar.gz ./*"
lxc file pull "haproxyct/haproxyct.tar.gz" /home/
scp /home/haproxyct.tar.gz user@furious-new-dedicated-server:/
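
Before copying the archive, it can be worth confirming that it holds a plausible root filesystem. The following is a minimal sketch and not part of the original procedure: the function name check_rootfs_archive is hypothetical, and it assumes the archive was built with relative ./ paths, as the tar command above does.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a rootfs tarball before transfer.
# Returns 0 if the archive is readable, contains ./etc/ entries,
# and does not contain the excluded pseudo-filesystems.
check_rootfs_archive() {
    archive="$1"
    # tar -tzf lists members without extracting; it fails on a corrupt archive.
    listing=$(tar -tzf "$archive") || return 1
    # A usable root filesystem should at least contain /etc.
    printf '%s\n' "$listing" | grep -q '^\./etc/' || return 1
    # proc, sys and dev should have been excluded when archiving.
    if printf '%s\n' "$listing" | grep -Eq '^\./(proc|sys|dev)/'; then
        return 1
    fi
    return 0
}
```

Usage on the original host: check_rootfs_archive /home/haproxyct.tar.gz && echo "archive looks sane"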

Destination Host:

pct create 100 /haproxyct.tar.gz --hostname haproxyct --rootfs lv_thin:10 --net0 name=eth0,ip=10.10.10.10/24,bridge=vmbr0,gw=10.10.10.254
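
Once the container exists, starting it and confirming that it runs might look like the sketch below. pct start and pct status are standard Proxmox VE commands; the wrapper name start_and_check is hypothetical, and VMID 100 matches the pct create example above.

```shell
#!/bin/sh
# Hypothetical wrapper: start a container and confirm it reports "running".
# $1 = VMID of the container.
start_and_check() {
    vmid="$1"
    pct start "$vmid" || return 1
    # pct status prints a single line such as "status: running".
    pct status "$vmid" | grep -q '^status: running$'
}
```

Usage on the destination host: start_and_check 100 && pct exec 100 -- hostname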

LXC Hardware Watchdog

LXC containers share the host kernel and have no virtual hardware of their own, so a virtual watchdog device of the kind Proxmox VE offers for QEMU virtual machines cannot be attached to a container. Automatic recovery of an unresponsive container therefore relies on the host: the Proxmox HA manager restarts containers that fail, and the host's own watchdog resets the node if the host itself hangs.

Key Approaches for LXC Watchdog Reboot

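  • Proxmox HA (High Availability) with Watchdog: The most robust method for automatically recovering a failed LXC container. Add the container as an HA resource; the HA manager restarts it when it stops unexpectedly and can recover it on another node if its host fails. The watchdog in this setup fences (resets) a host that hangs or loses quorum. It does not monitor processes inside the container.
  • Virtual Watchdog Configuration: Proxmox can attach a virtual watchdog device such as i6300esb to QEMU virtual machines. LXC containers share the host kernel and have no virtual hardware, so this option is not available for them.
  • Systemd Watchdog: RuntimeWatchdogSec in /etc/systemd/system.conf makes systemd feed a watchdog device (/dev/watchdog). A container normally has no such device, so the setting is effective on the host or in a VM. Inside a container, per-service supervision with WatchdogSec= and Restart=on-watchdog in a unit file works without a device.
  • Manual Trigger (Testing): Running echo c > /proc/sysrq-trigger crashes the running kernel. Because containers share the host kernel, this tests host-level fencing rather than a single container and takes down every guest on the node, so use it only on a test host.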
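
For the HA approach, registering a container as an HA resource could look like the following sketch. ha-manager add with --state, --max_restart and --max_relocate is part of the Proxmox VE CLI; the VMID, the default limits, and the wrapper name ha_protect_ct are illustrative assumptions. HA requires a cluster with quorum.

```shell
#!/bin/sh
# Hypothetical wrapper: register a container as an HA resource.
# $1 = VMID
# $2 = restart attempts on the same node (assumed default: 2)
# $3 = relocation attempts to other nodes (assumed default: 1)
ha_protect_ct() {
    vmid="$1"
    max_restart="${2:-2}"
    max_relocate="${3:-1}"
    ha-manager add "ct:${vmid}" \
        --state started \
        --max_restart "$max_restart" \
        --max_relocate "$max_relocate"
}
```

Usage: ha_protect_ct 100, then ha-manager status to confirm that ct:100 is listed.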

Steps to Implement (Proxmox VE)

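  1. Configure Guest: The line watchdog: model=i6300esb,action=reset is valid only in a VM configuration file (/etc/pve/qemu-server/[VMID].conf). Container configuration files in /etc/pve/lxc/[VMID].conf do not support a watchdog option, so skip this step for containers.
  2. Enable HA: Add the LXC container as a resource in the Proxmox HA manager and set its restart and relocate limits (max_restart, max_relocate) so that it is restarted after a failure.
  3. Kernel Modules: Ensure a watchdog module is loaded on the host: softdog by default, or a hardware driver such as iTCO_wdt or ipmi_watchdog. The i6300esb driver belongs inside a VM guest, not on the host.
  4. Verify Setup: Check that the watchdog is active by running dmesg | grep -i watchdog on the host.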
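
The verification step can be extended to check which watchdog driver is actually loaded. This is a sketch under the assumption that the host uses softdog or a hardware driver; the function has_module and its optional file argument are hypothetical, added so the check can also be run against a saved copy of /proc/modules.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel module is loaded.
# $1 = module name
# $2 = modules list to search (defaults to /proc/modules)
has_module() {
    # Each /proc/modules line starts with the module name followed by a space.
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Example host-side checks (run as root on the Proxmox node):
#   has_module softdog && echo "softdog loaded"
#   ls -l /dev/watchdog
#   systemctl status watchdog-mux
#   dmesg | grep -i watchdog
```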

Important Considerations

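  • Softdog vs. Hardware: Proxmox uses the softdog kernel module unless a hardware watchdog module is explicitly configured on the host.
  • Action: For VM watchdog devices, the action can be reset, shutdown, poweroff, pause, debug, or none. Containers have no equivalent setting; their restart behavior is governed by the HA resource settings.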
  • Nvidia GPU: If the LXC container uses an Nvidia GPU, it depends on the driver loaded on the host. Restarting the container does not reset the GPU or reload that driver, so keep the host driver healthy; a wedged driver may still require a full host reboot.
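
On Proxmox VE the hardware watchdog driver is selected with the WATCHDOG_MODULE variable in /etc/default/pve-ha-manager, and softdog is used when it is unset. The sketch below prints the effective choice; the function name configured_watchdog_module and its optional path argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the watchdog module Proxmox HA is configured to use.
# $1 = config file (defaults to /etc/default/pve-ha-manager)
configured_watchdog_module() {
    conf="${1:-/etc/default/pve-ha-manager}"
    module=""
    if [ -r "$conf" ]; then
        # Take the last uncommented WATCHDOG_MODULE=... assignment, without quotes.
        module=$(sed -n 's/^[[:space:]]*WATCHDOG_MODULE=//p' "$conf" \
            | tail -n 1 | tr -d "\"'")
    fi
    # softdog is the fallback when no module is configured.
    printf '%s\n' "${module:-softdog}"
}
```

Usage: configured_watchdog_module prints, for example, softdog or ipmi_watchdog. A change to WATCHDOG_MODULE takes effect after the host is rebooted.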