Fixing Proxmox Boot Hangs When Passing Through 2× RTX 3090 GPUs: Step-by-Step Troubleshooting Guide

Running multiple NVIDIA GPUs for AI workloads in Proxmox VE can cause early boot hangs if the host OS tries to load conflicting drivers. In this guide I document how my Proxmox host with 2× RTX 3090 was stuck at systemd-modules-load, how I debugged it, which files to inspect (/etc/default/grub, /etc/modprobe.d/, /etc/modules-load.d/), and the final stable configuration for rock-solid GPU passthrough to an Ubuntu VM.

What Happened

After setting up Proxmox VE to host an AI VM with dual RTX 3090s, the system began freezing during boot. The last message was:

sysinit.target: starting held back, waiting for: systemd-modules-load.service

Only booting with systemd.mask=systemd-modules-load.service on the kernel command line worked. This pointed to a problem with the modules being auto-loaded during early boot.
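
Before masking the service permanently, it helps to confirm which entry is actually hanging it. A minimal check, assuming standard Debian/Proxmox tooling and a shell reached after booting with the service masked:

# Show what systemd-modules-load attempted on this boot
journalctl -b -u systemd-modules-load.service --no-pager

# List every module the early-boot auto-load lists will try to pull in
cat /etc/modules /etc/modules-load.d/*.conf 2>/dev/null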

Key Checks During Debugging

  1. Kernel Command Line (/etc/default/grub)
    Must include: amd_iommu=on iommu=pt vfio-pci.ids=10de:2204,10de:1aef rd.driver.pre=vfio-pci nomodeset video=efifb:off video=vesafb:off modprobe.blacklist=nouveau,nvidiafb,nvidia
  2. Module Auto-Load Lists
    • /etc/modules → should be empty.
    • /etc/modules-load.d/*.conf → must not list nvidia, nouveau, or vfio_virqfd.
  3. Blacklists (/etc/modprobe.d/*.conf)
    Block the NVIDIA drivers:
    blacklist nouveau
    blacklist nvidia
    blacklist nvidia_drm
    blacklist nvidia_modeset
    blacklist nvidiafb
    options nouveau modeset=0
    And bind the GPU and audio IDs to VFIO:
    options vfio-pci ids=10de:2204,10de:1aef
  4. Initramfs
    Always run both after changing any of the files above (a quick verification sketch follows this list):
    update-grub
    update-initramfs -u -k all
  5. Runtime Validation
    • systemctl status systemd-modules-load → should be active (exited).
    • lsmod | egrep 'vfio|nvidia|nouveau' → vfio present, no nvidia/nouveau.
    • lspci -k | grep -A3 -E "NVIDIA|Audio" → both GPUs bound to vfio-pci.
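
To tie checks 1, 4, and 5 together, the short sketch below confirms that the IOMMU actually came up and that the VFIO options and blacklists made it into the rebuilt initramfs. It assumes the default Debian/Proxmox initramfs-tools layout and an AMD platform (AMD-Vi):

# Confirm the IOMMU initialised at boot
dmesg | grep -iE 'iommu|amd-vi' | head

# Confirm the VFIO options and blacklist files were baked into the current initramfs
lsinitramfs /boot/initrd.img-$(uname -r) | grep -iE 'vfio|blacklist'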

My Final Stable Configuration

  • /etc/default/grub
    GRUB_CMDLINE_LINUX_DEFAULT="quiet nomodeset amd_iommu=on iommu=pt vfio-pci.ids=10de:2204,10de:1aef rd.driver.pre=vfio-pci modprobe.blacklist=nouveau,nvidiafb,nvidia video=efifb:off video=vesafb:off"
  • /etc/modules → empty
  • /etc/modules-load.d/ → empty
  • /etc/modprobe.d/blacklist-gpu-host.conf
    blacklist nouveau
    blacklist nvidia
    blacklist nvidia_drm
    blacklist nvidia_modeset
    blacklist nvidiafb
    options nouveau modeset=0
  • /etc/modprobe.d/vfio.conf
    options vfio-pci ids=10de:2204,10de:1aef

This ensures Proxmox never binds the RTX 3090s, leaving them cleanly available for passthrough to my Ubuntu AI VM.
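
One note before copying: the IDs 10de:2204 and 10de:1aef are specific to the RTX 3090 and its HDMI audio function. If your cards differ, look up your own vendor:device pairs first; the bracketed values in the lspci output are what belong in vfio-pci.ids and vfio.conf:

# Find the vendor:device IDs of the GPUs and their audio functions
lspci -nn | grep -iE 'nvidia'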

Copy-Paste Ready Checklist

Run these commands on your Proxmox host to apply the stable config:

# 1. Configure GRUB
sed -i -E 's|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT="quiet nomodeset amd_iommu=on iommu=pt vfio-pci.ids=10de:2204,10de:1aef rd.driver.pre=vfio-pci modprobe.blacklist=nouveau,nvidiafb,nvidia video=efifb:off video=vesafb:off"|' /etc/default/grub
update-grub

# 2. Clean module auto-load lists
truncate -s0 /etc/modules
truncate -s0 /etc/modules-load.d/modules.conf

# 3. Blacklist NVIDIA drivers
cat >/etc/modprobe.d/blacklist-gpu-host.conf <<'EOF'
blacklist nouveau
blacklist nvidia
blacklist nvidia_drm
blacklist nvidia_modeset
blacklist nvidiafb
options nouveau modeset=0
EOF

# 4. Bind both RTX 3090s (GPU + audio) to vfio-pci
cat >/etc/modprobe.d/vfio.conf <<'EOF'
options vfio-pci ids=10de:2204,10de:1aef
EOF

# 5. Rebuild initramfs
update-initramfs -u -k all

# 6. Reboot
reboot

After reboot:

  • lsmod | grep vfio → shows vfio modules.
  • lsmod | grep nvidia → no output.
  • lspci -k | grep -A3 NVIDIA → Kernel driver in use: vfio-pci for both GPUs and their audio functions (the IOMMU group check below gives a fuller picture).
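
To double-check isolation, you can also list the IOMMU groups; each GPU and its audio function should share a group that contains nothing else, since everything in a group has to be passed through together. A small sketch using only sysfs and lspci:

# Print every IOMMU group and the devices it contains
for d in /sys/kernel/iommu_groups/*/devices/*; do
  g=${d%/devices/*}; g=${g##*/}
  printf 'IOMMU group %s: ' "$g"
  lspci -nns "${d##*/}"
done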

At this point, your Proxmox host boots reliably and both RTX 3090s are ready for passthrough to your Ubuntu VM.
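
As a final illustration of the VM side (hypothetical values: VM ID 100 and host addresses 01:00 and 41:00 stand in for whatever lspci reports on your system, and the VM is assumed to use the q35 machine type), each card can be attached with all of its functions as a PCIe device:

# Attach each RTX 3090 (GPU + HDMI audio) to VM 100 as a PCIe device
qm set 100 -hostpci0 01:00,pcie=1
qm set 100 -hostpci1 41:00,pcie=1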
