Migration from Proxmox to Talos Linux

Why Leave Proxmox?

After running Proxmox for several years, I decided it was time for a fresh approach. Don’t get me wrong - Proxmox is excellent. But I wanted:

The New Stack

Talos Linux

Talos is an OS designed specifically for Kubernetes. Key features:
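
There is no SSH and no package manager on the node; everything is driven through the Talos API with talosctl. As a rough illustration of day-to-day management (using this lab’s node IP, once the talosconfig generated later is in place):

# All management goes through the Talos API - no shell on the node itself
talosctl --nodes 192.168.1.100 version     # OS and API version of the node
talosctl --nodes 192.168.1.100 services    # health of system services (etcd, kubelet, ...)
talosctl --nodes 192.168.1.100 dashboard   # live resource and log dashboard in the terminal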

KubeVirt for VMs

Instead of a traditional hypervisor, KubeVirt extends Kubernetes to run VMs:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: ubuntu-vm
spec:
  running: true
  template:
    spec:
      domain:
        devices:
          disks:
          - name: root
            disk:
              bus: virtio
        resources:
          requests:
            memory: 4Gi
            cpu: 2
      volumes:
      - name: root
        containerDisk:
          # Example disk source - swap in the image or PVC you actually use
          image: quay.io/containerdisks/ubuntu:22.04

VMs become Kubernetes objects. Start, stop, manage them with kubectl or our custom console.
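
For day-to-day operations this looks roughly like the following (virtctl, KubeVirt’s CLI, is assumed to be installed alongside kubectl):

# List VM definitions and running VM instances
kubectl get vms
kubectl get vmis

# Start, connect to, and stop the VM defined above
virtctl start ubuntu-vm
virtctl console ubuntu-vm
virtctl stop ubuntu-vm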

Migration Process

1. Backup Everything

Critical step. I backed up:
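
For the Proxmox guests themselves, vzdump is the usual tool; a rough sketch (the VM ID, storage name, and destination are placeholders for whatever your setup uses):

# On the Proxmox host: snapshot-mode backup of guest 101 to a backup storage
vzdump 101 --storage backup-nas --mode snapshot --compress zstd

# Copy the archive off the box before wiping it
scp /mnt/pve/backup-nas/dump/vzdump-qemu-101-*.vma.zst user@nas:/backups/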

2. Talos Installation

The Boot Challenge:

The first challenge was getting the R430 to boot from the Talos ISO instead of Proxmox. After mounting the ISO via iDRAC virtual media, the server kept booting to Proxmox’s GRUB menu.

Solution:

  1. Wiped the Proxmox bootloader: dd if=/dev/zero of=/dev/sda bs=1M count=100
  2. Reset USB status in iDRAC
  3. Remapped ISO as CD/DVD (not removable disk)
  4. Used F11 Boot Manager to explicitly select virtual media

Installation Process:

# Generate configuration
talosctl gen config tom-lab-cluster https://192.168.1.100:6443 \
  --output-dir ~/r430-migration/talos-config

# Apply configuration
talosctl apply-config --insecure --nodes 192.168.1.100 \
  --file ~/r430-migration/talos-config/controlplane.yaml

Disk Selection Issue:

Initially tried /dev/sda, but the virtual CD took that device. Switched to /dev/sdb:

machine:
  install:
    disk: /dev/sdb  # Not /dev/sda!
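
One way to check which block devices the node actually sees before committing to an install disk (run while the node is still in maintenance mode; the exact subcommand varies slightly between Talos versions):

# List disks as Talos sees them (maintenance mode, hence --insecure)
talosctl get disks --insecure --nodes 192.168.1.100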

After installation, Talos reboots and Kubernetes starts initializing.

3. Bootstrap Kubernetes

The Bootstrap Process:

# Set talosconfig
export TALOSCONFIG=~/r430-migration/talos-config/talosconfig

# Bootstrap etcd (cluster state)
talosctl bootstrap --nodes 192.168.1.100

# Get kubeconfig (with retry logic)
talosctl -e 192.168.1.100 --nodes 192.168.1.100 kubeconfig --force

# Verify cluster
kubectl get nodes
# NAME              STATUS   ROLES           AGE   VERSION
# r430-k8s-master   Ready    control-plane  5m    v1.29.1

Common Issue: kubeconfig Not Working

If you see “connection refused”, the kubeconfig might not be set correctly:

# Force regenerate
talosctl -e 192.168.1.100 --nodes 192.168.1.100 kubeconfig --force

# Verify
kubectl get nodes

Single-node cluster ready in ~5 minutes. All control plane components run on the same node.
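
A quick way to confirm everything landed on the single node:

kubectl get pods -n kube-system -o wide
# The control plane static pods (etcd, kube-apiserver, kube-controller-manager,
# kube-scheduler) should all be Running on r430-k8s-master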

4. Storage Layer

First Attempt: Longhorn

Longhorn seemed perfect - distributed storage with snapshots and backups. But we hit a critical blocker:

# Longhorn requires iSCSI
talosctl -e 192.168.1.100 --nodes 192.168.1.100 read /usr/sbin/iscsiadm
# Error: no such file or directory

Talos Linux is immutable and minimal - it doesn’t include open-iscsi by default. Adding it would require a Talos system extension, which is extra complexity.

Solution: Local Path Provisioner

Switched to Local Path Provisioner - simpler and perfect for single-node:

# Install
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.26/deploy/local-path-storage.yaml

# Set as default
kubectl patch storageclass local-path \
  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# Verify
kubectl get storageclass
# NAME         PROVISIONER            RECLAIMPOLICY
# local-path   rancher.io/local-path  Delete

PodSecurity Fix:

The provisioner’s helper pods need privileged access:

kubectl label namespace local-path-storage \
  pod-security.kubernetes.io/enforce=privileged --overwrite

Works perfectly for a single-node lab. No replication needed, fast local storage.
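
A quick smoke test for the storage class (the PVC name is arbitrary):

# Create a test claim against the default local-path storage class
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
EOF

# local-path uses WaitForFirstConsumer, so the claim stays Pending
# until a pod mounts it, then binds to a hostPath-backed volume
kubectl get pvc test-pvc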

5. Networking

MetalLB for LoadBalancer IPs:

# Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.3/config/manifests/metallb-native.yaml

# Configure IP pool (interactive in script)
# IP range: 192.168.1.200-192.168.1.220
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.200-192.168.1.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
spec:
  ipAddressPools:
  - default-pool

Traefik as Ingress Controller:

Traefik was deployed and exposed on 192.168.1.200. Later, we added Nginx Proxy Manager for easier domain management.

Nginx Proxy Manager:

After the initial Traefik setup, we deployed NPM for better SSL certificate management and user-friendly proxy host configuration. NPM runs on 192.168.1.202 and handles all external-facing services.
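
One way an address like 192.168.1.202 can be requested explicitly from the MetalLB pool (the service name, selector, and ports below are illustrative; the annotation is MetalLB’s):

# Expose a service on a fixed address from the MetalLB pool
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: npm-proxy
  annotations:
    metallb.universe.tf/loadBalancerIPs: 192.168.1.202
spec:
  type: LoadBalancer
  selector:
    app: nginx-proxy-manager
  ports:
  - name: http
    port: 80
  - name: https
    port: 443
EOF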

6. KubeVirt

Installation:

# Set version
export KUBEVIRT_VERSION=v1.1.2

# Install operator
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-operator.yaml

# Wait for operator
kubectl wait --for=condition=ready pod -n kubevirt \
  -l kubevirt.io=virt-operator --timeout=120s

# Install KubeVirt
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-cr.yaml

# Wait for components
kubectl wait --for=condition=ready pod -n kubevirt \
  -l kubevirt.io=virt-handler --timeout=300s

Hardware Virtualization Check:

The installation script checks for VT-x/AMD-V support. Even if the script reports it’s not detected, KubeVirt will automatically use hardware virtualization if available:

# Verify CPU support
talosctl -e 192.168.1.100 --nodes 192.168.1.100 read /proc/cpuinfo | grep -E 'vmx|svm'
# Should show: vmx (Intel) or svm (AMD)

VMs now run alongside containers with near-native performance.
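
To confirm the deployment is healthy before creating VMs:

# The KubeVirt CR reports "Deployed" once all components are up
kubectl -n kubevirt get kubevirt kubevirt -o jsonpath='{.status.phase}'

# virt-api, virt-controller and virt-handler pods should all be Running
kubectl get pods -n kubevirt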

Results

What Works Great

Challenges

Performance

Dell R430 specs:

Kubernetes overhead is minimal. VMs run at near-native speed with KubeVirt.

Custom Console

Why Build Custom?

KubeSphere installation failed due to:

Our Solution:

Built a lightweight custom console:

Key Challenges Solved:

Simple, fast, does exactly what we need. See the Custom Console article for full details.

Conclusion

Worth it. The migration took a weekend, but I now have:

Would I recommend it? If you:

Then yes, absolutely.


Next: Architecture Deep Dive