Migration from Proxmox to Talos Linux
Why Leave Proxmox?
After running Proxmox for several years, I decided it was time for a fresh approach. Don’t get me wrong - Proxmox is excellent. But I wanted:
- Immutable infrastructure - No SSH, no manual changes
- Infrastructure as code - Everything versioned in Git
- Modern orchestration - Kubernetes native
- Unified platform - Containers and VMs together
The New Stack
Talos Linux
Talos is an OS designed specifically for Kubernetes. Key features:
- API-driven - No SSH access, all config via API
- Immutable - Fresh boot every time, no drift
- Secure by default - Minimal attack surface
- Kubernetes-native - Optimized for K8s workloads
KubeVirt for VMs
Instead of traditional hypervisors, KubeVirt extends Kubernetes to run VMs:
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: ubuntu-vm
spec:
  running: true
  template:
    spec:
      domain:
        devices:
          disks:
            - name: root
              disk:
                bus: virtio
        resources:
          requests:
            memory: 4Gi
            cpu: 2
      volumes:
        - name: root
          containerDisk:
            image: quay.io/containerdisks/ubuntu:22.04 # example image; each disk must map to a volume of the same name
VMs become Kubernetes objects. Start, stop, manage them with kubectl or our custom console.
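In practice that means the usual Kubernetes tooling applies. The commands below are a quick illustration (ubuntu-vm is the example VM above; virtctl is KubeVirt's optional CLI, installed separately):
# List VMs and their running instances
kubectl get vms
kubectl get vmis
# Start/stop with virtctl
virtctl start ubuntu-vm
virtctl stop ubuntu-vm
# Or flip the running field directly with kubectl
kubectl patch vm ubuntu-vm --type merge -p '{"spec":{"running":false}}'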
Migration Process
1. Backup Everything
Critical step. I backed up the following (example commands after the list):
- VM disks and configs
- Network configuration
- DNS records
- Service dependencies
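For the VM disks and configs, something along these lines did the job on the Proxmox side (the VM ID and backup path are placeholders for my actual values):
# Dump each VM (snapshot mode, zstd compression)
vzdump 100 --mode snapshot --compress zstd --dumpdir /mnt/backup
# Copy VM definitions and host network config
cp -r /etc/pve/qemu-server /mnt/backup/qemu-server-configs
cp /etc/network/interfaces /mnt/backup/interfaces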
2. Talos Installation
The Boot Challenge:
The first challenge was getting the R430 to boot from the Talos ISO instead of Proxmox. After mounting the ISO via iDRAC virtual media, the server kept booting to Proxmox’s GRUB menu.
Solution:
- Wiped the Proxmox bootloader: dd if=/dev/zero of=/dev/sda bs=1M count=100
- Reset USB status in iDRAC
- Remapped ISO as CD/DVD (not removable disk)
- Used F11 Boot Manager to explicitly select virtual media
Installation Process:
# Generate configuration
talosctl gen config tom-lab-cluster https://192.168.1.100:6443 \
--output-dir ~/r430-migration/talos-config
# Apply configuration
talosctl apply-config --insecure --nodes 192.168.1.100 \
--file ~/r430-migration/talos-config/controlplane.yaml
Disk Selection Issue:
Initially tried /dev/sda, but the virtual CD took that device. Switched to /dev/sdb:
machine:
  install:
    disk: /dev/sdb # Not /dev/sda!
After installation, Talos reboots and Kubernetes starts initializing.
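If you want to watch that happen, talosctl can stream the node's kernel log while it comes back up (optional; shown as a sketch using the config generated above):
talosctl --talosconfig ~/r430-migration/talos-config/talosconfig \
  -e 192.168.1.100 --nodes 192.168.1.100 dmesg --follow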
3. Bootstrap Kubernetes
The Bootstrap Process:
# Set talosconfig
export TALOSCONFIG=~/r430-migration/talos-config/talosconfig
# Bootstrap etcd (cluster state)
talosctl bootstrap --nodes 192.168.1.100
# Get kubeconfig (with retry logic)
talosctl -e 192.168.1.100 --nodes 192.168.1.100 kubeconfig --force
# Verify cluster
kubectl get nodes
# NAME STATUS ROLES AGE VERSION
# r430-k8s-master Ready control-plane 5m v1.29.1
Common Issue: kubeconfig Not Working
If you see “connection refused”, the kubeconfig might not be set correctly:
# Force regenerate
talosctl -e 192.168.1.100 --nodes 192.168.1.100 kubeconfig --force
# Verify
kubectl get nodes
Single-node cluster ready in ~5 minutes. All control plane components run on the same node.
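One single-node detail worth calling out: regular workloads also have to be allowed onto the control-plane node. Talos exposes this as a cluster config option, shown below as it would appear in controlplane.yaml (the general mechanism, not necessarily the exact edit I made):
cluster:
  allowSchedulingOnControlPlanes: true # let normal pods schedule on the control-plane node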
4. Storage Layer
First Attempt: Longhorn
Longhorn seemed perfect - distributed storage with snapshots and backups. But we hit a critical blocker:
# Longhorn requires iSCSI
talosctl -e 192.168.1.100 --nodes 192.168.1.100 read /usr/sbin/iscsiadm
# Error: no such file or directory
Talos Linux is immutable and minimal - it doesn’t include open-iscsi by default. Adding it would have meant layering on a system extension, extra complexity I didn’t want for this setup.
Solution: Local Path Provisioner
Switched to Local Path Provisioner - simpler and perfect for single-node:
# Install
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.26/deploy/local-path-storage.yaml
# Set as default
kubectl patch storageclass local-path \
-p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
# Verify
kubectl get storageclass
# NAME PROVISIONER RECLAIMPOLICY
# local-path rancher.io/local-path Delete
PodSecurity Fix:
The provisioner’s helper pods need privileged access:
kubectl label namespace local-path-storage \
pod-security.kubernetes.io/enforce=privileged --overwrite
Works perfectly for a single-node lab. No replication needed, fast local storage.
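For reference, claiming storage from it looks like any other PVC (names and sizes here are just illustrative):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 10Gi
Note that local-path uses WaitForFirstConsumer binding, so the claim stays Pending until a pod actually mounts it.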
5. Networking
MetalLB for LoadBalancer IPs:
# Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.3/config/manifests/metallb-native.yaml
# Configure IP pool (interactive in script)
# IP range: 192.168.1.200-192.168.1.220
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.200-192.168.1.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
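Once the pool is applied, any Service of type LoadBalancer gets an address from that range. A quick sanity check (the deployment name and image are just an example):
kubectl create deployment whoami --image=traefik/whoami
kubectl expose deployment whoami --type=LoadBalancer --port=80
kubectl get svc whoami
# EXTERNAL-IP should come back as one of the pool addresses, e.g. 192.168.1.201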
Traefik as Ingress Controller:
Traefik was deployed and exposed on 192.168.1.200. Later, we added Nginx Proxy Manager for easier domain management.
Nginx Proxy Manager:
After initial Traefik setup, we deployed NPM for better SSL certificate management and user-friendly proxy host configuration. NPM runs on 192.168.1.202 and handles all external-facing services.
6. KubeVirt
Installation:
# Set version
export KUBEVIRT_VERSION=v1.1.2
# Install operator
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-operator.yaml
# Wait for operator
kubectl wait --for=condition=ready pod -n kubevirt-system \
-l kubevirt.io=virt-operator --timeout=120s
# Install KubeVirt
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-cr.yaml
# Wait for components
kubectl wait --for=condition=ready pod -n kubevirt-system \
-l kubevirt.io=virt-handler --timeout=300s
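To confirm the rollout actually finished, the KubeVirt custom resource reports a phase (querying all namespaces avoids assuming where it landed):
kubectl get kubevirt -A
# The PHASE column should read Deployed once all components are up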
Hardware Virtualization Check:
The installation script checks for VT-x/AMD-V support. Even if the script reports it’s not detected, KubeVirt will automatically use hardware virtualization if available:
# Verify CPU support
talosctl -e 192.168.1.100 --nodes 192.168.1.100 read /proc/cpuinfo | grep -E 'vmx|svm'
# Should show: vmx (Intel) or svm (AMD)
VMs now run alongside containers with near-native performance.
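And if hardware virtualization genuinely weren't available, KubeVirt can fall back to (much slower) software emulation. A sketch of that escape hatch, using the resource name and namespace from a default install - adjust if yours differs:
kubectl patch kubevirt kubevirt -n kubevirt --type merge \
  -p '{"spec":{"configuration":{"developerConfiguration":{"useEmulation":true}}}}'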
Results
What Works Great
- Immutability - No SSH temptation, everything in Git
- Unified management - kubectl for everything
- Fast updates - Talos upgrades in minutes
- Observability - Standard K8s monitoring tools
Challenges
- Learning curve - Different paradigm from traditional hypervisors
- Tooling - Custom console needed (KubeSphere issues)
- Storage - Had to pivot from Longhorn to Local Path (iSCSI dependency)
- Debugging - No SSH, must use talosctl for node access
- Boot issues - Getting R430 to boot from ISO took several attempts
- PodSecurity - Multiple components needed privileged namespace labels
- Image architecture - Had to rebuild for AMD64 (Mac → R430)
- TLS certificates - Kubelet serving CSRs needed manual approval (commands after this list)
- Local registry - Talos HTTP configuration for insecure registry
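For the TLS certificates item, the fix was simply inspecting and approving the pending kubelet serving CSRs:
# Find pending CSRs
kubectl get csr
# Approve the kubelet serving ones (names will differ)
kubectl certificate approve csr-xxxxx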
Performance
Dell R430 specs:
- 2x Intel Xeon (32 cores total)
- 128GB RAM
- Hardware RAID
Kubernetes overhead is minimal. VMs run at near-native speed with KubeVirt.
Custom Console
Why Build Custom?
KubeSphere installation failed due to:
- Outdated Helm chart URLs (404 errors)
- Image pull errors
- Complex dependencies
- Overkill for single-node lab
Our Solution:
Built a lightweight custom console:
- Backend: Go with k8s.io/client-go and a dynamic client for KubeVirt
- Frontend: React + TypeScript + TailwindCSS + Vite
- Features:
- Real-time dashboard with cluster stats
- Node monitoring
- Pod listing and filtering
- VM management (start/stop/restart)
- Namespace filtering
Key Challenges Solved:
- Architecture mismatch (ARM64 → AMD64)
- KubeVirt client version conflicts
- In-cluster config timing issues
- Local registry HTTP configuration (config sketch below)
Simple, fast, does exactly what we need. See Custom Console article for full details.
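On the Talos side, the local registry item boiled down to a registry mirror entry in the machine config so containerd talks plain HTTP; roughly like this (the registry address is a placeholder for my lab registry):
machine:
  registries:
    mirrors:
      "192.168.1.50:5000":
        endpoints:
          - http://192.168.1.50:5000 # plain HTTP endpoint, no TLS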
Conclusion
Worth it. The migration took a weekend, but I now have:
- Infrastructure fully in Git
- Modern cloud-native platform
- Better security posture
- Unified containers + VMs
Would I recommend it? If you:
- Are comfortable with Kubernetes
- Want infrastructure as code
- Have time to learn new tools
- Don’t need GUI for everything
Then yes, absolutely.
Next: Architecture Deep Dive