# Kubernetes + Longhorn Lab Setup
Automated deployment of a complete Kubernetes cluster with storage and monitoring from your host machine.
## Prerequisites

- 3 Ubuntu VMs with SSH access (password auth)
- Python 3.10+ on your host machine
- Extra disk on worker VMs for Longhorn storage (e.g., `/dev/vdb`)
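Before deploying, it can save time to spot-check each VM. A minimal check, assuming the `ubuntu` user and the example IPs from `config.yaml` below:

```bash
# Spot-check: confirm SSH access and list block devices on each VM.
# User and IPs are examples; substitute your own from config.yaml.
for ip in 192.168.1.100 192.168.1.101 192.168.1.102; do
  ssh ubuntu@"$ip" 'hostname && lsblk -d -o NAME,SIZE,TYPE'
done
```

Each worker should show the extra Longhorn disk (e.g., `vdb`) in the output.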
## Quick Start

```bash
# 1. Create virtual environment
uv venv .venv
source .venv/bin/activate
uv pip install -r requirements.txt

# 2. Edit config with your VM IPs
nano config.yaml

# 3. Run deployment
python deploy.py
```
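If you don't use `uv`, the standard tooling works the same way:

```bash
# Equivalent environment setup with the standard-library venv + pip
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```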
## What Gets Installed
| Component | Version | Description |
|---|---|---|
| Kubernetes | 1.35 | Container orchestration |
| Calico | 3.31.3 | CNI networking |
| Longhorn | latest | Distributed block storage |
| Metrics Server | latest | Resource metrics |
| ingress-nginx | latest | Ingress controller |
| cert-manager | latest | Certificate management |
| Prometheus | latest | Metrics & alerting |
| Grafana | latest | Dashboards |
| Loki + Promtail | latest | Log aggregation |
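Once the cluster is up, a quick way to confirm what actually landed (assuming, as the Deployment Phases below suggest, that the add-ons are installed as Helm releases):

```bash
# Show every Helm release with its chart and app version
helm list -A

# Client and server Kubernetes versions
kubectl version
```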
## Configuration

Edit `config.yaml`:

```yaml
ssh:
  user: ubuntu
  password: your-password

nodes:
  control:
    ip: 192.168.1.100
    hostname: k8s-control
  workers:
    - ip: 192.168.1.101
      hostname: k8s-worker-1
    - ip: 192.168.1.102
      hostname: k8s-worker-2

kubernetes:
  version: "1.35"
  pod_cidr: "10.244.0.0/16"
  service_cidr: "10.96.0.0/12"
  calico_version: "3.31.3"

longhorn:
  disk: /dev/vdb
  mount: /var/lib/longhorn

monitoring:
  grafana_password: admin123
```
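Before deploying, it's worth checking that the edited file still parses. A one-liner, assuming a YAML library such as PyYAML is pulled in by `requirements.txt`:

```bash
# Fail fast on YAML syntax errors before deploy.py does
python -c "import yaml; yaml.safe_load(open('config.yaml')); print('config.yaml OK')"
```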
## Accessing Web UIs

After deployment, SSH to the control plane and run the helper script:

```bash
ssh user@<control-plane-ip>
~/port-forward.sh
```
Then access from your browser:
| Service | URL | Credentials |
|---|---|---|
| Grafana | http://<control-ip>:3000 | admin / admin123 |
| Prometheus | http://<control-ip>:9090 | - |
| Alertmanager | http://<control-ip>:9093 | - |
| Longhorn | http://<control-ip>:8080 | - |
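For reference, a helper like this typically wraps `kubectl port-forward` with `--address 0.0.0.0` so the UIs are reachable from outside the VM. The sketch below uses the default service names of a kube-prometheus-stack and Longhorn install; the shipped `port-forward.sh` may differ:

```bash
#!/usr/bin/env bash
# Hypothetical sketch only; see the repo's port-forward.sh for the real mapping.
kubectl port-forward --address 0.0.0.0 -n monitoring svc/kube-prometheus-stack-grafana 3000:80 &
kubectl port-forward --address 0.0.0.0 -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090 &
kubectl port-forward --address 0.0.0.0 -n monitoring svc/kube-prometheus-stack-alertmanager 9093:9093 &
kubectl port-forward --address 0.0.0.0 -n longhorn-system svc/longhorn-frontend 8080:80 &
wait
```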
## Reset Cluster

To start fresh, reset all nodes:

```bash
# Control plane
ssh user@<control-ip> "echo 'password' | sudo -S bash -c 'kubeadm reset -f; rm -rf /etc/cni/net.d /var/lib/etcd ~/.kube'"

# Workers (includes Longhorn disk wipe)
ssh user@<worker-ip> "echo 'password' | sudo -S bash -c 'kubeadm reset -f; rm -rf /etc/cni/net.d; umount /var/lib/longhorn 2>/dev/null; wipefs -a /dev/vdb 2>/dev/null; rm -rf /var/lib/longhorn'"
```
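The worker command is identical on every node, so it can be looped (IPs are the examples from `config.yaml`):

```bash
# Reset both example workers in one pass
for ip in 192.168.1.101 192.168.1.102; do
  ssh ubuntu@"$ip" "echo 'password' | sudo -S bash -c 'kubeadm reset -f; rm -rf /etc/cni/net.d; umount /var/lib/longhorn 2>/dev/null; wipefs -a /dev/vdb 2>/dev/null; rm -rf /var/lib/longhorn'"
done
```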
## Troubleshooting

```bash
# Check nodes
kubectl get nodes -o wide

# Check all pods
kubectl get pods -A

# Check a specific namespace
kubectl get pods -n longhorn-system
kubectl get pods -n monitoring

# View logs
kubectl logs -n <namespace> <pod-name>

# Describe a pod for events
kubectl describe pod -n <namespace> <pod-name>
```
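Two more checks that often pinpoint the problem when pods sit in `Pending` or a node goes `NotReady`:

```bash
# Recent cluster events, sorted oldest-first; scheduling and image-pull
# failures show up at the bottom
kubectl get events -A --sort-by=.metadata.creationTimestamp | tail -20

# On a misbehaving node: kubelet and containerd status
ssh user@<node-ip> "systemctl status kubelet containerd --no-pager"
```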
## Deployment Phases

1. **Prepare nodes** - Install containerd, kubelet, kubeadm
2. **Init control plane** - kubeadm init + Calico CNI
3. **Join workers** - kubeadm join + Longhorn disk prep (sketched below)
4. **Install add-ons** - Helm, Metrics Server, Longhorn, ingress-nginx, cert-manager
5. **Install monitoring** - Prometheus stack, Loki, Promtail
6. **Final setup** - Apps namespace, health check, port-forward script
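The Longhorn disk prep in phase 3 amounts to formatting and mounting the extra disk. A rough sketch of the equivalent manual steps (what `deploy.py` actually runs may differ):

```bash
# Hypothetical equivalent of the worker disk prep; run as root on each worker.
# Disk and mount point come from the longhorn: section of config.yaml.
mkfs.ext4 /dev/vdb
mkdir -p /var/lib/longhorn
mount /dev/vdb /var/lib/longhorn
echo '/dev/vdb /var/lib/longhorn ext4 defaults 0 2' >> /etc/fstab
```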
## Files

- `deploy.py` - Main deployment script
- `config.yaml` - Cluster configuration
- `port-forward.sh` - Helper script (also uploaded to the control plane)