Cluster Architecture¶
Overview¶
The cluster runs k3s in HA mode — all three nodes are control-plane members and etcd peers simultaneously. There's no dedicated "worker-only" node. This is a deliberate choice: with only 3 machines, dedicating one purely to workloads would leave just 2 etcd members, and a 2-member etcd cluster tolerates zero failures — splitting the roles would gain nothing.
[PC / kubectl]
│
▼
HAProxy :6443 (TCP passthrough)
│
├── master 192.168.55.10 (control-plane + etcd)
├── worker1 192.168.55.11 (control-plane + etcd)
└── worker2 192.168.55.12 (control-plane + etcd)
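The HAProxy layer in the diagram can be sketched as a minimal TCP-passthrough config (a sketch only — section names and the bind address are assumptions; TLS terminates at the API servers, not at the proxy):

```
frontend k3s-api
    bind *:6443
    mode tcp
    default_backend k3s-control-plane

backend k3s-control-plane
    mode tcp
    option tcp-check
    balance roundrobin
    server master  192.168.55.10:6443 check
    server worker1 192.168.55.11:6443 check
    server worker2 192.168.55.12:6443 check
```

TCP mode is what makes the setup work with kubectl's client certificates: HAProxy never decrypts the traffic, so the API server's own TLS cert (and its SAN list, covered below) is what the client sees.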
HA Control Plane¶
Why HA matters¶
HA has two distinct dimensions that are easy to mix up:
- Control-plane HA — the scheduler, controller-manager, and API server need to be up for the cluster to react to failures (evict pods from a dead node, scale deployments, etc.)
- Application HA — pods keep running even if the control plane goes down, but the cluster is "blind" — it can't reschedule, scale, or repair anything
In practice: if a node dies, k8s detects it after node-monitor-grace-period (~40s by default) and starts evicting pods. You need a working control plane for that eviction to happen.
etcd quorum¶
k3s uses embedded etcd for HA. etcd needs a quorum of floor(n/2)+1 members to accept writes, so a cluster of n members tolerates floor((n-1)/2) failures — with 3 nodes, exactly 1 node can fail. (Even member counts don't help: 4 nodes still tolerate only 1 failure.)
Tested behavior¶
With the master node powered off:
- After ~2-3 minutes, pods from the dead node were rescheduled on the remaining nodes
- API access was uninterrupted (HAProxy routed to surviving control-plane nodes)
- node-monitor-grace-period is the key timer; pods stuck in Terminating need a forced delete if the kubelet on the dead node can't confirm shutdown
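The forced delete mentioned above looks like this (pod name and namespace are placeholders):

```shell
# Skip graceful termination — the kubelet on the dead node
# can't confirm shutdown, so the API server removes the pod record directly
kubectl delete pod <pod-name> -n <namespace> --force --grace-period=0
```

Only do this when the node is genuinely gone; force-deleting a pod whose node is still running can leave the container running unsupervised.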
Initial Cluster Setup¶
Step 0 — Uninstall existing k3s¶
k3s HA must be bootstrapped from scratch. You can't "upgrade" a single-node install to HA.
On all control-plane nodes:
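The k3s installer drops an uninstall script on server nodes:

```shell
# Removes the k3s server install: the systemd unit, binaries, and most state
sudo /usr/local/bin/k3s-uninstall.sh
```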
On workers (if any):
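Agent-only nodes get a separate uninstall script:

```shell
# Removes the k3s agent install from a worker node
sudo /usr/local/bin/k3s-agent-uninstall.sh
```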
Clean up any leftover data:
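A cleanup sketch, in case the uninstall scripts left state behind (these are the standard k3s data and config directories):

```shell
# Wipe residual k3s state and config so the HA bootstrap starts clean
sudo rm -rf /var/lib/rancher/k3s /etc/rancher/k3s
```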
Step 1 — Bootstrap the first control-plane¶
On 192.168.55.10:
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="
server \
--cluster-init \
--tls-san 192.168.0.45 \
--tls-san cluster.kcn333.com
" sh -
- --cluster-init — creates a new HA cluster with embedded etcd
- --tls-san — adds IPs/domains to the API server's TLS certificate SAN. Without this, kubectl via HAProxy gets a TLS error. Add both the LB IP and your domain.
Step 2 — Grab the cluster token¶
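The token was generated on the first server node during bootstrap:

```shell
# Prints the cluster token — used as <TOKEN> when joining the other nodes
sudo cat /var/lib/rancher/k3s/server/node-token
```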
Step 3 — Join the second and third control-plane nodes¶
On 192.168.55.11 and 192.168.55.12:
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="
server \
--server https://192.168.55.10:6443 \
--token <TOKEN> \
--tls-san 192.168.0.45 \
--tls-san cluster.kcn333.com
" sh -
--server points to the first node — this means "join the existing etcd cluster" rather than creating a new one.
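After both joins complete, a quick sanity check from any node should show all three members as control-plane/etcd (output shape is illustrative):

```shell
# All three nodes should be Ready with roles control-plane,etcd,master
sudo k3s kubectl get nodes
```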
TLS SAN Management¶
What SAN is¶
The Subject Alternative Name is the list of IPs and domains the API server certificate is valid for. When kubectl connects through HAProxy, it checks that the IP it's talking to is listed in the cert's SAN. If it's not — TLS error.
Where the config lives¶
The flags live in the k3s systemd unit, /etc/systemd/system/k3s.service. Look for the ExecStart block:
ExecStart=/usr/local/bin/k3s \
server \
'--cluster-init' \
'--tls-san' \
'192.168.0.45' \
'--tls-san' \
'cluster.kcn333.com' \
Updating SAN¶
# 1. Snapshot etcd first (always!)
sudo k3s etcd-snapshot save --name pre-tls-san-change
# 2. Edit the service file
sudo vi /etc/systemd/system/k3s.service
# 3. Reload and restart
sudo systemctl daemon-reload
sudo systemctl restart k3s
# 4. Verify the new SAN is in the cert
echo | openssl s_client -connect <IP>:6443 2>/dev/null | \
openssl x509 -text -noout | grep -A10 "Subject Alternative Name"
k3s SAN behavior — quirk¶
k3s manages the API server cert via a dynamic listener stored as a Secret k3s-serving in kube-system. The SAN list comes from:
- --tls-san flags in k3s.service
- All node IPs in the cluster
- Node hostnames
- Standard k8s names (kubernetes, kubernetes.default, etc.)
- History from etcd — addresses from previous configurations
Removing a --tls-san and restarting isn't always enough — k3s can rebuild the cert from etcd history. In practice, if the old IP isn't accessible anyway, it's not a security risk.
SAN sync in HA¶
Changing SAN on one control-plane node is automatically synced via etcd to the other nodes. You don't need to manually edit k3s.service on every node for SAN changes.
Accessing the Cluster¶
kubeconfig¶
The kubeconfig lives at /etc/rancher/k3s/k3s.yaml on any control-plane node. It contains admin-level credentials — treat it like a root password.
# Copy to your local machine (PowerShell — Out-File is a PowerShell cmdlet)
ssh user@master "sudo cat /etc/rancher/k3s/k3s.yaml" | Out-File -Encoding ascii k3s.yaml
# Change the server address to HAProxy
# server: https://127.0.0.1:6443 → https://192.168.0.45:6443
Merging kubeconfigs¶
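One way to merge the downloaded file into an existing kubeconfig is to let kubectl flatten both via the KUBECONFIG path list (a sketch — paths and the context name are assumptions; k3s names its context default):

```shell
# Merge the downloaded k3s.yaml into the default kubeconfig
KUBECONFIG=~/.kube/config:./k3s.yaml kubectl config view --flatten > merged.yaml
mv merged.yaml ~/.kube/config

# Switch between clusters as needed
kubectl config get-contexts
kubectl config use-context default
```

Consider renaming the context first (kubectl config rename-context) — every k3s cluster calls itself default, which collides on the second merge.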
Useful Commands¶
# Cluster state
kubectl get nodes -o wide
kubectl get pods -A -o wide
# etcd operations
sudo k3s etcd-snapshot save --name <name>
sudo k3s etcd-snapshot ls
# Certificate rotation
sudo k3s certificate rotate
sudo systemctl daemon-reload
sudo systemctl restart k3s
# k3s service logs
sudo journalctl -u k3s -n 100 --no-pager