API load balancers

Keepalived and HAProxy configuration and validation.

Scope: This VIP is for the Kubernetes API only. Ingress traffic is handled by Ingress-NGINX exposed through MetalLB.

Kubernetes load balancer setup (dev env)

This setup uses VRRP (Virtual Router Redundancy Protocol) to move a virtual IP between the load balancers.
This section describes two load balancers, a master and a backup. If you are low on resources in the dev environment, only the master needs to run.
Install haproxy and keepalived. Note that after installation keepalived will be in the loaded state, not the running state.
```
sudo apt install haproxy keepalived -y
```
The router_id is local only and does not affect the VRRP protocol. It is mostly used to identify messages and should be unique per server. I use the hostname.
The virtual IP (VIP) for my cluster is 192.168.1.200.
On all load balancers, set the state to BACKUP and let priority decide which node becomes master.
If you have multiple k8s clusters in your environment, ensure that the virtual_router_id is unique for each cluster, as it is used for election. It must be between 1 and 255. I use the last octet of the VIP.
The VRRP interface depends on your environment. In my dev virtual environment it is enp0s5.
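
As a small illustration of the "last octet of the VIP" convention, the sketch below derives a virtual_router_id from a VIP and rejects values outside the 1-255 range that VRRP allows. The function name `vrid_from_vip` is a hypothetical helper for this page, not part of keepalived.

```shell
#!/bin/sh
# vrid_from_vip: hypothetical helper, not part of keepalived.
# Prints the last octet of an IPv4 address (with or without /prefix).
vrid_from_vip() {
  addr="${1%%/*}"        # drop an optional /prefix such as /24
  octet="${addr##*.}"    # keep the text after the last dot
  # VRRP only allows IDs 1-255, so reject anything outside that range
  if [ "$octet" -ge 1 ] 2>/dev/null && [ "$octet" -le 255 ]; then
    echo "$octet"
  else
    echo "invalid VRID derived from $1" >&2
    return 1
  fi
}

vrid_from_vip "192.168.1.200/24"   # prints 200
```

A VIP ending in .0 would be rejected, so a different numbering scheme would be needed in that case.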

Master load balancer: lb-1

Edit keepalived configuration: sudo vi /etc/keepalived/keepalived.conf

```
global_defs {
    router_id lb-1
    script_user root
    enable_script_security
}

vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    fall 5
    rise 2
    weight -150
}

vrrp_instance VI_1 {
    state BACKUP
    interface enp0s5
    virtual_router_id 200
    priority 254
    advert_int 1
    preempt_delay 10

    virtual_ipaddress {
        192.168.1.200/24
    }

    track_script {
        check_apiserver
    }

    garp_master_repeat 5
    garp_master_refresh 10

    notify_master "/etc/keepalived/status_capture.sh MASTER"
    notify_backup "/etc/keepalived/status_capture.sh BACKUP"
    notify_fault  "/etc/keepalived/status_capture.sh FAULT"
}
```
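
To make the health-check timing concrete, here is a rough calculation using the interval, fall and rise values from the vrrp_script block above. It is only an approximation: it ignores how long each script run takes, and the preempt_delay of 10 seconds adds further time before lb-1 takes the VIP back.

```shell
#!/bin/sh
# Values copied from the vrrp_script block above
interval=3   # seconds between check runs
fall=5       # consecutive failures before the check is marked failed
rise=2       # consecutive successes before it is marked healthy again

fail_after=$((interval * fall))
recover_after=$((interval * rise))
echo "marked failed after ~${fail_after}s, healthy again after ~${recover_after}s"
```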

Backup load balancer: dev-lb-v2

Edit keepalived configuration: sudo vi /etc/keepalived/keepalived.conf

```
global_defs {
    router_id dev-lb-v2
    script_user root
    enable_script_security
}

vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    fall 5
    rise 2
    weight -150
}

vrrp_instance VI_1 {
    state BACKUP
    interface enp0s5
    virtual_router_id 200
    priority 200
    advert_int 1

    virtual_ipaddress {
        192.168.1.200/24
    }

    track_script {
        check_apiserver
    }

    garp_master_repeat 5
    garp_master_refresh 10

    notify_master "/etc/keepalived/status_capture.sh MASTER"
    notify_backup "/etc/keepalived/status_capture.sh BACKUP"
    notify_fault  "/etc/keepalived/status_capture.sh FAULT"
}
```
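
The two priorities and the negative weight work together: while check_apiserver is failing on a node, keepalived lowers that node's priority by 150. The sketch below walks through the arithmetic for the values in the two configurations above. `effective_priority` is a hypothetical helper written for this page, and the model is simplified (it does not cover keepalived clamping the result to its valid priority range).

```shell
#!/bin/sh
# effective_priority: hypothetical helper that mirrors how a negative
# track_script weight lowers the priority while the check is failing.
effective_priority() {
  base="$1"; weight="$2"; check_ok="$3"
  if [ "$check_ok" = "yes" ]; then
    echo "$base"
  else
    echo $((base + weight))
  fi
}

lb1=$(effective_priority 254 -150 no)    # lb-1 with a failing check
lb2=$(effective_priority 200 -150 yes)   # dev-lb-v2 with a passing check
if [ "$lb2" -gt "$lb1" ]; then
  echo "dev-lb-v2 ($lb2) outranks lb-1 ($lb1) and takes the VIP"
fi
```

With these numbers a failing lb-1 drops to 104, below the backup's 200, so the backup wins the election.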

Create scripts on all load balancers

Check API server script - fail if the local HAProxy listener isn't returning HTTP 200: sudo vi /etc/keepalived/check_apiserver.sh

```
#!/bin/sh
set -eu

VIP="192.168.1.200/24"
IFACE="enp0s5"
URL_LOCAL="https://localhost:8443/healthz"
URL_VIP="https://192.168.1.200:8443/healthz"

try_curl() {
  curl --silent --show-error --fail --max-time 4 --insecure "$1" -o /dev/null
}

retry() {
  url="$1"; n=0
  until try_curl "$url"; do
    n=$((n+1)); [ "$n" -ge 3 ] && return 1
    sleep 1
  done
}

# Optional: only run curls if 8443 is actually listening
ss -Htl "( sport = :8443 )" | grep -q . || { echo "*** 8443 not listening" >&2; exit 1; }

retry "$URL_LOCAL" || { echo "*** HAProxy local listener unhealthy: $URL_LOCAL" >&2; exit 1; }

if ip -4 addr show dev "$IFACE" | grep -q " $VIP\\b"; then
  retry "$URL_VIP" || { echo "*** HAProxy VIP path unhealthy: $URL_VIP" >&2; exit 1; }
fi
```

Keepalived status script - records each state change in a status file and in syslog: sudo vi /etc/keepalived/status_capture.sh

```
#!/usr/bin/env bash
set -euo pipefail

ROLE="${1:-UNKNOWN}"
MSG="$(date '+%Y-%m-%d %H:%M:%S'): The load balancer instance on $(hostname) is currently marked ${ROLE}"

# Write a small status file
STATUS_FILE="/var/run/load-balancer-status"
umask 022
echo "$MSG" > "$STATUS_FILE"

# Also send to syslog
logger -t keepalived-notify -- "$MSG"

# Optional: echo for journal
echo "$MSG"
```
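
Because the status line always ends with the role, the current state of a node can be read back from the status file. The sketch below shows one way to do that. `current_role` is a hypothetical helper, and the sample line (including its timestamp) is made up for the demonstration.

```shell
#!/bin/sh
# current_role: hypothetical helper that prints the last word of the
# last line in the status file, which status_capture.sh sets to the role.
current_role() {
  awk '{ role = $NF } END { print role }' "$1"
}

# Demonstrate against a made-up sample line in a temporary file
sample=$(mktemp)
echo "2025-10-04 00:55:50: The load balancer instance on lb-1 is currently marked MASTER" > "$sample"
current_role "$sample"    # prints MASTER
rm -f "$sample"
```

On a load balancer the same function would be called as `current_role /var/run/load-balancer-status`.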

HAProxy configuration, including the API server health check: sudo vi /etc/haproxy/haproxy.cfg

```
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Optional / irrelevant for TCP pass-through, but harmless to leave
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
    log     global
    mode    tcp
    option  tcplog
    option  dontlognull
    timeout connect 10s
    timeout client  30s
    timeout server  30s
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend apiserver
    bind *:8443
    default_backend apiserver

backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    balance roundrobin
    # Do a TLS handshake for the health check, skip cert verify (self-signed)
    server master-1 192.168.1.203:6443 check check-ssl verify none
```
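
The backend above lists a single control plane node. If more control plane nodes are added later, each needs its own server line with the same check options. As a sketch, the snippet below prints those lines from a list of addresses. `server_lines` is a hypothetical helper, and the .204 and .205 addresses are made-up examples, not part of this environment.

```shell
#!/bin/sh
# server_lines: hypothetical helper that prints one HAProxy server line
# per control plane address, numbered master-1, master-2, and so on.
server_lines() {
  n=0
  for ip in "$@"; do
    n=$((n + 1))
    printf '    server master-%d %s:6443 check check-ssl verify none\n' "$n" "$ip"
  done
}

# 192.168.1.203 is from the config above; .204 and .205 are made-up examples
server_lines 192.168.1.203 192.168.1.204 192.168.1.205
```

The output can be pasted into the backend apiserver section in place of the single server line.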

Ensure both scripts can execute

```
sudo chmod u+x /etc/keepalived/check_apiserver.sh
sudo chmod u+x /etc/keepalived/status_capture.sh
```

Ensure HAProxy can always start with floating VIP

```
echo 'net.ipv4.ip_nonlocal_bind=1' | sudo tee -a /etc/sysctl.conf
```

Apply the setting now (the entry in /etc/sysctl.conf makes it persist across reboots): sudo sysctl -p
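
Note that `tee -a` appends a new line every time it is run, so repeating the step leaves duplicate entries in /etc/sysctl.conf. A minimal sketch of an idempotent alternative follows. `ensure_line` is a hypothetical helper, demonstrated here on a temporary file rather than the real configuration.

```shell
#!/bin/sh
# ensure_line: hypothetical helper that appends a line to a file only if
# an identical line is not already present.
ensure_line() {
  line="$1"; file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a temporary file; running twice still leaves one entry
demo=$(mktemp)
ensure_line 'net.ipv4.ip_nonlocal_bind=1' "$demo"
ensure_line 'net.ipv4.ip_nonlocal_bind=1' "$demo"
grep -c 'ip_nonlocal_bind' "$demo"    # prints 1
rm -f "$demo"
```

On a load balancer the function would need to run as root against /etc/sysctl.conf. After `sudo sysctl -p`, the active value can be checked with `sysctl -n net.ipv4.ip_nonlocal_bind`.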

Confirm keepalived and haproxy are running on all load balancers

Start and status checks

```
sudo service keepalived start
sudo service haproxy start
sudo service keepalived status
sudo service haproxy status
```

Reboot the load balancers and check logs: journalctl -u haproxy -n 20 --no-pager

```
Oct 03 14:33:09 lb-1 haproxy[1689]: [NOTICE]   (1689) : haproxy version is 2.8.5-1ubuntu3.3
Oct 03 14:33:09 lb-1 haproxy[1689]: [NOTICE]   (1689) : path to executable is /usr/sbin/haproxy
Oct 03 14:33:09 lb-1 haproxy[1689]: [WARNING]  (1689) : Former worker (5195) exited with code 0 (Exit)
Oct 04 00:55:33 lb-1 systemd[1]: Stopping haproxy.service - HAProxy Load Balancer...
Oct 04 00:55:33 lb-1 haproxy[1689]: [WARNING]  (1689) : Exiting Master process...
Oct 04 00:55:33 lb-1 haproxy[1689]: [ALERT]    (1689) : Current worker (6064) exited with code 143 (Terminated)
Oct 04 00:55:33 lb-1 haproxy[1689]: [WARNING]  (1689) : All workers exited. Exiting... (0)
Oct 04 00:55:33 lb-1 systemd[1]: haproxy.service: Deactivated successfully.
Oct 04 00:55:33 lb-1 systemd[1]: Stopped haproxy.service - HAProxy Load Balancer.
Oct 04 00:55:33 lb-1 systemd[1]: haproxy.service: Consumed 31.879s CPU time, 39.0M memory peak, 0B memory swap peak.
-- Boot a2259a0db8b043a3b0a4b465ec31df29 --
Oct 04 00:55:43 lb-1 systemd[1]: Starting haproxy.service - HAProxy Load Balancer...
Oct 04 00:55:43 lb-1 haproxy[770]: [NOTICE]   (770) : New worker (846) forked
Oct 04 00:55:43 lb-1 systemd[1]: Started haproxy.service - HAProxy Load Balancer.
Oct 04 00:55:43 lb-1 haproxy[770]: [NOTICE]   (770) : Loading success.
Oct 04 00:55:43 lb-1 haproxy[846]: [WARNING]  (846) : Server apiserver/master-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Oct 04 00:55:43 lb-1 haproxy[846]: [ALERT]    (846) : backend 'apiserver' has no server available!
Oct 04 00:55:43 lb-1 haproxy[846]: Server apiserver/master-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Oct 04 00:55:43 lb-1 haproxy[846]: Server apiserver/master-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Oct 04 00:55:43 lb-1 haproxy[846]: backend apiserver has no server available!
Oct 04 00:55:43 lb-1 haproxy[846]: backend apiserver has no server available!
```

At this stage, before the cluster is deployed, look for log entries that indicate misconfiguration. Possible errors to resolve include:
```
sendmsg()/writev() failed in logger #1: Permission denied (errno=13)
```

Warning: Before the cluster is deployed, the backend will show no server available and that is expected.
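
To confirm which load balancer currently holds the VIP, the output of `ip -4 addr show` can be checked for the address, in the same way the check script does. The sketch below is one way to do that. `has_vip` is a hypothetical helper, and the sample output (including the 192.168.1.10 host address) is made up for the demonstration.

```shell
#!/bin/sh
# has_vip: hypothetical helper that reads `ip -4 addr show` output on
# stdin and succeeds if the given address/prefix is assigned.
has_vip() {
  grep -qF "inet $1 "
}

# Made-up sample output; on a real host pipe in the live output instead:
#   ip -4 addr show dev enp0s5 | has_vip 192.168.1.200/24
sample='    inet 192.168.1.10/24 brd 192.168.1.255 scope global enp0s5
    inet 192.168.1.200/24 scope global secondary enp0s5'
if printf '%s\n' "$sample" | has_vip "192.168.1.200/24"; then
  echo "this node holds the VIP"
else
  echo "this node does not hold the VIP"
fi
```

Running the check on both load balancers before and after stopping haproxy on lb-1 is a simple way to see whether the VIP moves to dev-lb-v2.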
