Engineering Blog

Published on January 7, 2025

Using Our Load Balancer to Set Up a Highly Available Kubernetes Control Plane

Set up a highly available Kubernetes control plane using cloudscale's load balancer. This guide walks you through provisioning with Terraform, configuring Kubernetes, and testing failover step by step.

One of the great things about working as an engineer at cloudscale is that we get to work on many different products, technologies, and projects. When we began developing the load balancer as a service product, a crucial requirement was that it had to be usable for creating highly available Kubernetes control planes. Throughout the development process, I was looking forward to bootstrapping my first cluster using this new product. And now that we have this blog, I want to share a few notes on how to create a highly available, stacked Kubernetes control plane on three cloudscale Ubuntu VMs using containerd.

Provisioning the Cloud Infrastructure

The Kubernetes Documentation instructs us to set up the following:

In a cloud environment you should place your control plane nodes behind a TCP forwarding load balancer. This load balancer distributes traffic to all healthy control plane nodes in its target list. The health check for an apiserver is a TCP check on the port the kube-apiserver listens on (default value :6443).

This means that we'll need:

  • A Load Balancer
  • A Load Balancer Listener using port 6443
  • A Load Balancer Pool with round-robin algorithm
  • A Load Balancer Pool Member for each VM
  • A Load Balancer Health Monitor checking Port 6443 on the VMs

Since the easiest way to set all this up and get all the VMs running is with Terraform, I have provided a Terraform file (see appendix) if you want a quick start. The Terraform file sets up the following:

Three VMs/nodes and the load balancer, connected via a private network. (The load balancer itself consists of multiple VMs.)

Ensure Terraform is installed on your machine. Then navigate to the directory containing the Terraform file and initialize it:

terraform init

Once initialized, export a read/write API token from a (preferably empty) project and create the infrastructure by running:

export CLOUDSCALE_API_TOKEN="..."
terraform apply

Terraform will display a preview of the resources it plans to create or update and prompt you for confirmation. Type yes to proceed.

Terraform will also output three variables at the end: kube_api_lb_ip, server_ips_private, and server_ips_public. We'll need this information later on. Ensure that you can SSH into all VMs as the ubuntu user via their public IP addresses.
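
If you need these outputs again later, terraform output will print them. A quick connectivity check could look like this (the address is a placeholder; use one of the entries from server_ips_public):

terraform output server_ips_public
ssh ubuntu@<public-ip-of-control-node-1> 'hostname'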

Installing kubeadm and containerd

Now, fasten your seatbelt. Here is a condensed summary of the following articles from the Kubernetes Documentation: Installing kubeadm, Container Runtimes, and Creating Highly Available Clusters with kubeadm. I take some shortcuts and leave out some things that are not relevant for non-production setups.

All commands must be run on all nodes.
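
The Ubuntu cloud images used here normally ship curl and gpg already; should they be missing on your nodes, the prerequisites from the Kubernetes Documentation can be installed first:

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg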

Configure Kubernetes’ apt repository. Replace 1.32 with the desired Kubernetes version.

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

Download the necessary packages.

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
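
A quick sanity check confirms that the tools are installed and that the packages are pinned:

kubeadm version
kubectl version --client
apt-mark showhold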

Enable IP forwarding.

echo "net.ipv4.ip_forward = 1" | sudo tee /etc/sysctl.d/k8s.conf
sudo sysctl --system
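
You can verify that the setting is active:

sysctl net.ipv4.ip_forward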

Install containerd and configure the systemd cgroup driver for runc.

sudo apt install -y containerd
sudo mkdir -p /etc/containerd
containerd config default | sed 's/SystemdCgroup = false/SystemdCgroup = true/' | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd
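
To double-check that the systemd cgroup driver really ended up in the generated config and that containerd is running:

grep SystemdCgroup /etc/containerd/config.toml
systemctl is-active containerd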

Initializing the Cluster and Installing a CNI Plugin

Next, we'll initialize the cluster from control-node-1 and install Cilium as a CNI (Container Network Interface) plugin. In the kubeadm init command, pass the IPv4 address of the load balancer as --control-plane-endpoint (shown as kube_api_lb_ip in terraform show) and the private IP address of node 1 as --apiserver-advertise-address.

sudo kubeadm init --control-plane-endpoint "KUBE_API_LB_IP:6443" --apiserver-advertise-address="10.11.12.21" --upload-certs

Next, set up your $HOME/.kube/config file as shown in the output of kubeadm init and keep the join commands in a safe place.
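
For reference, the kubeconfig setup printed by kubeadm init looks roughly like this:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config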

At this point, I usually check the nodes and pods in my cluster to see if everything looks good. It's perfectly normal that the node is NotReady and that the coredns pods are not yet running, but the other pods should be running.

kubectl get nodes -o wide
kubectl get pods -A -o wide

In our experience, Cilium is the most worry-free CNI plugin to install, so let's do that and admire some colorful ASCII art until it is ready.

wget https://github.com/cilium/cilium-cli/releases/download/v0.16.22/cilium-linux-amd64.tar.gz
tar xvf cilium-linux-amd64.tar.gz
./cilium install
./cilium status --wait
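
Once cilium status is happy, the agent pods should also show up in kube-system (the label selector below is the one Cilium uses for its agent pods, as far as I can tell):

kubectl get pods -n kube-system -l k8s-app=cilium -o wide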

Now the node should be ready and coredns pods should also come up within a short amount of time.

Joining the Other Nodes

Now join the other two nodes into the cluster using kubeadm. The join command was shown in the output of kubeadm init. Be sure to add --apiserver-advertise-address="10.11.12.2x" using the respective private IPs (.22 and .23, shown as server_ips_private in terraform show).

sudo kubeadm join <your-kube-api-lb-ip>:6443 --token <your-token> \
  --discovery-token-ca-cert-hash sha256:<your-ca-cert-hash> \
  --control-plane --certificate-key <your-certificate-key> \
  --apiserver-advertise-address="10.11.12.<your-server-private-ip>"

After a short while, all nodes listed in kubectl get nodes -o wide should be marked as Ready, and the coredns pods should be Running.

Seeing It in Action: Shutting Down a Node

Now, this blog post would, of course, not be complete without proving that we can take down a control node. I suggest that you copy the $HOME/.kube/config to your local machine and use kubectl from there.
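
One way to do that, assuming the paths below fit your setup:

scp ubuntu@<control-node-1-public-ip>:.kube/config ~/.kube/config-ha-demo
export KUBECONFIG=~/.kube/config-ha-demo
kubectl get nodes

Because --control-plane-endpoint points at the load balancer, the copied kubeconfig works from any machine that can reach the load balancer's IP.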

Let’s list all pods in the cluster. Pay attention to the coredns pods. They’re most likely scheduled on control-node-1.

kubectl get pods -A -o wide

Now drain control-node-1:

kubectl drain control-node-1 --ignore-daemonsets

After a few seconds, the coredns pods should have been successfully moved to the other control nodes.

kubectl get pods -A -o wide

It’s now safe to shut down the drained node (run this on control-node-1):

sudo init 0

If everything went well, kubectl still works like a charm from your local machine and the other two control nodes. Another fun thing to do is to navigate to the private network named backend in the cloudscale Control Panel and look at the Ports tab. There, you'll see the network ports of the load balancer and the VMs; control-node-1 should be shown as down. The same applies to the "monitor_status" property if you query the pool members using our API.
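
Querying the pool members could, for example, look like this (the pool UUID is visible in terraform show; see the cloudscale API documentation for the exact endpoint and response fields):

curl -s -H "Authorization: Bearer $CLOUDSCALE_API_TOKEN" \
  https://api.cloudscale.ch/v1/load-balancers/pools/<pool-uuid>/members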

After restarting the node, make it schedulable again.

kubectl uncordon control-node-1

And now you are ready to add worker nodes to the cluster, install our Cloud Controller Manager (CCM) to configure a load balancer for external traffic, or set up our Container Storage Interface (CSI) driver for persistent storage.

Lessons Learned and Final Words

In my initial test cluster, I made the mistake of not adding load balancer pool members for control-node-2 and control-node-3. As a result, when I shut down control-node-1, everything stopped working. So once again I was reminded: HA systems are worthless if failover testing is not done.

I hope this guide was interesting to you. If you find this type of content valuable, please send us an email because this could be the beginning of a small miniseries on running Kubernetes on cloudscale Infrastructure. As mentioned earlier, there’s much more to cover.

Have fun experimenting with Kubernetes on our infrastructure, but please read the complete documentation linked above before deploying production workloads!

Appendix: Terraform File

terraform {
  required_providers {
    cloudscale = {
      source  = "cloudscale-ch/cloudscale"
      version = "4.4.0"
    }
  }
}

provider "cloudscale" {
  # Add your provider configuration here if necessary
}

variable "control_node_count" {
  description = "Number of control nodes"
  type        = number
  default     = 3
}

variable "network_cidr" {
  description = "CIDR block for the backend network"
  type        = string
  default     = "10.11.12.0/24"
}

variable "zone_slug" {
  description = "Zone slug for the resources"
  type        = string
  default     = "lpg1"
}

variable "ssh_key_path" {
  description = "Path to the SSH public key file"
  type        = string
  default     = "~/.ssh/id_ed25519.pub" # Replace with your SSH key file path
}

# Create a network
resource "cloudscale_network" "backend" {
  name                    = "backend"
  zone_slug               = var.zone_slug
  auto_create_ipv4_subnet = false
}

# Create a subnet
resource "cloudscale_subnet" "backend-subnet" {
  cidr         = var.network_cidr
  network_uuid = cloudscale_network.backend.id
}

# Server Group for Control Nodes
resource "cloudscale_server_group" "control-plane-group" {
  name      = "control-plane-group"
  type      = "anti-affinity"
  zone_slug = var.zone_slug
}

# Control Nodes
resource "cloudscale_server" "control-nodes" {
  count            = var.control_node_count
  name             = "control-node-${count.index + 1}"
  flavor_slug      = "flex-8-4"
  image_slug       = "ubuntu-24.04"
  volume_size_gb   = 50
  ssh_keys         = [file(var.ssh_key_path)]
  server_group_ids = [cloudscale_server_group.control-plane-group.id]
  zone_slug        = var.zone_slug

  interfaces {
    type = "public"
  }

  interfaces {
    type = "private"
    addresses {
      subnet_uuid = cloudscale_subnet.backend-subnet.id
      address     = "10.11.12.${count.index + 21}"
    }
  }
}

# Kube-API Load Balancer
resource "cloudscale_load_balancer" "kube-api-lb" {
  name        = "kube-api-lb"
  flavor_slug = "lb-standard"
  zone_slug   = var.zone_slug
}


# Create a load balancer pool
resource "cloudscale_load_balancer_pool" "kube-api-pool" {
  name               = "kube-api-pool"
  algorithm          = "round_robin"
  protocol           = "tcp"
  load_balancer_uuid = cloudscale_load_balancer.kube-api-lb.id
}

# Create a load balancer listener
resource "cloudscale_load_balancer_listener" "kube-api-listener" {
  name          = "kube-api-listener"
  pool_uuid     = cloudscale_load_balancer_pool.kube-api-pool.id
  protocol      = "tcp"
  protocol_port = 6443
}


# Create a load balancer pool member
resource "cloudscale_load_balancer_pool_member" "kube-api-pool-member" {
  count         = var.control_node_count
  name          = "kube-api-${count.index}"
  pool_uuid     = cloudscale_load_balancer_pool.kube-api-pool.id
  protocol_port = 6443

  # Get the private IP address of the control node
  address = flatten([
    for iface in cloudscale_server.control-nodes[count.index].interfaces : [
      for addr in iface.addresses : addr.address
      if iface.type == "private"
    ]
  ])[0]
  subnet_uuid = cloudscale_subnet.backend-subnet.id
}

# Create a load balancer health monitor
resource "cloudscale_load_balancer_health_monitor" "lb1-health-monitor" {
  pool_uuid = cloudscale_load_balancer_pool.kube-api-pool.id
  type      = "tcp"
}


output "kube_api_lb_ip" {
  value       = cloudscale_load_balancer.kube-api-lb.vip_addresses[0].address
  description = "IPv4 Address of the Load Balancer"
}

output "server_ips_public" {
  value = [
    for node in cloudscale_server.control-nodes :
    flatten([
      for iface in node.interfaces : [
        for addr in iface.addresses : addr.address
        if iface.type == "public"
      ]
    ])[0]
  ]
  description = "The public IP addresses of the control nodes."
}

output "server_ips_private" {
  value = [
    for node in cloudscale_server.control-nodes :
    flatten([
      for iface in node.interfaces : [
        for addr in iface.addresses : addr.address
        if iface.type == "private"
      ]
    ])[0]
  ]
  description = "The private IP addresses of the control nodes."
}