Kubernetes Leader Election
Using lease locks to facilitate leader election.
Distributed systems are becoming more and more common: companies increasingly rely on them to serve billions of requests and to roll out upgrades without downtime.
Handling failures in such systems is crucial for high availability. A leader election process ensures that if the leader fails, one of the candidate replicas is elected as the new leader.
Inspiration
I had explored the Raft Consensus Algorithm a few months back, and when I started exploring Kubernetes, a thought came to my mind:
“Kubernetes, being a container orchestrator, should have leader election implemented in its backend.” 🤔
And this is where I found client-go. On further exploration, I discovered the leaderelection package.
What is Leader Election?
Leader election involves the selection of a leader among the healthy candidates eligible for the election. Once the election is won, the leader continually “heartbeats” to renew its position as the leader, while the other candidates periodically make new attempts to become the leader. Once the leader's health check fails, the candidates start a re-election to elect the next leader among themselves. This ensures that a new leader is elected quickly if the current leader fails for some reason.
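To make that loop concrete, here is a minimal sketch in Go. It is not the Kubernetes API we will use later; the Lock interface and its methods are hypothetical, and the sketch only illustrates the acquire/heartbeat/retry cycle described above:
package sketch

import "time"

// Lock is a hypothetical lock interface; in Kubernetes this role is played
// by a Lease object from the coordination API.
type Lock interface {
    TryAcquire(id string) bool // true if the caller just became the leader
    Renew(id string) bool      // true if the caller's leadership is still valid
}

// runElection loops forever: try to become the leader, and once elected,
// keep renewing ("heartbeating") until renewal fails, then fall back to
// being a candidate so a re-election can happen.
func runElection(lock Lock, id string, retryPeriod, renewPeriod time.Duration) {
    for {
        if !lock.TryAcquire(id) {
            time.Sleep(retryPeriod) // still a candidate; retry later
            continue
        }
        for lock.Renew(id) { // we are the leader: heartbeat
            time.Sleep(renewPeriod)
        }
        // Renewal failed: leadership is lost, go back to being a candidate.
    }
}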
Why Leader Election for Kubernetes?
When a Kubernetes controller is deployed as multiple instances, you want to prevent any unexpected behavior. Running a leader election process ensures that a leader is elected among the replicas and is the only one actively reconciling the cluster. The other instances remain inactive, but are ready to take over if the leader instance fails.
How Kubernetes Leader Election works
Leader election in Kubernetes is simple. It begins with the creation of a lock object, where the leader updates the current timestamp at regular intervals as a way of informing the other replicas of its current status, i.e., leadership.
This lock object holds details such as the renew deadline and the identity of the current leader. If the leader fails to update the timestamp within the given interval, it is assumed to have crashed. The candidate replicas then race to acquire leadership by updating the lock with their own identity. The pod that successfully acquires the lock becomes the new leader.
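The lock object is an ordinary Kubernetes API object: a Lease from the coordination.k8s.io API group. As a hedged illustration (the names and values below are made up and are not part of the demo that follows), this is roughly what the leader writes and keeps refreshing:
// lease_sketch.go - illustrates the fields a lease lock carries.
package main

import (
    "fmt"
    "time"

    coordinationv1 "k8s.io/api/coordination/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
    holder := "example-pod-a"  // identity of the current leader (hypothetical)
    leaseDuration := int32(15) // seconds candidates wait before taking over
    now := metav1.NewMicroTime(time.Now())

    // The leader keeps bumping RenewTime; candidates watch these fields to
    // decide whether the leader is still alive.
    lease := coordinationv1.Lease{
        ObjectMeta: metav1.ObjectMeta{Name: "example-lease", Namespace: "default"},
        Spec: coordinationv1.LeaseSpec{
            HolderIdentity:       &holder,
            LeaseDurationSeconds: &leaseDuration,
            AcquireTime:          &now,
            RenewTime:            &now,
        },
    }
    fmt.Printf("lease %s is held by %s\n", lease.Name, *lease.Spec.HolderIdentity)
}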
Here is a visualization:
Green pods are leaders, blue pods are candidates, and red pods are crashed pods.
Writing code
We’ll need to set up a local Kubernetes cluster first. I’ll be using kind instead of minikube, but feel free to choose your own tool.
Here is the entire code:
// main.go
package main

import (
    "context"
    "flag"
    "fmt"
    "log"
    "os"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    clientset "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/leaderelection"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
    "k8s.io/klog"
)

// client is the Kubernetes clientset used to interact with the cluster.
var client *clientset.Clientset

// createLease creates a new lease lock object.
func createLease(leaseName, podName, namespace string) *resourcelock.LeaseLock {
    fmt.Println("Creating lease using the following metadata:")
    fmt.Println("Lease Name: " + leaseName)
    fmt.Println("Pod Name: " + podName)
    fmt.Println("Namespace: " + namespace)

    return &resourcelock.LeaseLock{
        LeaseMeta: metav1.ObjectMeta{
            Name:      leaseName,
            Namespace: namespace,
        },
        Client: client.CoordinationV1(),
        LockConfig: resourcelock.ResourceLockConfig{
            Identity: podName,
        },
    }
}

// elect helps in electing a new leader by using the leaderelection API.
func elect(lock *resourcelock.LeaseLock, ctx context.Context, id string) {
    leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
        Lock:          lock,
        LeaseDuration: 15 * time.Second,
        RenewDeadline: 10 * time.Second,
        RetryPeriod:   2 * time.Second,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(c context.Context) {
                sampleTask()
            },
            OnStoppedLeading: func() {
                klog.Info("Evicted as leader: finding new leaders..")
            },
            OnNewLeader: func(identity string) {
                if identity == id {
                    klog.Info("I'm the new leader!")
                    return
                }
                klog.Info("New leader is: " + identity)
            },
        },
        ReleaseOnCancel: true,
    })
}

// sampleTask is run when a LeaderElector starts leading.
func sampleTask() {
    for {
        klog.Info("k8sensus is running sample task.")
        time.Sleep(10 * time.Second)
    }
}

func main() {
    var leaseName string
    var leaseNamespace string
    var podName = os.Getenv("POD_NAME")

    flag.StringVar(&leaseName, "lease-name", "", "Lease Name (Lock Name)")
    flag.StringVar(&leaseNamespace, "lease-namespace", "default", "Lease Namespace")
    flag.Parse()

    // validate lease name
    if leaseName == "" {
        log.Fatalln("Lease Name not found. Provide a valid lease name through --lease-name.")
    }

    // validate lease namespace
    if leaseNamespace == "" {
        log.Fatalln("Lease Namespace not found. Provide a valid lease namespace through --lease-namespace.")
    }

    fmt.Println("k8sensus is running!")

    // build the in-cluster config and fail early if it can't be loaded
    config, err := rest.InClusterConfig()
    if err != nil {
        log.Fatalln("Failed to get kube config:", err)
    }

    // dies if a client can't be created from the config
    client = clientset.NewForConfigOrDie(config)

    // create context
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    // create a lease lock
    lock := createLease(leaseName, podName, leaseNamespace)

    // run leader election
    elect(lock, ctx, podName)
}
Inside main, we first parse the required flags and create a new client using the clientset declared above. The createLease function is where the lease lock is created, which is then passed to elect, the function that runs the leader election.
Inside elect, we can see that RunOrDie is being called. RunOrDie internally creates a LeaderElector, which calls the Run method.
The Run method is responsible for acquiring the lock, renewing the lease, and running the OnStartedLeading callback (which we defined in elect).
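To see roughly what RunOrDie hides, you can also build the LeaderElector yourself. The sketch below is a rough equivalent of the elect function, not part of the demo code; it assumes the same file, imports, and sampleTask helper as above:
// electManually is a rough equivalent of elect, showing what RunOrDie wraps:
// it constructs a LeaderElector from the config and runs it until the context
// is cancelled.
func electManually(ctx context.Context, lock *resourcelock.LeaseLock, id string) error {
    elector, err := leaderelection.NewLeaderElector(leaderelection.LeaderElectionConfig{
        Lock:            lock,
        LeaseDuration:   15 * time.Second,
        RenewDeadline:   10 * time.Second,
        RetryPeriod:     2 * time.Second,
        ReleaseOnCancel: true,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(ctx context.Context) { sampleTask() },
            OnStoppedLeading: func() { klog.Info("Evicted as leader: finding new leaders..") },
            OnNewLeader: func(identity string) {
                if identity != id {
                    klog.Info("New leader is: " + identity)
                }
            },
        },
    })
    if err != nil {
        return err // the config was invalid (e.g. missing lock or callbacks)
    }
    // Run blocks: it acquires the lock, keeps renewing the lease, and invokes
    // the callbacks; it returns once the context is cancelled or leadership is lost.
    elector.Run(ctx)
    return nil
}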
Before running the whole thing, we can create a Dockerfile so we can build the image and push it to Docker Hub:
# Dockerfile
FROM golang:1.16-alpine as builder
WORKDIR /app
COPY go.mod go.mod
COPY go.sum go.sum
RUN go mod download
COPY main.go main.go
RUN CGO_ENABLED=0 GOOS=linux go build -o k8sensus main.go
FROM alpine
RUN apk --no-cache add ca-certificates
COPY --from=builder /app/k8sensus .
ENTRYPOINT ["./k8sensus"]
After pushing the image to Docker Hub, we need to create a deployment file for deploying the leader election process to our local Kubernetes cluster.
But here's a catch:
The default ServiceAccount does not have access to the coordination API.
So, we'll need to create a new ServiceAccount and set up RBAC accordingly:
# rbac.yaml
# The default ServiceAccount doesn't have access to the coordination API,
# so a new ServiceAccount, Role, and RoleBinding are created in this definition.
apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
  name: leaderelection-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: leaderelection-role
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: leaderelection-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: leaderelection-role
subjects:
  - kind: ServiceAccount
    name: leaderelection-sa
And then use the ServiceAccount in the deployment:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: k8sensus
  name: k8sensus
spec:
  replicas: 3
  selector:
    matchLabels:
      app: k8sensus
  template:
    metadata:
      labels:
        app: k8sensus
    spec:
      automountServiceAccountToken: true
      serviceAccountName: leaderelection-sa
      containers:
        - image: docker.io/burntcarrot/k8sensus
          name: k8sensus
          args:
            - --lease-name=k8sensus-lease
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
Running the leader election process
First things first, apply both of the definitions:
kubectl apply -f rbac.yaml
kubectl apply -f deployment.yaml
After applying the definitions, check the pods:
❯ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
k8sensus-67798d9cf6-gqfjf   1/1     Running   0          2m2s
k8sensus-67798d9cf6-qwxj6   1/1     Running   0          2m2s
k8sensus-67798d9cf6-vtlx6   1/1     Running   0          15s
To simulate a leader failure, let's delete the leader pod:
❯ kubectl delete pod k8sensus-67798d9cf6-n4sgl
At this point, the candidate pods should be trying to acquire the lease lock and promote themselves as the new leader.
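A quick note on how fast the hand-off happens: when a pod is deleted, Kubernetes sends it a SIGTERM, and since our demo only cancels its context through defer, the dying leader usually doesn't get a chance to release the lease; the candidates simply wait for the lease to expire (up to LeaseDuration, 15 seconds here) before one of them takes over. If you want a faster hand-off, one option (not part of the demo code; a hedged sketch that additionally needs the os/signal and syscall imports) is to derive the context from termination signals, so that ReleaseOnCancel can release the lease during shutdown:
// signalContext returns a context that is cancelled when the process receives
// SIGTERM (sent on pod deletion) or an interrupt. Using it in main in place of
// context.WithCancel lets ReleaseOnCancel release the lease on shutdown instead
// of waiting for it to expire.
func signalContext() (context.Context, context.CancelFunc) {
    return signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
}
In main, you would then replace context.WithCancel(context.Background()) with signalContext() and keep the deferred cancel as-is.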
Now, we can check the logs of all pods:
❯ kubectl logs k8sensus-67798d9cf6-vtlx6
k8sensus is running!
Creating lease using the following metadata:
Lease Name: k8sensus-lease
Pod Name: k8sensus-67798d9cf6-vtlx6
Namespace: default
I1103 10:34:56.358804 1 leaderelection.go:248] attempting to acquire leader lease default/k8sensus-lease...
I1103 10:34:56.374900 1 main.go:60] New leader is: k8sensus-67798d9cf6-n4sgl
I1103 10:35:07.734547 1 main.go:60] New leader is: k8sensus-67798d9cf6-qwxj6
It’s working! A pod has been elected as the new leader. Let’s see the logs of the new leader pod:
❯ kubectl logs k8sensus-67798d9cf6-qwxj6
k8sensus is running!
Creating lease using the following metadata:
Lease Name: k8sensus-lease
Pod Name: k8sensus-67798d9cf6-qwxj6
Namespace: default
I1103 10:33:37.924516 1 leaderelection.go:248] attempting to acquire leader lease default/k8sensus-lease...
I1103 10:33:37.938187 1 main.go:60] New leader is: k8sensus-67798d9cf6-n4sgl
I1103 10:35:07.562421 1 leaderelection.go:258] successfully acquired lease default/k8sensus-lease
I1103 10:35:07.562627 1 main.go:57] I'm the new leader!
I1103 10:35:07.562845 1 main.go:70] k8sensus is running sample task.
I1103 10:35:17.563958 1 main.go:70] k8sensus is running sample task.
I1103 10:35:27.566011 1 main.go:70] k8sensus is running sample task.
As expected, the leader pod runs our sample task successfully!
Let’s inspect the lease lock object too:
❯ kubectl describe lease k8sensus-lease
Name:         k8sensus-lease
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  coordination.k8s.io/v1
Kind:         Lease
Metadata:
  Creation Timestamp:  2021-11-03T10:33:30Z
  Managed Fields:
    API Version:  coordination.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:acquireTime:
        f:holderIdentity:
        f:leaseDurationSeconds:
        f:leaseTransitions:
        f:renewTime:
    Manager:         k8sensus
    Operation:       Update
    Time:            2021-11-03T10:33:30Z
  Resource Version:  15947
  UID:               7b1aac4c-baa8-4c3d-a6c8-18deff2f0c2f
Spec:
  Acquire Time:            2021-11-03T10:35:07.497077Z
  Holder Identity:         k8sensus-67798d9cf6-qwxj6
  Lease Duration Seconds:  15
  Lease Transitions:       1
  Renew Time:              2021-11-03T10:39:26.731022Z
Events:                    <none>
Upon inspecting the Holder Identity field under Spec, we can see that the leader's pod name matches the holder identity, which proves that the lease has been acquired by the new leader.
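You can also do this check programmatically with client-go. Here is a standalone, hedged sketch (separate from the demo code; it runs outside the cluster and loads your local kubeconfig, and it uses the lease name and namespace from this post):
// leasecheck.go - prints the current holder of the lease lock.
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Load the local kubeconfig, since this runs outside the cluster.
    kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
    config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
    if err != nil {
        log.Fatalln(err)
    }

    client, err := kubernetes.NewForConfig(config)
    if err != nil {
        log.Fatalln(err)
    }

    // Fetch the lease and print its holder identity.
    lease, err := client.CoordinationV1().Leases("default").Get(context.Background(), "k8sensus-lease", metav1.GetOptions{})
    if err != nil {
        log.Fatalln(err)
    }
    if lease.Spec.HolderIdentity != nil {
        fmt.Println("Current leader:", *lease.Spec.HolderIdentity)
    }
}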
If you want to clean up the deployment created during this post, use:
❯ kubectl delete deployment k8sensus
deployment.apps "k8sensus" deleted
This was fun! I hope you learnt more about Kubernetes and leader election from this post; it's a brief overview of my exploration of the Go client for Kubernetes.
I used Carlos Becker’s post as a reference while I was writing this post. Thank you Carlos!