Leader election using etcd

In a distributed system, leader election is often critical for coordinating work. In this post, I will quickly show how you can use etcd to add leader election to your distributed application. etcd is a consistent key-value store that uses the Raft protocol for consensus and is used by well-known projects such as Kubernetes.

First, we need a C++ etcd v3 client. We will use etcd-cpp-apiv3 (GitHub: etcd-cpp-apiv3/etcd-cpp-apiv3). Follow the instructions in its README to build it.

Approach

We will use the add and watch APIs provided by etcd.

Step 1: Each app instance takes a lease with a keep-alive time and, using the lease ID, tries to add a key-value pair where the key is “election key for our distributed application” and the value is “unique ID of the app instance”. An add succeeds only if the key does not exist yet, and writes in etcd are guaranteed to be atomic. Thus, the first instance that manages to write the key with its lease becomes the leader. Any other instance that tried to add the key at the same time will see that the key already exists and will take the existing value of the key as the leader ID.
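To make Step 1 concrete, here is a minimal sketch of the "create the key only if it is absent" idea. It does not use the etcd client at all: `FakeKv`, `AddResult` and `Campaign` are hypothetical names, and a mutex-protected in-memory map stands in for etcd's atomic write. With the real client, the add request carrying the lease ID would play the role of `FakeKv::Add`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// In-memory stand-in for the key-value store (an assumption, NOT the etcd client API).
// Add() models a create-only write: it succeeds only if the key is absent.
class FakeKv {
public:
    struct AddResult {
        bool created;       // true if this call created the key
        std::string value;  // value stored under the key after the call
    };

    AddResult Add(const std::string& key, const std::string& value, std::int64_t leaseId) {
        std::lock_guard<std::mutex> lock(mutex_);  // check and insert happen as one step
        auto inserted = entries_.emplace(key, Entry{value, leaseId});
        return AddResult{inserted.second, inserted.first->second.value};
    }

private:
    struct Entry {
        std::string value;
        std::int64_t leaseId;  // lease the key is attached to
    };

    std::mutex mutex_;
    std::map<std::string, Entry> entries_;
};

// Step 1: try to create the election key. Whoever succeeds is the leader;
// everyone else reads the existing value as the leader ID.
inline std::string Campaign(FakeKv& kv, const std::string& electionKey,
                            const std::string& instanceId, std::int64_t leaseId) {
    assert(!instanceId.empty());
    return kv.Add(electionKey, instanceId, leaseId).value;
}
```

Every instance calls `Campaign` with the same election key. The instance whose own ID comes back is the leader, and all others learn the leader ID from the return value.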

Now, if the leader app instance dies for any reason, its lease will expire and etcd will delete the election key. At that moment we want to start a new election. This is where we will use the watch API from etcd.
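The link between a lease and the election key can be illustrated with a toy model. This is only a sketch under assumptions: `LeasedStore` and its methods are hypothetical, time is a logical clock in seconds passed in by the caller, and in reality etcd does this bookkeeping on the server side.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy model of lease-bound keys (hypothetical names, not the etcd API).
class LeasedStore {
public:
    // Grants a lease that expires ttl seconds after `now`.
    std::int64_t Grant(std::int64_t now, std::int64_t ttl) {
        assert(ttl > 0);
        std::int64_t id = nextLeaseId_++;
        leases_[id] = Lease{ttl, now + ttl};
        return id;
    }

    // Keep-alive: pushes the deadline out by a full TTL. Fails if the lease already expired.
    bool KeepAlive(std::int64_t leaseId, std::int64_t now) {
        Expire(now);
        auto it = leases_.find(leaseId);
        if (it == leases_.end()) return false;
        it->second.deadline = now + it->second.ttl;
        return true;
    }

    // Create-only write attached to a live lease.
    bool Add(const std::string& key, const std::string& value,
             std::int64_t leaseId, std::int64_t now) {
        Expire(now);
        if (leases_.count(leaseId) == 0 || keys_.count(key) != 0) return false;
        keys_[key] = Entry{value, leaseId};
        return true;
    }

    bool Exists(const std::string& key, std::int64_t now) {
        Expire(now);
        return keys_.count(key) != 0;
    }

private:
    struct Lease { std::int64_t ttl; std::int64_t deadline; };
    struct Entry { std::string value; std::int64_t leaseId; };

    // Drops expired leases together with every key attached to them.
    void Expire(std::int64_t now) {
        for (auto lease = leases_.begin(); lease != leases_.end();) {
            if (lease->second.deadline > now) { ++lease; continue; }
            for (auto key = keys_.begin(); key != keys_.end();) {
                if (key->second.leaseId == lease->first) key = keys_.erase(key);
                else ++key;
            }
            lease = leases_.erase(lease);
        }
    }

    std::int64_t nextLeaseId_ = 1;
    std::map<std::int64_t, Lease> leases_;
    std::map<std::string, Entry> keys_;
};
```

As long as the leader keeps calling `KeepAlive`, the key stays. Once the calls stop, the key disappears when the deadline passes, which is the event the other instances are waiting for.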

Step 2: Each app instance creates a watcher on the election key and reacts to a “delete” action on the key by triggering the election again. There is one more case to handle: the leader instance may be slow, fail to send the keep-alive update in time, and lose its lease even though it has not crashed. We want the old leader to learn the result of the new election too. Thus, the app also reacts to a “set” event on the election key to read the newly elected leader ID.
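The reaction to the two events can be sketched as a small pure function. The types here (`Action`, `ElectionEvent`, `InstanceState`) are hypothetical and only mirror the “delete” and “set” actions described above; a real watcher callback would receive the client library's response object instead.

```cpp
#include <cassert>
#include <string>

enum class Action { Set, Delete };  // hypothetical mirror of the "set" and "delete" actions

struct ElectionEvent {
    Action action;
    std::string value;  // new value of the election key; empty for Delete
};

struct InstanceState {
    std::string instanceId;
    std::string leaderId;  // empty while no leader is known

    bool IsLeader() const { return !leaderId.empty() && leaderId == instanceId; }
};

// Applies one watch event to the local state.
// Returns true when the caller should run the election (Step 1) again.
inline bool HandleElectionEvent(InstanceState& state, const ElectionEvent& event) {
    assert(!state.instanceId.empty());
    switch (event.action) {
    case Action::Delete:
        state.leaderId.clear();  // the old leader's lease is gone
        return true;             // contest the election again
    case Action::Set:
        state.leaderId = event.value;  // someone (possibly us) won
        return false;
    }
    return false;
}
```

Note how a slow former leader is covered: after the delete it forgets that it was the leader, and the following set event tells it who won the new election.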

Note: The following section assumes that etcd is already up & running.

Code

Let’s say this is our app, where the instance ID is passed to the constructor:

Our aim is to implement the StartElection and WatchForLeaderChange member functions. To do so, we need an etcd client, a keepalive instance that takes care of refreshing the lease based on our keep-alive time, and a watcher instance. As per the etcd client documentation, the watcher instance takes a callback. In our case, we will use the WatchForLeaderChangeCallback member function for that purpose. Let’s fill in these functions in the above class now.
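For readers who want to see how the three member functions fit together without an etcd server, here is a self-contained sketch. It is not the etcd-cpp-apiv3 API: `FakeElectionKey` is a hypothetical in-memory stand-in for the client, keepalive and watcher combined, and the callback signature (action and value as strings) is an assumption. Only the names MyApp, StartElection, WatchForLeaderChange and WatchForLeaderChangeCallback come from this post.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for the election key in etcd and its watchers.
class FakeElectionKey {
public:
    using Callback = std::function<void(const std::string& action, const std::string& value)>;

    // Create-only write: stores value if the key is absent; returns the stored value.
    std::string Add(const std::string& value) {
        if (!present_) {
            present_ = true;
            value_ = value;
            Publish("set", value_);
        }
        return value_;
    }

    // Models lease expiry: the key disappears and watchers see a "delete".
    void Expire() {
        if (!present_) return;
        present_ = false;
        value_.clear();
        Publish("delete", "");
    }

    std::size_t Watch(Callback cb) {
        watchers_.push_back(std::move(cb));
        return watchers_.size() - 1;
    }

    void Unwatch(std::size_t token) { watchers_[token] = nullptr; }

private:
    // Queues the event, then delivers queued events in order (no nested delivery).
    void Publish(const std::string& action, const std::string& value) {
        pending_.emplace_back(action, value);
        if (draining_) return;
        draining_ = true;
        while (!pending_.empty()) {
            auto event = pending_.front();
            pending_.pop_front();
            for (std::size_t i = 0; i < watchers_.size(); ++i) {
                if (watchers_[i]) watchers_[i](event.first, event.second);
            }
        }
        draining_ = false;
    }

    bool present_ = false;
    bool draining_ = false;
    std::string value_;
    std::vector<Callback> watchers_;
    std::deque<std::pair<std::string, std::string>> pending_;
};

class MyApp {
public:
    MyApp(FakeElectionKey& key, std::string instanceId)
        : key_(key), instanceId_(std::move(instanceId)) {}
    ~MyApp() { if (watching_) key_.Unwatch(token_); }  // a stopped instance gets no events
    MyApp(const MyApp&) = delete;
    MyApp& operator=(const MyApp&) = delete;

    // Step 1: try to write our ID; whatever ends up stored is the leader.
    void StartElection() { leaderId_ = key_.Add(instanceId_); }

    // Step 2: subscribe to changes of the election key.
    void WatchForLeaderChange() {
        assert(!watching_);
        token_ = key_.Watch([this](const std::string& action, const std::string& value) {
            WatchForLeaderChangeCallback(action, value);
        });
        watching_ = true;
    }

    bool IsLeader() const { return leaderId_ == instanceId_; }
    const std::string& LeaderId() const { return leaderId_; }

private:
    void WatchForLeaderChangeCallback(const std::string& action, const std::string& value) {
        if (action == "delete") {
            StartElection();    // the leader's lease expired: contest again
        } else if (action == "set") {
            leaderId_ = value;  // read the newly elected leader ID
        }
    }

    FakeElectionKey& key_;
    std::string instanceId_;
    std::string leaderId_;
    std::size_t token_ = 0;
    bool watching_ = false;
};
```

In this model, destroying a `MyApp` object plays the role of an instance crashing, and calling `Expire()` on the key plays the role of etcd deleting the key once the lease runs out. The remaining instances then re-run the election from inside their callbacks.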

Let’s look at the final code with main function:

Let’s compile and run:

# compile
g++ -std=c++14 -I /usr/local/include /usr/local/lib/libetcd-cpp-api.dylib MyApp.cpp -o MyApp
# run instance 1
./MyApp id1
# in another terminal, run instance 2
./MyApp id2
# in another terminal, run instance 3
./MyApp id3

Since we started instance 1 first, it becomes the leader. Try stopping the first instance. After about 10 seconds (our lease keep-alive time), etcd will delete the key, and the watcher in each remaining instance will process the delete action and trigger a new election.

Result: Leader election is enabled for our distributed application.

Note: After the current leader crashes, the application has no leader until a new one is elected. The length of this window depends on the keep-alive time of the lease on the election key, so tune it for your application.
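As a rough way to reason about that window, the following sketch estimates how long the election key can outlive a crashed leader. It rests on simplifying assumptions that are mine, not etcd's documented behavior: keep-alives are sent at a fixed interval shorter than the TTL, each one resets the lease to the full TTL, and network delay, the server's expiry-check granularity and the time to run the new election are ignored. `LeaderGap` and `EstimateLeaderGap` are hypothetical names.

```cpp
#include <cassert>

// Rough bounds, in seconds, on how long the election key outlives a crashed leader.
struct LeaderGap {
    double shortest;
    double longest;
};

// ttlSeconds: lease TTL.
// refreshIntervalSeconds: assumed fixed keep-alive interval, must be shorter than the TTL.
inline LeaderGap EstimateLeaderGap(double ttlSeconds, double refreshIntervalSeconds) {
    assert(ttlSeconds > 0 && refreshIntervalSeconds > 0);
    assert(refreshIntervalSeconds < ttlSeconds);
    // Crash just before the next refresh: only ttl - interval is left on the lease.
    // Crash just after a refresh: the full ttl is left.
    return LeaderGap{ttlSeconds - refreshIntervalSeconds, ttlSeconds};
}
```

Under these assumptions, a shorter TTL shortens the leaderless window but leaves a slow leader less slack before it loses its lease.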
