How to use LeaderElection recipe efficiently using Curator for Zookeeper?
I am using the Apache Curator library for leader election on ZooKeeper. My application is deployed on several machines, but a particular piece of code must run on only one of them, so I use leader election and execute that code only if the current instance is the leader.

Below is my LeaderElectionExecutor class, which ensures there is only one Curator instance per application:

public class LeaderElectionExecutor {

    private ZookeeperClient zookClient;

    private static final String LEADER_NODE = "/testleader";

    private static class Holder {
        static final LeaderElectionExecutor INSTANCE = new LeaderElectionExecutor();
    }

    public static LeaderElectionExecutor getInstance() {
        return Holder.INSTANCE;
    }

    private LeaderElectionExecutor() {
        try {
            String hostname = Utils.getHostName();

            String nodes = "host1:2181,host2:2181";

            zookClient = new ZookeeperClient(nodes, LEADER_NODE, hostname);
            zookClient.start();

            // added sleep specifically for the leader to get selected
            // since I cannot call isLeader method immediately after starting the latch
            TimeUnit.MINUTES.sleep(1);
        } catch (Exception ex) {
            // logging error
            System.exit(1);
        }
    }

    public ZookeeperClient getZookClient() {
        return zookClient;
    }
}

And below is my ZookeeperClient code -

// can this class be improved in any ways?
public class ZookeeperClient {

    private CuratorFramework client;
    private String latchPath;
    private String id;
    private LeaderLatch leaderLatch;

    public ZookeeperClient(String connString, String latchPath, String id) {
        client = CuratorFrameworkFactory.newClient(connString, new ExponentialBackoffRetry(1000, Integer.MAX_VALUE));
        this.id = id;
        this.latchPath = latchPath;
    }

    public void start() throws Exception {
        client.start();
        leaderLatch = new LeaderLatch(client, latchPath, id);
        leaderLatch.start();
    }

    public boolean isLeader() {
        return leaderLatch.hasLeadership();
    }

    public Participant currentLeader() throws Exception {
        return leaderLatch.getLeader();
    }

    public void close() throws IOException {
        leaderLatch.close();
        client.close();
    }

    public CuratorFramework getClient() {
        return client;
    }

    public String getLatchPath() {
        return latchPath;
    }

    public String getId() {
        return id;
    }

    public LeaderLatch getLeaderLatch() {
        return leaderLatch;
    }
}

Now in my application, I am using the code like this -

public void method01() {
    ZookeeperClient zookClient = LeaderElectionExecutor.getInstance().getZookClient();
    if (zookClient.isLeader()) {
        // do something
    }
}

public void method02() {
    ZookeeperClient zookClient = LeaderElectionExecutor.getInstance().getZookClient();
    if (zookClient.isLeader()) {
        // do something
    }
}

Problem Statement:-

In the Curator library, calling isLeader() immediately after starting the latch does not work, because electing a leader takes time. For that reason alone I added a one-minute sleep in my LeaderElectionExecutor code. It works, but I doubt it is the right way to do this.

Is there a better way of doing this? Whatever the solution, I need to check whether I am the leader before executing certain code. That code is spread across several classes and methods, so I cannot do everything in a single method and must be able to call isLeader() from each of them.

I am using ZooKeeper 3.4.5 and Curator 1.7.1.

Tweezers answered 18/1, 2015 at 4:13 Comment(0)
V
1

Once I solved a problem very similar to yours. This is how I did it.

First, my objects were managed by Spring, so I had a LeaderLatch that could be injected through the container. One of the components that used the LeaderLatch was a LeadershipWatcher, an implementation of the Runnable interface that dispatched the leadership event to other components. Those components were implementations of an interface that I named LeadershipObserver. The LeadershipWatcher looked roughly like this:

@Component
public class LeadershipWatcher implements Runnable {
  private final LeaderLatch leaderLatch;
  private final Collection<LeadershipObserver> leadershipObservers;

  /* constructor with @Inject */

  @Override
  public void run() {
    try {
      leaderLatch.await();

      for (LeadershipObserver observer : leadershipObservers) {
        observer.granted();
      }
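    } catch (InterruptedException e) {
      // restore the interrupt flag so code further up the stack can see it
      Thread.currentThread().interrupt();
      for (LeadershipObserver observer : leadershipObservers) {
        observer.interrupted();
      }
    }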
  }
}

As this is just a sketch, I recommend you enhance this code, perhaps by applying the command pattern for calling the observers, or by submitting the observer calls to a thread pool if their jobs are blocking or long-running, CPU-intensive tasks.
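The thread-pool suggestion could look something like the following minimal sketch. The LeadershipObserver interface is only named in this answer, so its shape here is an assumption, and LeadershipDispatcher is a hypothetical helper rather than anything provided by Curator or Spring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;

// Assumed shape of the observer interface mentioned above (not shown in the answer).
interface LeadershipObserver {
    void granted();
    void interrupted();
}

// Hypothetical dispatcher: hands each observer callback to a pool thread so that
// a slow or blocking observer does not delay the others.
final class LeadershipDispatcher {
    private final ExecutorService pool;
    private final List<LeadershipObserver> observers;

    LeadershipDispatcher(ExecutorService pool, List<LeadershipObserver> observers) {
        this.pool = pool;
        this.observers = new ArrayList<>(observers); // defensive copy
    }

    // Intended to be called after leaderLatch.await() returns.
    void notifyGranted() {
        for (LeadershipObserver observer : observers) {
            pool.execute(observer::granted);
        }
    }

    // Intended to be called from the InterruptedException handler.
    void notifyInterrupted() {
        for (LeadershipObserver observer : observers) {
            pool.execute(observer::interrupted);
        }
    }
}
```

With this in place, the watcher's run() method would call notifyGranted() or notifyInterrupted() instead of looping over the observers itself. The pool's size and shutdown are left to the caller.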

Valerivaleria answered 13/2, 2015 at 16:34 Comment(0)
M
0

I've not worked with zookeeper or curator before, so take my answer with a grain of salt.

Set a flag.

private volatile boolean isLeaderSelected = false;

At the beginning of the Latch, set the flag to false. When the leader has been selected, set the flag to true.

In the isLeader() function:

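public boolean isLeader() {
    // spins until a leader is selected; the flag must be volatile
    // so that this loop sees the update made by another thread
    while (!isLeaderSelected) { }

    // do the rest of the function
    return leaderLatch.hasLeadership();
}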

This is also a relatively hacky workaround, but it lets isLeader() proceed as soon as a leader has been selected. If the callers are in different classes, a getter can expose isLeaderSelected.
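A variant of the same flag idea that avoids spinning is to block on a java.util.concurrent.CountDownLatch. The following is a minimal sketch under assumptions: ElectionGate and electionDecided are hypothetical names, and some callback (not shown here) is assumed to report the election outcome.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Curator: opens once an election result is known.
final class ElectionGate {
    private final CountDownLatch decided = new CountDownLatch(1);
    private volatile boolean leader;

    // Call this from whatever callback reports the election outcome.
    void electionDecided(boolean isLeader) {
        leader = isLeader;     // publish the result first
        decided.countDown();   // then release any waiting threads
    }

    // Blocks without burning CPU; returns false if no result arrived in time.
    boolean isLeader(long timeout, TimeUnit unit) throws InterruptedException {
        return decided.await(timeout, unit) && leader;
    }
}
```

Unlike the empty while loop, the waiting thread is parked rather than spinning, and the timeout keeps a caller from hanging forever if no leader is ever elected.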

Mattiematting answered 8/2, 2015 at 9:53 Comment(0)
R
0
leaderLatch = new LeaderLatch(curatorClient, zkPath, String.valueOf(new Random().nextInt()));
leaderLatch.start();
Participant participant;
while(true) {
  participant = leaderLatch.getLeader();
  // Leader election happens asynchronously after calling start, this is a hack to wait until election happens
  if (!participant.getId().isEmpty()) {
    break;
  }
}
if(leaderLatch.hasLeadership()) {
...
}

Note that getLeader() returns a dummy participant with id "" until a leader has been elected.

Remodel answered 31/8, 2015 at 22:25 Comment(0)
Y
0

Here's to reviving an old question...

This is similar to the answer srav gave, but I would caution against using that code because it utilizes a busy-wait and can cause certain callbacks that are issued in-thread to never be called, possibly blocking forever. Furthermore, it could retry forever if there are real issues.

This was my solution, which uses the Curator client's retry policy to wait for leader election if necessary.

    RetryPolicy retryPolicy = _client.getZookeeperClient().getRetryPolicy();
    RetrySleeper awaitLeadership = _leaderLatch::await;

    final long start = System.currentTimeMillis();
    int count = 0;

    do {
        try {
            // curator will return a dummy leader in the case when a leader has
            // not yet actually been elected. This dummy leader will have isLeader
            // set to false, so we need to check that we got a true leader
            if (_leaderLatch.getLeader().isLeader()) {
                return;
            }
        } catch (KeeperException.NoNodeException e) {
            // this is the case when the leader node has not yet been created
            // by any client - this is fine because we are still waiting for
            // the algorithm to start up so we ignore the error
        }
    } while (retryPolicy.allowRetry(count++, System.currentTimeMillis() - start, awaitLeadership));

    // we have exhausted the retry policy and still have not elected a leader
    throw new IOException("No leader was elected within the specified retry policy!");

Also, looking at your CuratorFramework initialization, I'd caution against using Integer.MAX_VALUE as the retry count when specifying the retry policy.
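To illustrate the general idea of a bounded wait, independent of Curator, here is a minimal sketch of a polling helper that sleeps between checks and gives up after a fixed number of attempts. BoundedPoller and its parameters are hypothetical names, and the doubling schedule is only an illustration, not Curator's actual backoff formula.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not a Curator API: checks a condition a bounded number
// of times, sleeping between checks with a capped, doubling delay.
final class BoundedPoller {
    static boolean poll(BooleanSupplier condition, int maxAttempts,
                        long baseSleepMs, long maxSleepMs) throws InterruptedException {
        long sleepMs = baseSleepMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true;                                  // condition met
            }
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMs);                        // sleep instead of spinning
                sleepMs = Math.min(maxSleepMs, sleepMs * 2);  // capped exponential growth
            }
        }
        return false;                                         // attempts exhausted
    }
}
```

A caller would pass a condition such as "a real leader has been elected" and treat a false result as an error, rather than retrying indefinitely.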

I hope this helps!

Yarkand answered 10/5, 2017 at 4:29 Comment(0)
