Scheduled tasks in a cluster using ZooKeeper

Asked 15/12, 2016 at 6:39 Answered 19/12, 2016 at 18:57

spring apache-zookeeper distributed-system apache-curator spring-scheduled

We use Spring to run scheduled tasks, which works fine on a single node. We want to run these scheduled tasks in a cluster of N nodes such that each task is executed by at most one node at any point in time. This is for an enterprise use case, and we expect up to 10 to 20 nodes.

I looked into various options:

1. Use Quartz, which seems to be a popular choice for running scheduled tasks in a cluster. Drawback: database dependency, which I want to avoid.
2. Use ZooKeeper and always run the scheduled tasks only on the leader/master node. Drawback: task execution load is not distributed.
3. Use ZooKeeper and have the scheduled tasks fire on all nodes, but before a task runs, acquire a distributed lock and release it once execution is complete. Drawback: the system clocks on all nodes should be in sync, which may be an issue if the application is overloaded, causing system clock drift.
4. Use ZooKeeper and let the master node keep producing tasks as per the schedule and assign each one to a random worker. A new task is not assigned if the previous scheduled task has not been worked on yet. Drawback: this appears to add too much complexity.
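To make the rule in option 4 concrete, here is a minimal in-memory sketch of the master's decision on each schedule tick: assign to a random worker unless the previous assignment is still outstanding. The class and method names (`MasterAssignDemo`, `onTick`, `onCompleted`) are hypothetical, and nothing here talks to ZooKeeper; in a real setup the assignment and the completion report would presumably be znodes that the master and workers watch.

```java
import java.util.List;
import java.util.Random;

// Hypothetical in-memory model of option 4; not a ZooKeeper client.
class MasterAssignDemo {
    private final List<String> workers;
    private final Random random;
    private String pendingWorker; // null when no assignment is outstanding

    MasterAssignDemo(List<String> workers, long seed) {
        this.workers = workers;
        this.random = new Random(seed);
    }

    /** Called on each schedule tick. Returns the chosen worker, or null if the tick is skipped. */
    String onTick() {
        if (pendingWorker != null) {
            return null; // previous task has not been worked on yet, so do not assign a new one
        }
        pendingWorker = workers.get(random.nextInt(workers.size()));
        return pendingWorker;
    }

    /** Called when a worker reports that it finished its assignment. */
    void onCompleted(String worker) {
        if (worker.equals(pendingWorker)) {
            pendingWorker = null;
        }
    }

    public static void main(String[] args) {
        MasterAssignDemo master = new MasterAssignDemo(List.of("w1", "w2", "w3"), 42L);
        String first = master.onTick();
        System.out.println("assigned to " + first);
        System.out.println("next tick skipped: " + (master.onTick() == null));
        master.onCompleted(first);
        System.out.println("assigned again: " + (master.onTick() != null));
    }
}
```

The sketch leaves out the parts that make this option complex in practice, such as master failover and reassigning work when the chosen worker dies.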

I am leaning towards #3, which appears to be a safe solution assuming the ZooKeeper ensemble nodes run on a separate cluster with system clocks kept in sync using NTP. This also assumes that, if the system clocks are in sync, all nodes have an equal chance of acquiring the lock to execute a task.
EDIT: After some more thought, I realize this may not be a safe solution either, since the system clocks need to be in sync between the nodes where the scheduled tasks are running, not just between the ZooKeeper cluster nodes. I say "not safe" because the nodes where the tasks run can be overloaded, with GC pauses and other causes, and there is a possibility of their clocks going out of sync. But then again, I would think this is a standard problem with distributed systems.
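The clock concern can be illustrated with a small deterministic simulation. This is only a sketch under simplifying assumptions: each node fires the same schedule slot at its own clock offset, the lock is non-blocking and held exactly for the task duration, and nothing else records that the slot was already handled. The names (`LockOnlySkewDemo`, `executions`) are hypothetical.

```java
import java.util.Arrays;

// Hypothetical simulation of option 3 with a lock as the only guard.
class LockOnlySkewDemo {
    /**
     * Counts how many nodes end up running one schedule slot.
     * clockOffsetsMs[i] is when node i fires, relative to the true schedule time.
     */
    static int executions(long[] clockOffsetsMs, long taskDurationMs) {
        long[] fireTimes = clockOffsetsMs.clone();
        Arrays.sort(fireTimes);
        long lockFreeAt = Long.MIN_VALUE; // the lock is free at the start
        int runs = 0;
        for (long t : fireTimes) {
            if (t >= lockFreeAt) {        // tryLock succeeds
                runs++;
                lockFreeAt = t + taskDurationMs;
            }                             // otherwise tryLock fails and the node skips this slot
        }
        return runs;
    }

    public static void main(String[] args) {
        // Skew smaller than the task duration: only the first node runs.
        System.out.println(executions(new long[] {0, 50, 120}, 500));  // 1
        // One node is 2 s late: the lock is free again, so the task runs twice.
        System.out.println(executions(new long[] {0, 50, 2000}, 500)); // 2
    }
}
```

Under these assumptions the lock alone only helps while the skew stays below the task duration, which is why some record of "this slot is already done" is needed in addition to the lock.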

Could you please advise whether my understanding of each of the options is accurate? Or maybe there is a better approach than the listed options to solve this problem.

Accumbent answered 15/12, 2016 at 6:39 Comment(1)

FYI - I wrote a distributed task scheduler for Curator/ZooKeeper a while back. github.com/NirmataOSS/workflow – Jinny 19/12, 2016 at 16:25

Well, you can improve #3 like this.

ZooKeeper provides watchers. That is, you can set a watcher on a given ZNode (say, at path /some/path). All the nodes in your cluster watch the same ZNode. Whenever a node decides (by schedule or in whatever way) that it should now run the scheduled task:

1. First, it creates a PERSISTENT_SEQUENTIAL child node under /some/path (which all the nodes are watching). You can set the data of that node as you wish; it may be a JSON string specifying the details of the task to be run. The new ZNode path will look like /some/path/prefix_<sequence-number>.
2. Then, all the nodes in the cluster are notified that a child node was created. Each of them fetches the newly created ZNode's data and decodes the task.
3. Now, each node tries to acquire a distributed lock. Whoever acquires it first can execute the task. Once the task has executed, that node should report that it was executed (say, by creating a new ZNode named success under /some/path/prefix_<sequence-number>) and then release the lock.
4. Whenever a node is about to execute a task, it should check, before trying to acquire the distributed lock, whether that ZNode already has a success child node. It should check again after acquiring the lock, since another node may have finished the task in the meantime.

This design ensures that no task is run twice, because each node checks for a child node named success under the ZNode that was created to announce the task.
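The steps above can be sketched with in-memory stand-ins, to show the control flow without a running ensemble. Everything here is an assumption for illustration: a concurrent set of paths plays the role of the ZNode tree, a `ReentrantLock` plays the distributed lock, threads play the cluster nodes, and the names (`SuccessMarkerDemo`, `announceTask`, `onTaskAnnounced`) are hypothetical. The marker is checked both before and after taking the lock, because two nodes can both pass the first check before either has finished.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory model; a real version would use a ZooKeeper client instead.
class SuccessMarkerDemo {
    static final Set<String> znodes = ConcurrentHashMap.newKeySet(); // stand-in for the ZNode tree
    static final ReentrantLock taskLock = new ReentrantLock();       // stand-in for the distributed lock
    static final AtomicInteger sequence = new AtomicInteger();
    static final AtomicInteger executions = new AtomicInteger();

    /** Step 1: announce a task by creating a sequential child under /some/path. */
    static String announceTask() {
        String path = String.format("/some/path/prefix_%010d", sequence.getAndIncrement());
        znodes.add(path);
        return path;
    }

    /** Steps 2 to 4: what each node does once its watcher reports the new child. */
    static void onTaskAnnounced(String taskPath) {
        String marker = taskPath + "/success";
        if (znodes.contains(marker)) {
            return; // already done, no need to contend for the lock
        }
        taskLock.lock();
        try {
            if (znodes.contains(marker)) {
                return; // another node finished while this one was waiting
            }
            executions.incrementAndGet(); // the actual task would run here
            znodes.add(marker);           // report success before releasing the lock
        } finally {
            taskLock.unlock();
        }
    }

    /** Announces one task, lets nodeCount simulated nodes react, and returns the execution count. */
    static int runOnce(int nodeCount) {
        executions.set(0);
        String taskPath = announceTask();
        ExecutorService nodes = Executors.newFixedThreadPool(nodeCount);
        for (int i = 0; i < nodeCount; i++) {
            nodes.submit(() -> onTaskAnnounced(taskPath));
        }
        nodes.shutdown();
        try {
            nodes.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executions.get();
    }

    public static void main(String[] args) {
        System.out.println(runOnce(20)); // 1
    }
}
```

The sketch does not model a node crashing between running the task and writing the marker; in that case the task would run again on another node, so the task itself should tolerate a repeat.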

I have used the above design for an enterprise solution. Actually for a distributed command framework ;-)

Minton answered 19/12, 2016 at 4:36 Comment(3)

I think this will work; however, it appears similar to the #4 I listed, except that all nodes receive the task and contend for a lock to execute it. #4 also uses watchers, but with the master-worker approach as described in link, and it seems to be a better option since only one random worker gets the task and there is no need for a distributed lock. – Accumbent 20/12, 2016 at 8:26

But for #4, I wanted to check whether there are any flaws or drawbacks I may be missing, other than the complexity factor. – Accumbent 20/12, 2016 at 8:46

In this case, you don't have to worry about master and slave complexity. Just let them compete for a lock and do the task. – Minton 20/12, 2016 at 10:24

ZooKeeper and etcd aren't the best tools for this use case.

If your environment allows you to use Akka, it would be easier to use Akka Cluster plus a smallest-mailbox router, or whatever cluster router you prefer, and then push scheduled jobs to the ActorRef for the cluster. It is easier to set up, and you can run thousands of nodes in a cluster with it (it uses SWIM, the protocol that Cassandra and Nomad use).

ScaleCube would also do it rather easily; again, it uses SWIM.

Varela answered 19/12, 2016 at 18:57 Comment(0)
