How to implement a distributed rate limiter?

Let's say I have P processes running some business logic on N physical machines. These processes call some web service S. I want to ensure that no more than X calls per second are made to service S by all P processes combined.

How can such a solution be implemented?

Google Guava's RateLimiter works well for processes running on a single box, but not in a distributed setup.

Are there any standard, ready-to-use solutions available for Java? [maybe based on ZooKeeper]

Thanks!

Zavala answered 19/7, 2015 at 9:53 Comment(4)
Curator (a commonly used Java ZooKeeper client) doesn't have this exact recipe. However, Curator has a Distributed Atomic Long recipe that could be used as the basis for this: curator.apache.org/curator-recipes/distributed-atomic-long.html – Unassuming
Hm, isn't a shared semaphore a suitable recipe for this case? curator.apache.org/curator-recipes/shared-semaphore.html – Bea
The problem with a shared semaphore is that there has to be someone who releases all the locks every second. I think a distributed atomic long or integer would be suitable, but I haven't implemented my solution yet, so I'm not sure. – Zavala
How about letting your processes fire as many requests to your web service as they want, but load balancing your access to this service via a web service gateway like Spring Cloud? – Mojave
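
For what it's worth, here is a rough sketch of the "distributed atomic long" idea from the comments above, assuming Apache Curator: keep one counter znode per one-second window, increment it before each call, and allow the call only if the post-increment value is still within the limit. The connection string, znode paths, and limit are placeholders, and znodes for past seconds would still need periodic cleanup.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.atomic.AtomicValue;
import org.apache.curator.framework.recipes.atomic.DistributedAtomicLong;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.curator.retry.RetryNTimes;

public class ZkRateLimiter {

    private final CuratorFramework client;
    private final long maxCallsPerSecond; // X from the question

    public ZkRateLimiter(String zkConnectString, long maxCallsPerSecond) {
        this.client = CuratorFrameworkFactory.newClient(
                zkConnectString, new ExponentialBackoffRetry(1000, 3));
        this.client.start();
        this.maxCallsPerSecond = maxCallsPerSecond;
    }

    public boolean tryAcquire() throws Exception {
        // One counter znode per one-second window, shared by all P processes.
        long currentSecond = System.currentTimeMillis() / 1000;
        DistributedAtomicLong counter = new DistributedAtomicLong(
                client, "/ratelimit/serviceS/" + currentSecond, new RetryNTimes(3, 100));

        AtomicValue<Long> result = counter.increment();
        // Allow the call only if the increment succeeded and the counter
        // for this second is still within the limit.
        return result.succeeded() && result.postValue() <= maxCallsPerSecond;
    }
}

Callers would back off and retry when tryAcquire() returns false. Keep in mind that every permit now costs a ZooKeeper write, so this is only practical for modest values of X.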

Bucket4j is a Java implementation of the "token bucket" rate-limiting algorithm. It works both locally and in a distributed setup (on top of JCache). For the distributed use case you are free to choose any JCache implementation, such as Hazelcast or Apache Ignite. See this example of using Bucket4j in a cluster.
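
For illustration, a minimal local sketch of the Bucket4j API (builder names vary slightly between Bucket4j versions, and the distributed variant builds the bucket through the JCache integration rather than locally, which is not shown here):

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;

import java.time.Duration;

public class ServiceSClient {

    // At most 100 calls per second; tokens are refilled over the course of the second.
    private final Bucket bucket = Bucket.builder()
            .addLimit(Bandwidth.simple(100, Duration.ofSeconds(1)))
            .build();

    public void callServiceS() {
        if (bucket.tryConsume(1)) {
            // make the actual call to service S here
        } else {
            // over the limit: queue, drop, or retry later
        }
    }
}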

Rickrack answered 28/4, 2017 at 21:31 Comment(1)
The link seems to be broken :( – Interlineate

I have been working on an open-source solution for this kind of problem.

Limitd is a "server" for limits. The limits are implemented using the Token Bucket Algorithm.

Basically you define limits in the service configuration like this:

buckets:
  "request to service a":
     per_minute: 10
  "request to service b":
     per_minute: 5

The service is run as a daemon listening on a TCP/IP port.

Then your application does something along these lines:

var limitd = new Limitd('limitd://my-limitd-address');

limitd.take('request to service a', 'app1', 1, function (err, result) {
  if (result.conformant) {
    console.log('everything is okay - this should be allowed');
  } else {
    console.error('too many calls to this thing');
  }
});

We are currently using this for rate-limiting and debouncing some application events.

The server is on:

https://github.com/auth0/limitd

We are planning to work on several SDKs, but for now we only have node.js and a partially implemented Go client:

https://github.com/limitd

Jojo answered 23/7, 2015 at 22:38 Comment(1)
Just curious: how do you make your Limitd server highly available? Wouldn't you have to share state between the instances in your limitd HA setup? – Harter

https://github.com/jdwyah/ratelimit-java provides distributed rate limits that should do just this. You can configure your limit as X per second / minute etc. and choose the burst size / refill rate of the leaky bucket that is under the covers.

Ritenuto answered 20/2, 2017 at 3:30 Comment(0)

Simple rate limiting in Java that allows at most 3 transactions in any rolling 3-second window. If you want to centralize this, store the tokens array in ElastiCache or any database, and replace the synchronized block with a distributed lock flag.

import java.util.Date;

public class RateLimiter implements Runnable {

    // Each slot holds the timestamp (in ms) at which a token was last handed out.
    // A slot is free if it is 0 or its 3-second lease has expired.
    private final long[] tokens = new long[3];

    public static void main(String[] args) {
        RateLimiter rateLimiter = new RateLimiter();
        for (int i = 0; i < 20; i++) {
            Thread thread = new Thread(rateLimiter, "Thread-" + i);
            thread.start();
        }
    }

    @Override
    public void run() {
        long currentStartTime = System.currentTimeMillis();
        while (true) {
            // Give up after waiting 100 seconds for a token.
            if (System.currentTimeMillis() - currentStartTime > 100000) {
                throw new RuntimeException("timed out");
            }
            // Busy-wait until a token slot becomes free.
            if (getToken()) {
                System.out.println(Thread.currentThread().getName()
                        + " at " + new Date(System.currentTimeMillis()) + " says hello");
                break;
            }
        }
    }

    // Claims the first slot that is unused or whose 3-second lease has expired.
    private synchronized boolean getToken() {
        for (int i = 0; i < 3; i++) {
            if (tokens[i] == 0 || System.currentTimeMillis() - tokens[i] > 3000) {
                tokens[i] = System.currentTimeMillis();
                return true;
            }
        }
        return false;
    }
}
Udelle answered 10/10, 2021 at 0:8 Comment(0)

With any distributed rate-limiting architecture, you'll need a single backend store that acts as the single source of truth for tracking the number of requests. You can use ZooKeeper as an in-memory datastore for this out of convenience, although there are better choices, such as Redis.
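
As a rough illustration of the Redis option, here is a fixed-window counter using the Jedis client. The host, key prefix, and limit are placeholders; in production you would make the INCR/EXPIRE pair atomic (for example with a Lua script) or use a sliding-window scheme instead.

import redis.clients.jedis.Jedis;

public class RedisRateLimiter {

    private final String host;
    private final int maxCallsPerSecond; // X from the question

    public RedisRateLimiter(String host, int maxCallsPerSecond) {
        this.host = host;
        this.maxCallsPerSecond = maxCallsPerSecond;
    }

    public boolean tryAcquire(String serviceName) {
        try (Jedis jedis = new Jedis(host, 6379)) {
            // One counter key per one-second window, shared by all processes.
            long currentSecond = System.currentTimeMillis() / 1000;
            String key = "ratelimit:" + serviceName + ":" + currentSecond;

            long count = jedis.incr(key);
            // Expire the key shortly after its window ends so old counters disappear.
            jedis.expire(key, 2);

            return count <= maxCallsPerSecond;
        }
    }
}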

Staging answered 28/12, 2021 at 0:16 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center. – Shroud
