How to deal with an API that rate-limits requests?

Asked 27/2, 2018 at 17:7 Answered 23/12, 2019 at 22:9

node.js architecture api-design throttling

For small apps, rate limits are not a problem.

But for apps with real traffic, you can hit the limits easily.

HTTP is request-response driven. Just because your backend is stuck at a limit, you can't really hold the response back until the rate limit lets you resume making your API calls.

What do you do?

I can think of several scenarios:

Wait it out: it sucks, but sometimes it's the easy fix, since you don't need to do anything.

Queue it: this is a lot of work as opposed to just making an API call. You first have to store the request in a database, then have a background task go through the database and do the work. Also, the user would be told "it is processing" rather than "it's done".

Use lots of API accounts: very hacky, and a lot of trouble to manage. Say you are using Amazon; now you would have to create, verify, and validate something like 10 accounts. It's not even possible where you need to verify with, say, a domain name, since Amazon would know that account abc already owns it.

Internalize answered 27/2, 2018 at 17:7 Comment(2)

Pay for access? – Amylum 27/2, 2018 at 17:29

By paying you still have limits; they are just higher – Internalize 27/2, 2018 at 17:59

There are two reasons why rate limits may cause you problems.

Chronic (that is, a sustained situation): you are hitting rate limits because your sustained demand exceeds your allowance. In this case, consider a local cache, so you don't ask for the same thing twice. Hopefully the API you are using has a reliable "last-modified" date so you can detect when your cache is stale. With this approach, your API calls only refresh the cache, and you serve requests from the cache.
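As a rough illustration of the cache idea, here is a minimal sketch in Python. It assumes a fixed time-to-live stands in for a real "last-modified" check, and the names `TTLCache`, `fetch`, and `ttl_seconds` are made up for this example rather than taken from any particular API.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Serve repeated lookups locally; call the API only when an entry is stale."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical function that calls the rate-limited API
        self._ttl = ttl_seconds  # assumption: a fixed TTL instead of a last-modified check
        self._clock = clock      # injectable clock, so the behaviour can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (stored at, value)

    def get(self, key: str) -> str:
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]      # fresh entry: no API call is spent
        value = self._fetch(key)  # missing or stale entry: spend one API call
        self._entries[key] = (now, value)
        return value
```

With something like this, a thousand requests for the same key inside one TTL window cost a single upstream call. If the API does expose a trustworthy last-modified value, the staleness test could compare against that instead of a timer.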

If caching can't help, you need higher rate limits.

Acute: your application makes bursts of calls that exceed the rate limit, but on average your demand is under the limit. So you have a short-term problem. I have settled on a brute-force solution for this ("shoot first, ask permission later"). I burst until I hit the rate limit, then I fall back on retry logic, which is easy because my preferred tool is Python, which supports this well. The returned error is trapped and retry handling takes over. I think every mature library has something like this.

https://urllib3.readthedocs.io/en/latest/reference/urllib3.util.html

The default retry logic is to back off in increasingly large steps of time. I think this carries a starvation risk. If multiple clients use the same API, they share the same rate limit as a pool. By your nth retry, your backoff may be so long that newer clients with shorter backoff times steal your slots: by the time your long backoff expires, the rate limit has already been consumed by a younger competitor, so you back off for even longer, which makes the problem worse. In the limit, this is just the chronic situation again: the real problem is that your total rate limit is insufficient, but starvation also means you might not be sharing it fairly among jobs. An improvement is a less naive algorithm; it's the same contention problem you meet with locking in computer science, and introducing randomisation is a big improvement. Once again, a mature library is aware of this and should help with built-in retry options.
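To make the randomisation point concrete, here is a minimal stdlib-only sketch of retrying with capped exponential backoff and random jitter. It is not urllib3's implementation; `RateLimited` and `call_with_backoff` are hypothetical names, and the default delays are arbitrary assumptions.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised when the API reports that the rate limit was hit."""


def call_with_backoff(call: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 0.5, max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Optional[random.Random] = None) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            # The ceiling doubles with each attempt but is capped, so the wait
            # for an old request never grows without bound.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Random delay in [0, ceiling], so competing clients spread out
            # instead of retrying in lockstep.
            sleep(rng.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

The cap and the jitter are the two knobs that speak to the starvation worry: the cap bounds how far an old request can fall behind, and the jitter gives it a chance of drawing a short delay even on a late attempt.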

Moresque answered 23/12, 2019 at 22:9 Comment(0)

To expand on what your queueing options are:

Unless you can design the problem of hitting this rate limit out of existence, as @Hammerbot walks through, I would go with some implementation of a queue. The solution can scale in complexity and robustness according to the loads you're facing and how many rate-limited APIs you're dealing with.

Recommended

Use a library to take care of this for you. Node-rate-limiter looks promising. It still appears you would have to decide how to handle the user interaction (make them wait, or write to a DB or cache service and notify them later).

"Simplest case" - not recommended

You can implement a minimally functioning queue and back it with a database or cache. I've done this before and it was fine, initially. Just remember that you'll need to implement your own retry logic and will have to worry about things like queue starvation **. Basically, the caveats of rolling your own <insert thing whose implementation someone already worried about> should be taken into consideration.

**(e.g. your calls keep failing for some reason, and all of a sudden your background process is endlessly retrying large numbers of failing queue work elements and your app runs out of memory).
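To show what guarding against that endless-retry failure can look like, here is a minimal sketch of a queue that caps attempts per job and parks exhausted jobs on a separate list. It is in-memory only for brevity (the approach above would back it with a database or cache), and `Job`, `BoundedRetryQueue`, and `max_attempts` are illustrative names, not part of any library.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Job:
    payload: str
    attempts: int = 0


class BoundedRetryQueue:
    """In-memory work queue that gives up on a job after max_attempts failures."""

    def __init__(self, handler: Callable[[str], None], max_attempts: int = 3) -> None:
        self._handler = handler  # hypothetical function that makes the API call
        self._max_attempts = max_attempts
        self._pending: Deque[Job] = deque()
        self.dead: List[Job] = []  # jobs that exhausted their attempts

    def submit(self, payload: str) -> None:
        self._pending.append(Job(payload))

    def run_once(self) -> None:
        """Try each job that is pending right now exactly once."""
        for _ in range(len(self._pending)):
            job = self._pending.popleft()
            job.attempts += 1
            try:
                self._handler(job.payload)
            except Exception:
                if job.attempts >= self._max_attempts:
                    self.dead.append(job)  # stop retrying; keep it for inspection
                else:
                    self._pending.append(job)  # requeue for a later pass

    def __len__(self) -> int:
        return len(self._pending)
```

A background task would call `run_once()` on a timer. Because failing jobs eventually move to `dead`, the pending queue cannot grow forever on a persistent failure.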

Complex case:

You have a bunch of API calls that all get rate-limited and those calls are all made at volumes that make you start considering decoupling your architecture so that your user-facing app doesn't have to worry about handling this asynchronous background processing.

High-level architecture:

Your user-facing server pushes work units of different types onto different queues. Each of these queues corresponds to a differently rate-limited kind of processing (e.g. 10 queries per hour, 1000 queries per day). You then have a "rate-limit service" that acts as a gate to consuming work units off the different queues. Horizontally distributed workers consume items from the queues if and only if the rate-limit service says they can. The results of these workers could then be written to a database, and a background process could notify your users of the result of the asynchronous work you had to perform.
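A single-process sketch of that gate might look like the following. It assumes a simple fixed-window counter per queue and an in-memory `deque` standing in for a real queue; `RateLimitGate`, `try_acquire`, and `drain` are hypothetical names, and a real deployment would put the gate behind a network service shared by all workers.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class RateLimitGate:
    """Grants permits per queue using a fixed-window counter."""

    def __init__(self, limits: Dict[str, Tuple[int, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limits = limits  # queue name -> (max calls, window length in seconds)
        self._clock = clock
        self._windows: Dict[str, Tuple[float, int]] = {}  # queue name -> (window start, used)

    def try_acquire(self, queue_name: str) -> bool:
        max_calls, window = self._limits[queue_name]
        now = self._clock()
        start, used = self._windows.get(queue_name, (now, 0))
        if now - start >= window:
            start, used = now, 0  # the window has elapsed: start a fresh one
        if used >= max_calls:
            self._windows[queue_name] = (start, used)
            return False
        self._windows[queue_name] = (start, used + 1)
        return True


def drain(queue_name: str, queue: Deque[str], gate: RateLimitGate,
          work: Callable[[str], str]) -> List[str]:
    """Worker loop: take items off the queue only while the gate grants permits."""
    results = []
    while queue and gate.try_acquire(queue_name):
        results.append(work(queue.popleft()))
    return results
```

For example, `RateLimitGate({"hourly": (10, 3600.0), "daily": (1000, 86400.0)})` would model the two limits mentioned above; items the gate refuses simply stay on their queue until a later pass.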

Of course, in this case you're wading into a whole world of infrastructure concerns.

For further reading, you could look at Lyft's rate-limiting service (which I think implements the token bucket algorithm to handle rate limiting). You could use Amazon Simple Queue Service (SQS) for the queues and AWS Lambda for the queue consumers.
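For readers unfamiliar with the token bucket algorithm mentioned here, this is a generic minimal sketch of it. It is not Lyft's code and makes no claim about how that service works internally; `TokenBucket`, `rate_per_second`, and `capacity` are names chosen for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Tokens refill at a steady rate up to a capacity; each call spends tokens."""

    def __init__(self, rate_per_second: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_second
        self._capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full, so an initial burst is allowed
        self._last = clock()

    def try_consume(self, tokens: float = 1.0) -> bool:
        now = self._clock()
        # Refill for the time elapsed since the last call, never above capacity.
        self._tokens = min(self._capacity,
                           self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False
```

The capacity sets how large a burst is tolerated and the rate sets the sustained throughput, which maps onto the acute and chronic cases discussed in this thread.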

Upanchor answered 27/2, 2018 at 18:7 Comment(1)

This is what I was afraid of: this isn't just a few extra steps, it's a whooole new universe. What you are describing is pretty much microservices, which means managing them, which means Kubernetes + Docker + an event system... the list goes on and on. Oh well. – Internalize 27/2, 2018 at 20:52

I think that this depends on which API you want to call and for what data.

For example, Facebook limits its API to 200 requests per hour per user. So if your app grows, and you are using their OAuth implementation correctly, you shouldn't be limited here.

Now, what data do you need? Do you really need to make all these calls? Could the information you request be stored on one of your servers?

Let's imagine that you need to display an Instagram feed on a website. On each visitor request, you call Instagram to get the pictures you need. When your app grows, you reach the API limit because you have more visitors than the Instagram API allows. In this case, you should definitely refresh the data on your server once per hour, and let your users hit your database rather than Instagram's.

Now let's say that you need specific information for every user on each request. Isn't it possible to let each user handle their own connection to the API? Either by implementing the API's OAuth 2 flow or by asking users for their API credentials (not very secure, I think...)?

Finally, if you really can't change the way you are working now, I don't see any options other than the ones you listed here.

EDIT: And finally, as @Eric Stein stated in his comment, some APIs allow you to raise your API limit by paying (a lot of SaaS products do that), so if your app grows, you should be able to afford to pay for those services (they are delivering value to you, so it's fair to pay them back).

Vesicate answered 27/2, 2018 at 17:28 Comment(0)
