C#: Throttle/rate limit outgoing HTTP requests with Polly

I am developing an integration solution that accesses a rate-limited API. I am performing a variety of CRUD operations against the API using multiple HTTP verbs on different endpoints (all on the same server, though). I have been pointed towards Polly multiple times, but I haven't managed to come up with a solution that actually works.

This is what I have in my startup:

builder.Services
    .AddHttpClient("APIClient", client =>
    {
        client.BaseAddress = new Uri(C.Configuration.GetValue<string>("APIBaseAddress"));
    })
    .AddTransientHttpErrorPolicy(builder => 
        builder.WaitAndRetryAsync(new []
        {
           TimeSpan.FromSeconds(1),
           TimeSpan.FromSeconds(5),
           TimeSpan.FromSeconds(15),
        }));

This is just resilience: retry in case of transient failures. I also have a RateLimit policy in a singleton ApiWrapper class:

public sealed class ApiWrapper
{
    private static readonly Lazy<ApiWrapper> lazy = new Lazy<ApiWrapper>(() => new ApiWrapper());
    public static ApiWrapper Instance { get { return lazy.Value; } }
    private IHttpClientFactory _httpClientFactory;
    public readonly AsyncRateLimitPolicy RateLimit = Policy.RateLimitAsync(150, TimeSpan.FromSeconds(10), 50); // 150 actions within 10 sec, 50 burst

    private ApiWrapper()
    {
    }

    public void SetFactory(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    public HttpClient GetApiClient()
    {
        return _httpClientFactory.CreateClient("APIClient");
    }
}

That policy is used in multiple other classes like this:

public class ApiConsumer
{
    private readonly HttpClient _httpClient = ApiWrapper.Instance.GetApiClient();

    public async Task<bool> DoSomethingWithA(List<int> customerIDs)
    {
        foreach (int id in customerIDs)
        {
            HttpResponseMessage httpResponse = await ApiWrapper.Instance.RateLimit.ExecuteAsync(
                () => _httpClient.GetAsync("http://some.endpoint"));
        }
        return true; // simplified; the real code inspects the responses
    }
}

My expectation was that the rate limiter would not fire more requests than configured, but that does not seem to be the case. From my understanding, the rate limiter simply throws an exception once there are more calls than the configured limit allows. That's where I thought the retry policy would come into play: just try again after 5 or 15 seconds if the call did not get through the limiter.
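
For illustration, this is roughly the combination I had in mind (a sketch, not working code from my project). Note that AddTransientHttpErrorPolicy only handles HttpRequestException, 5xx and 408 responses, so it never catches the limiter's RateLimitRejectedException; that rejection has to be handled explicitly:

using Polly;
using Polly.RateLimit;

// 150 executions per 10 seconds, burst of 50 (same settings as above).
var rateLimit = Policy.RateLimitAsync(150, TimeSpan.FromSeconds(10), 50);

// Retry specifically on the limiter's rejection, waiting as long as it advises.
var retryOnRejection = Policy
    .Handle<RateLimitRejectedException>()
    .WaitAndRetryAsync(
        retryCount: 3,
        sleepDurationProvider: (attempt, exception, context) =>
            ((RateLimitRejectedException)exception).RetryAfter, // delay suggested by the limiter
        onRetryAsync: (exception, delay, attempt, context) => Task.CompletedTask);

// Retry wraps the limiter, so a rejected call is re-attempted after the advised delay.
var combined = retryOnRejection.WrapAsync(rateLimit);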

Then I played around a bit with Polly's Bulkhead policy, but as far as I can see it is meant to limit the number of parallel executions rather than pace requests over a time window (a sketch follows below).
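
This is the kind of bulkhead I tried (a sketch from memory): it caps concurrency and queue length, but knows nothing about requests per time window:

using Polly;
using Polly.Bulkhead;

// At most 10 concurrent executions, up to 100 further actions queued;
// anything beyond that is rejected with BulkheadRejectedException.
AsyncBulkheadPolicy bulkhead = Policy.BulkheadAsync(
    maxParallelization: 10,
    maxQueuingActions: 100);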

I have multiple threads that may use different HttpClients (all created by the Factory like in the example above) with different methods and endpoints, but all use the same policies. Some threads run in parallel, some sequentially as I have to wait for their response before sending the next requests.

Any suggestions on how this can or should be achieved with Polly? (Or any other extension if there is good reason to)

Came asked 13/9, 2022 at 14:12 Comment(10)
With some APIs, when they return 429 Too Many Requests, the response includes a parameter that says when to try again (either in seconds or as an absolute time). So basically the API tells you when to try again, rather than you retrying and immediately being rejected (a sketch of this appears after these comments).Revile
That's not Polly's job. Polly is used for recovery, retries and ensuring you don't exceed the rate limit. How are you going to handle throttled requests though? Will you reject them? Queue them? That's an important application feature, not something you can just get out of a libraryTare
What about the code that made those requests? Do you allow it to keep generating requests that can't be served or do you use a backpressure mechanism to tell it to slow down? That's the job of Channels, the Dataflow library, Parallel.ForEachAsync etc.Tare
Thanks all for the heads-up. It was probably too naive to assume that Polly would just queue those requests and send them one by one, ensuring the rate limit is not hit. @Panagiotis: I assume I need to create a queue, some mechanism to process the items on the queue, and a way to return the responses to the requesting thread. Is there any extension/framework you would recommend looking at? I will have a look at Dataflow, but I am not sure it is what I need, as the requests per se can be fired sequentially; there is no need to have them in parallel... (Sorry, I'm quite new to C#)Came
@Neil: I don't want to get a 429 in the first place...Came
The point of a 429 is to let the server rate limit the client, not the other way around. If you call the server and get a 429, it says "try again in 500ms". If you try again in 490ms, you will get another 429 saying "try again in 10ms". Rate limiting is not under the client's (your) control.Revile
@Came It is unclear to me which part of the resilience protocol you are at. Do you want to impose rate limiting on the server side? Or do you want to define resilient clients which can cope with throttling? Or both?Alpenstock
@Peter: I want my client not to send more requests than the server is willing to handle, thus I'd like to throttle the outgoing requests.Came
@Neil: I don't really agree; the whole idea of rate limiting is to free up resources, ensure availability and prevent a server from being flooded. While technically you are absolutely right, I feel it makes more sense to have the clients throttle their requests, which would reduce traffic and server load.Came
How do you know a server is 'flooded'? By rate limiting yourself, you are only slowing yourself down. If the server can process 1M requests per second, and you are the only user, why not use all 1M slots? If there are 1M requests from other clients and you send another one, slowing yourself down isn't going to get the request done any faster, and you may well get a 429 anyway, as that's the standard mechanism.Revile
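
For completeness, a sketch of the approach described in the first comment (illustrative only, assuming Polly v7): a retry that waits for whatever the 429 response's Retry-After header advises, with an exponential fallback when the header is absent:

using System.Net;
using Polly;

var retryAfterPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => r.StatusCode == HttpStatusCode.TooManyRequests)
    .WaitAndRetryAsync(
        retryCount: 3,
        sleepDurationProvider: (attempt, result, context) =>
            result.Result.Headers.RetryAfter?.Delta            // server-advised delay
                ?? TimeSpan.FromSeconds(Math.Pow(2, attempt)), // fallback: exponential backoff
        onRetryAsync: (result, delay, attempt, context) => Task.CompletedTask);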

In this post I would like to clarify the difference between a rate limiter and a rate gate.

Similarity

  • Both concepts can be used to throttle requests.
  • They sit between the clients and the server and they know about the server's capacity.

Difference

  • The limiter, as its name implies, limits transient traffic: it short-circuits requests when there are too many.

  • The gate, on the other hand, holds/delays requests until there is enough capacity (see the sketch under Algorithms below).

Algorithms

Both concepts are typically built on the same well-known algorithms, such as the token bucket, leaky bucket, and fixed- or sliding-window counters. What differs is what happens when a request finds no capacity left: the limiter rejects it, the gate parks it until capacity frees up.
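
As a deliberately naive illustration (mine, not part of the original answer), a gate can be as crude as spacing calls evenly across the window, i.e. 150 calls per 10 seconds become one call roughly every 66 ms, with no burst allowance:

using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SimpleRateGate
{
    private readonly SemaphoreSlim _mutex = new SemaphoreSlim(1, 1);
    private readonly TimeSpan _minInterval;
    private DateTime _nextSlotUtc = DateTime.MinValue;

    public SimpleRateGate(int maxCount, TimeSpan interval)
        => _minInterval = TimeSpan.FromTicks(interval.Ticks / maxCount);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action)
    {
        await _mutex.WaitAsync();
        try
        {
            TimeSpan wait = _nextSlotUtc - DateTime.UtcNow;
            if (wait > TimeSpan.Zero)
                await Task.Delay(wait);      // hold the caller until its slot opens
            _nextSlotUtc = DateTime.UtcNow + _minInterval;
        }
        finally
        {
            _mutex.Release();
        }
        return await action();               // run the actual call outside the lock
    }
}

Call sites would then wrap each request, e.g. await gate.ExecuteAsync(() => httpClient.GetAsync(url)). Real implementations (token bucket and friends) allow bursts instead of strictly even spacing.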

Alpenstock answered 14/9, 2022 at 11:1 Comment(4)
Thanks a lot for clarifying the terminology, I was not aware of that. Obviously the author of the RateLimiter package that I now use was not aware of it either, so the package should actually be named RateGate or something like that, as it clearly implements timing functionality.Came
@Came Yes, exactly; a more fitting name for that package would be RateGate.Alpenstock
Do you have any sources you can cite for the term "rate gate"? I've only heard this called "throttling".Climatology
@Climatology I read about the term 'rate gate' in a whitepaper which listed common and less common resilience strategies (like rate gate, hedging, load shedding, etc.). I'll try to find that whitepaper; I'm sure it is still available.Alpenstock

Thanks again to @Neil and @Panagiotis for pointing me in the right direction. I wrongly assumed that the Polly rate limiter would actually delay API calls. I found a workaround that is probably not particularly nice, but for my purposes it does the trick.

I installed David Desmaisons' RateLimiter package, which is super simple to use. In my singleton I now have this:

public TimeLimiter RateLimiter = TimeLimiter.GetFromMaxCountByInterval(150, TimeSpan.FromSeconds(10));

I use this RateLimiter everywhere I make calls to an API endpoint like this:

HttpResponseMessage httpResponse = await ApiWrapper.Instance.RateLimiter.Enqueue(() => _httpClient.GetAsync("http://some.endpoint"), _cancellationToken);

This does exactly what I originally expected from Polly.
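
If I read the package's README correctly, the same TimeLimiter can also be exposed as a DelegatingHandler (via its companion ComposableAsync library) and plugged into the named client from my startup, so individual call sites would not need to know about the limiter at all. A sketch, untested in my project:

using ComposableAsync;   // AsDelegatingHandler()
using RateLimiter;       // TimeLimiter

// One shared gate; every handler the factory creates wraps the same instance,
// so the 150-per-10-seconds budget is shared across all clients.
var gate = TimeLimiter.GetFromMaxCountByInterval(150, TimeSpan.FromSeconds(10));

builder.Services
    .AddHttpClient("APIClient", client =>
    {
        client.BaseAddress = new Uri(C.Configuration.GetValue<string>("APIBaseAddress"));
    })
    .AddHttpMessageHandler(() => gate.AsDelegatingHandler());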

Came answered 14/9, 2022 at 8:10 Comment(1)
A life-saver! 🆘Quinquefid
