Can't control rate limit on Google Cloud Tasks API

I'm trying to rate limit Google Cloud Tasks to no more than 1 processed task per second.

I've created my queue with:

gcloud tasks queues create my-queue \
          --max-dispatches-per-second=1 \
          --max-concurrent-dispatches=1 \
          --max-attempts=2 \
          --min-backoff=60s

Describing it gives me:

name: projects/my-project/locations/us-central1/queues/my-queue
rateLimits:
  maxBurstSize: 10
  maxConcurrentDispatches: 1
  maxDispatchesPerSecond: 1.0
retryConfig:
  maxAttempts: 2
  maxBackoff: 3600s
  maxDoublings: 16
  minBackoff: 60s
state: RUNNING

After creating a bunch of tasks, I can see in the logs that many of them are undesirably processed within the same second:

2019-07-27 02:37:48 default[20190727t043306]  Received task with payload: {'id': 51}
2019-07-27 02:37:48 default[20190727t043306]  "POST /my_handler HTTP/1.1" 200
2019-07-27 02:37:49 default[20190727t043306]  Received task with payload: {'id': 52}
2019-07-27 02:37:49 default[20190727t043306]  "POST /my_handler HTTP/1.1" 200
2019-07-27 02:37:49 default[20190727t043306]  Received task with payload: {'id': 53}
2019-07-27 02:37:49 default[20190727t043306]  "POST /my_handler HTTP/1.1" 200
2019-07-27 02:37:49 default[20190727t043306]  Received task with payload: {'id': 54}
2019-07-27 02:37:49 default[20190727t043306]  "POST /my_handler HTTP/1.1" 200
2019-07-27 02:37:49 default[20190727t043306]  Received task with payload: {'id': 55}
2019-07-27 02:37:49 default[20190727t043306]  "POST /my_handler HTTP/1.1" 200
2019-07-27 02:37:49 default[20190727t043306]  Received task with payload: {'id': 56}
2019-07-27 02:37:49 default[20190727t043306]  "POST /my_handler HTTP/1.1" 200
2019-07-27 02:37:49 default[20190727t043306]  Received task with payload: {'id': 57}
2019-07-27 02:37:49 default[20190727t043306]  "POST /my_handler HTTP/1.1" 200
2019-07-27 02:37:49 default[20190727t043306]  Received task with payload: {'id': 58}

How do I properly enforce the queue to dispatch no more than 1 task per second?

Update 30/07:

I've tried again with a basic setup and hit the same issue.

More details on setup and process:

  1. Source code https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/appengine/flexible/tasks, no modifications
  2. Deploy app.yaml, not app.flexible.yaml
  3. Trigger a task several times: python create_app_engine_queue_task.py --project=$PROJECT_ID --queue=$QUEUE_ID --location=$LOCATION_ID --payload=hello (a rough sketch of what this script does follows this list)
  4. Check logs: gcloud app logs read
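
For reference, here is roughly what create_app_engine_queue_task.py from step 3 does: it creates a task that targets the App Engine handler through the google-cloud-tasks client. This is only a sketch, not the sample verbatim; exact client signatures and enum locations vary between library versions, and the project/location/queue values are placeholders.

from google.cloud import tasks_v2

client = tasks_v2.CloudTasksClient()
# Placeholder project/location/queue; the sample reads these from CLI flags.
parent = client.queue_path("my-project", "us-central1", "my-queue")
# Task aimed at the sample's App Engine handler; --payload ends up in the body.
task = {
    "app_engine_http_request": {
        "http_method": tasks_v2.HttpMethod.POST,
        "relative_uri": "/example_task_handler",
        "body": b"hello",
    }
}
response = client.create_task(parent=parent, task=task)
print("Created task %s" % response.name)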

This time the tasks took a while to start processing, but once they did, it seems they were all processed more or less simultaneously:

Full logs:

2019-07-30 00:22:37 default[20190730t021951]  [2019-07-30 00:22:37 +0000] [9] [INFO] Starting gunicorn 19.9.0
2019-07-30 00:22:37 default[20190730t021951]  [2019-07-30 00:22:37 +0000] [9] [INFO] Listening at: http://0.0.0.0:8081 (9)
2019-07-30 00:22:37 default[20190730t021951]  [2019-07-30 00:22:37 +0000] [9] [INFO] Using worker: threads
2019-07-30 00:22:37 default[20190730t021951]  [2019-07-30 00:22:37 +0000] [23] [INFO] Booting worker with pid: 23
2019-07-30 00:22:37 default[20190730t021951]  [2019-07-30 00:22:37 +0000] [26] [INFO] Booting worker with pid: 26
2019-07-30 00:27:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:27:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:27:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:27:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:27:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:27:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:41 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:41 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:42 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:42 default[20190730t021951]  Received task with payload: hello
2019-07-30 00:37:43 default[20190730t021951]  "POST /example_task_handler HTTP/1.1" 200
2019-07-30 00:37:43 default[20190730t021951]  Received task with payload: hello
Caseous asked 27/7, 2019 at 3:14 Comment(15)
I've recreated your queue settings and I'm properly seeing tasks being executed at 1 per second. Can you give more details on your process and setup?Palacio
@AveriKitsch I've added some detailsCaseous
@SamuelRizzo, can you also set maxBurstSize/max-burst-size to 1? I think this may have to do with the burstUntouchable
@TarunLalwani this property is read-only, its value is automatically picked by the platform, according to cloud.google.com/tasks/docs/reference/rest/v2/…. I don't know how to manipulate it to become 1.Caseous
I would think maxBurstSize would be set to the same as maxDispatchesPerSecond. Maybe it's a bug on the platform, or I'm failing to see how it's supposed to work.Caseous
On the initial looks, it looks like a bug, you should open a support ticket. Because as per the params definition, you are doing everything rightUntouchable
Max Burst Size is not an editable field. It is set by the service. It doesn't affect this. I haven't seen or heard of any bugs with this system, so we will need to cover our bases. Are you using the correct queue? You have created "my-queue" but if you follow the sample exactly it uses "my-appengine-queue". Can you describe your queue again to make sure the parameters are still set how you want them?Palacio
Also check in your console to make sure the parameters are set correctly console.cloud.google.com/cloudtasksPalacio
Pausing and restarting your queue also causes delays in processing.Palacio
@AveriKitsch Thank you. I have double checked I'm using the correct queue (I even had only 1 when I first ran into this issue) and I have also double checked the params in the console. Isn't Max Burst Size related to that? Curiously, in the logs there are exactly 10 tasks processed at 00:37:41. my-queue is not my queue name, I have replaced it for privacy reasons (and set the correct env var in the example)Caseous
@SamuelRizzo Thanks for checking these parameters. Max Burst Size shouldn't be a factor here. However, I am in contact with another engineer to see if this is potentially a bug.Palacio
I'm curious: if you send 100 tasks, does your queue finally start processing at the 1/s rate after the first 10?Palacio
@AveriKitsch I've tested it with 100, that's exactly what happened. I've put the logs here gist.github.com/srizzo/38b7a810339c997deff817e8d1725b04.Caseous
@AveriKitsch you can also see a screenshot of my console dropbox.com/s/bpr8gaawuexbgmz/tasks.png?dl=0. See how the parameters are correctly set, and how there are 65 completed tasks in the last minute. This number grows up to 70 and then starts to go down. I'm still suspicious Max Burst Size is what causes it.Caseous
@SamuelRizzo Thank you so much for all this info. After the first 2 seconds, in which 11 tasks are processed, it does go down to 1 task per second. I will get more information on why this is happening and how to prevent it.Palacio

tl;dr It's probably working as intended. You can expect an initial burst of maxBurstSize tasks, after which dispatch slows down to maxDispatchesPerSecond.

The reason for this is the "token bucket" algorithm: there is a bucket that can hold at most maxBurstSize tokens, and it starts out full. A task is dispatched only if its scheduled time has arrived AND there is a token in the bucket AND fewer than maxConcurrentDispatches tasks are in flight; otherwise it waits for those conditions to be met. When a task is dispatched, a token is removed from the bucket. Whenever the bucket is not full, tokens are added back at a rate of maxDispatchesPerSecond.

So the rate is not precisely a limit on task dispatch. Tasks can be sent at an arbitrary rate as long as there are tokens in the bucket and tasks ready to run; only when tasks have to wait for tokens does dispatch slow down to the configured rate. Since the bucket starts full, you can get an initial burst.
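
To make this concrete, here is a minimal, self-contained Python sketch of the token-bucket timing described above, using the question's values maxBurstSize=10 and maxDispatchesPerSecond=1. It is only an illustration of the algorithm, not the actual Cloud Tasks implementation.

import time

def simulate_dispatch(num_tasks, max_burst_size=10, dispatches_per_second=1.0):
    start = time.monotonic()
    tokens = float(max_burst_size)  # the bucket starts full, hence the initial burst
    last_refill = start
    dispatched = 0
    while dispatched < num_tasks:
        now = time.monotonic()
        # Refill at dispatches_per_second, never exceeding max_burst_size.
        tokens = min(max_burst_size, tokens + (now - last_refill) * dispatches_per_second)
        last_refill = now
        if tokens >= 1.0:
            tokens -= 1.0  # each dispatch consumes one token
            dispatched += 1
            print("t=%5.2fs  dispatched task %d" % (now - start, dispatched))
        else:
            time.sleep(0.05)  # wait for the bucket to refill

simulate_dispatch(15)  # the first 10 go out at once, the rest at roughly 1/s

Run as-is it prints 10 dispatches immediately and then about one per second, matching the burst of roughly 10 tasks within the same second seen in the logs above.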

In the Cloud Tasks API and console the bucket size is read-only (the API calls it max_burst_size). But using the older queue.yaml configuration you can control the bucket size along with the other parameters, e.g.

queue:
- name: my-appengine-queue
  rate: 2/s                    # becomes maxDispatchesPerSecond
  bucket_size: 20              # becomes maxBurstSize
  max_concurrent_requests: 5   # becomes maxConcurrentDispatches

Then run gcloud app deploy queue.yaml. If you do that, however, be aware of these pitfalls: https://cloud.google.com/tasks/docs/queue-yaml#pitfalls
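
If you deploy the queue.yaml above, you can check that the new values took effect by describing the queue again, as in the question; bucket_size should then show up as maxBurstSize in the output (queue name taken from the example above):

gcloud tasks queues describe my-appengine-queue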

FYI there's an issue open to see if docs can be improved.

Motorcar answered 1/8, 2019 at 19:1 Comment(7)
Thank you. That means that right now, with the Tasks API, it's impossible to reliably limit dispatch to no more than 1 task per second if tasks are queued in bursts, which is my case. Are there plans to make bucket_size configurable in the Tasks API as well?Caseous
It was configurable with queue.yaml; then, when Cloud Tasks was introduced, it became read-only there. There must have been a reason for that, and some new information or need would probably have to come to light to reverse that decision. If you put a clump of tasks in the queue scheduled for the same time while the bucket has tokens, they will all come off at once. You could control things by spreading out the schedule (see the sketch below).Motorcar
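
Below is a hedged sketch of the "spreading out the schedule" idea from the comment above: each task gets its own schedule_time, one second apart, so no more than one becomes ready per second. The project/location/queue names and handler path are placeholders, and exact client signatures vary between google-cloud-tasks versions.

import datetime
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2

client = tasks_v2.CloudTasksClient()
parent = client.queue_path("my-project", "us-central1", "my-queue")
now = datetime.datetime.now(datetime.timezone.utc)
for i in range(30):
    # Schedule tasks 1 second apart so they never all become ready at once.
    schedule_time = timestamp_pb2.Timestamp()
    schedule_time.FromDatetime(now + datetime.timedelta(seconds=i))
    task = {
        "app_engine_http_request": {
            "http_method": tasks_v2.HttpMethod.POST,
            "relative_uri": "/example_task_handler",
            "body": ("task %d" % i).encode(),
        },
        "schedule_time": schedule_time,
    }
    client.create_task(parent=parent, task=task)
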
If I schedule each task with a 1-second gap... what if I have to pause the queue at some point, and resume it after more than 10 are now scheduled in the past... wouldn't I see the burst effect again?Caseous
I've opened a feature request, it would be nice to have this configuration ported to the new Tasks API issuetracker.google.com/issues/138813037Caseous
Yes, when the queue is unpaused, min(ready tasks, maxBurstSize) tasks will come off all at once. Subject to maxConcurrentDispatches of course, i.e. it depends on how fast they are handled. Is there some reason queue.yaml does not solve your problem?Motorcar
Thank you. I'm mostly evaluating APIs and alternatives and have already come up with a solution that worked for me. When I first opened this question I thought I was doing something wrong, but now it seems it's just a limitation of the Tasks API (a huge one, in my opinion). Regarding queue.yaml, I understand it's not an option if I want HTTP targets outside of App Engine (the reason I thought Tasks were awesome in the first place), but otherwise, yes, it would do what I wanted.Caseous
HTTP targets outside of App Engine are in beta: cloud.google.com/tasks/docs/creating-http-target-tasks As far as I know, everything about them, including parameters set by queue.yaml or otherwise, works the same as with targets inside App Engine. In case it wasn't clear, you CAN use queue.yaml to set the bucket size to 1 on your Cloud Tasks queues. It's NOT just for the old App Engine queues.Motorcar

The queues created using queue.yaml and the queues created by Cloud Tasks - however you do it - are the same queues. There are, however, issues that can arise if you mix using queue.yaml and Cloud Tasks Queue Management methods. See https://cloud.google.com/tasks/docs/queue-yaml for more information.

Olio answered 25/10, 2019 at 0:39 Comment(1)
Thanks, I understand that. I assumed one would be able to do with the Cloud Tasks API everything one used to be able to do with queue.yaml, but that's just not the case right now. I've opened an issue and hope they will port it issuetracker.google.com/issues/138813037Caseous
