We use Lambda (behind API Gateway) to power APIs consumed by news media websites, so traffic is high and fluctuates a lot. We began experiencing throttles, so we raised our concurrency limit to 2000. However, we still get throttled multiple times per day.
Oddly, in the CloudWatch metrics, concurrent executions peak at around 600 or lower when we're throttled. See this CloudWatch chart as an example:
Has anyone experienced this before? Why do you think this is happening? What can we do about it?
More Information
- This chart is across all Lambdas for our entire region.
- When throttling occurs, it happens across all Lambda instances.
- We primarily trigger Lambdas via API Gateway, but there are a few that are triggered via SNS (at a fairly high rate of data).
- We have CloudFront in front of all APIs, and on some of them we have a 5-second cache time (for the most frequently requested APIs; it saves us $$$).
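For anyone who wants to compare numbers, the region-wide series can be pulled at 1-minute resolution with CloudWatch `GetMetricData`. This is a minimal sketch, assuming the standard `AWS/Lambda` metrics `ConcurrentExecutions` and `Throttles` queried with no dimensions (which should aggregate across all functions in the region). The helper names `build_queries` and `fetch_region_metrics` and the query ids are made up for illustration, and the fetch part needs `boto3` plus credentials:

```python
from datetime import datetime, timedelta, timezone


def build_queries(period_seconds=60):
    """Build MetricDataQueries for region-wide Lambda concurrency and throttles."""
    # (query id, CloudWatch metric name, statistic); ids are arbitrary labels.
    specs = [
        ("concurrency", "ConcurrentExecutions", "Maximum"),
        ("throttles", "Throttles", "Sum"),
    ]
    return [
        {
            "Id": query_id,
            "MetricStat": {
                # No Dimensions key: metrics across all functions in the region.
                "Metric": {"Namespace": "AWS/Lambda", "MetricName": metric_name},
                "Period": period_seconds,
                "Stat": stat,
            },
            "ReturnData": True,
        }
        for query_id, metric_name, stat in specs
    ]


def fetch_region_metrics(hours=3):
    """Fetch both series for the last `hours` hours (requires boto3 and credentials)."""
    import boto3  # imported lazily so build_queries works without boto3 installed

    end = datetime.now(timezone.utc)
    response = boto3.client("cloudwatch").get_metric_data(
        MetricDataQueries=build_queries(),
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
    )
    # Map each query id to a time-sorted list of (timestamp, value) pairs.
    return {
        result["Id"]: sorted(zip(result["Timestamps"], result["Values"]))
        for result in response["MetricDataResults"]
    }
```

Calling `fetch_region_metrics()` should return one list per query id, which makes it easy to line up the throttle counts against the concurrency maximum for the same minute.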
Additionally, here's an image that also shows total invocation count and average duration over the same time period. It's hard to know what's causal: duration could be up because of throttling, or the other way around, since some of the Lambdas call other Lambdas. Note that each series is plotted against its own axis, because the scales are quite different.
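As a rough sanity check on how invocation count and duration relate to concurrency, Little's law says average concurrency is approximately the invocation rate multiplied by the average duration. A small sketch with made-up numbers (not our actual figures); `estimated_concurrency` is just an illustrative helper:

```python
def estimated_concurrency(invocations_per_minute, avg_duration_ms):
    """Little's law: average in-flight executions = arrival rate x time in system."""
    invocations_per_second = invocations_per_minute / 60
    return invocations_per_second * (avg_duration_ms / 1000)


# Hypothetical numbers: 120,000 invocations in one minute at 250 ms average duration.
print(estimated_concurrency(120_000, 250))  # 500.0
```

This gives an average over the whole minute, so short spikes within that minute can sit well above the estimate.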