Occasionally I get a rate limit error even though I'm nowhere near my rate limit. I'm using the text completions endpoint on the paid API, which has a rate limit of 3,000 requests per minute. I'm sending at most 3-4 requests per minute.
Sometimes I get the following error from the API:
- Status code: 429 (Too Many Requests)
- OpenAI error type: server_error
- OpenAI error message: That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at help.openai.com if the error persists.
The OpenAI documentation states that a 429 error means you have exceeded your rate limit, which I clearly have not: https://help.openai.com/en/articles/6891829-error-code-429-rate-limit-reached-for-requests
The weird thing is that the OpenAI error message doesn't say that. It's the message I usually get with a 503 error (Service Unavailable).
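Since the message itself says the request can be retried, here is a minimal sketch of how the two kinds of 429 could be told apart and retried with exponential backoff. It is only an illustration under assumptions: `send_request` is a hypothetical callable returning `(status_code, body)`, the error body is assumed to look like `{"error": {"type": ..., "message": ...}}` based on the fields quoted above, and the attempt count and delays are arbitrary.

```python
import random
import time


def is_overload_error(status_code, body):
    """Return True when a 429 looks like server overload, not a rate limit.

    Assumes the body has the shape {"error": {"type": ..., "message": ...}};
    that shape is inferred from the error fields in this post.
    """
    error = body.get("error", {}) if isinstance(body, dict) else {}
    return status_code == 429 and error.get("type") == "server_error"


def call_with_backoff(send_request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send_request() until it stops returning 429/503 or attempts run out.

    send_request is a hypothetical callable returning (status_code, body).
    The wait before retry n is base_delay * 2**n plus up to 0.5s of jitter.
    """
    for attempt in range(max_attempts):
        status_code, body = send_request()
        if status_code not in (429, 503):
            return status_code, body
        if attempt == max_attempts - 1:
            break  # out of attempts: hand the last error back to the caller
        if is_overload_error(status_code, body):
            kind = "model overloaded"
        else:
            kind = "rate limited or unavailable"
        delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
        print(f"attempt {attempt + 1}: {kind}, retrying in {delay:.1f}s")
        sleep(delay)
    return status_code, body
```

Logging which kind of 429 came back would at least show how often the "overloaded" variant happens compared to a genuine rate limit.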
I'd love to hear some thoughts on this, any theories, or if anyone else has been experiencing this.