Instead of using requests, you could use httpx, which offers an async API as well. (httpx is also suggested in FastAPI's documentation for performing async tests, and FastAPI/Starlette recently replaced the HTTP client on TestClient from requests to httpx.)
The example below is based on the one given in the httpx documentation, demonstrating how to use the library to make an asynchronous HTTP(S) request and then stream the response back to the client. httpx.AsyncClient() is what you can use instead of requests.Session(), which is useful when several requests are made to the same host, as the underlying TCP connection is reused instead of being recreated for every single request, resulting in a significant performance improvement. Additionally, it allows you to reuse headers and other settings (such as proxies and timeout), as well as persist cookies, across requests. You spawn a Client once and reuse it every time you need it. You can use await client.aclose() to explicitly close the client once you are done with it (you could do that inside a shutdown event handler). Examples and more details can also be found in this answer.
Example
from fastapi import FastAPI
from starlette.background import BackgroundTask
from fastapi.responses import StreamingResponse
import httpx

app = FastAPI()


@app.on_event('startup')
async def startup_event():
    app.state.client = httpx.AsyncClient()


@app.on_event('shutdown')
async def shutdown_event():
    await app.state.client.aclose()


@app.get('/')
async def home():
    client = app.state.client
    req = client.build_request('GET', 'https://www.example.com/')
    r = await client.send(req, stream=True)
    return StreamingResponse(r.aiter_raw(), background=BackgroundTask(r.aclose))
Example (Updated)
Since the startup and shutdown events have now been deprecated (and might be completely removed in the future), you could instead use a lifespan handler to initialise the httpx Client, as well as close the Client instance on shutdown, similar to what has been demonstrated in this answer. Starlette specifically provides an example using a lifespan handler and an httpx Client on their documentation page. As described in Starlette's documentation:
"The lifespan has the concept of state, which is a dictionary that can be used to share the objects between the lifespan, and the requests."
"The state received on the requests is a shallow copy of the state received on the lifespan handler."
Hence, objects added to the state in the lifespan handler can be accessed inside endpoints using request.state. The example below uses a streaming response both to communicate with the external server and to send the response back to the client. See here for more details on the async response streaming methods of httpx (i.e., aiter_bytes(), aiter_text(), aiter_lines(), etc.).
If you would like to use a media_type for the StreamingResponse, you could use the one from the original response like this: media_type=r.headers['content-type']. However, as described in this answer, you need to make sure that the media_type is not set to text/plain; otherwise, the content would not stream as expected in the browser, unless you disable MIME sniffing (have a look at the linked answer for more details and solutions).
from fastapi import FastAPI, Request
from contextlib import asynccontextmanager
from fastapi.responses import StreamingResponse
from starlette.background import BackgroundTask
import httpx


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Initialise the Client on startup and add it to the state
    async with httpx.AsyncClient() as client:
        yield {'client': client}
        # The Client closes on shutdown


app = FastAPI(lifespan=lifespan)


@app.get('/')
async def home(request: Request):
    client = request.state.client
    req = client.build_request('GET', 'https://www.example.com')
    r = await client.send(req, stream=True)
    return StreamingResponse(r.aiter_raw(), background=BackgroundTask(r.aclose))
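To illustrate the media_type note above, here is a small sketch of a hypothetical helper (streaming_kwargs is not part of FastAPI or httpx) that forwards the upstream content type and, when that type is text/plain, adds the standard X-Content-Type-Options: nosniff header as one way of disabling MIME sniffing. The fallback to application/octet-stream when the upstream sends no content type is an assumption of this sketch.

```python
def streaming_kwargs(upstream_content_type):
    """Hypothetical helper: build media_type/headers for a StreamingResponse."""
    # Assumption: fall back to a generic binary type if the header is missing.
    media_type = upstream_content_type or "application/octet-stream"
    headers = {}
    # Compare only the type itself, ignoring parameters such as "; charset=utf-8".
    if media_type.split(";")[0].strip().lower() == "text/plain":
        # Ask the browser not to sniff the content, so text/plain can stream.
        headers["X-Content-Type-Options"] = "nosniff"
    return {"media_type": media_type, "headers": headers}
```

Inside the endpoint, the result could then be unpacked into the response, e.g., StreamingResponse(r.aiter_raw(), background=BackgroundTask(r.aclose), **streaming_kwargs(r.headers.get('content-type'))).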
If, for any reason, you need to read the content chunk by chunk on the server side before responding to the client, you could do this as follows:
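@app.get('/')
async def home(request: Request):
    client = request.state.client
    req = client.build_request('GET', 'https://www.example.com')
    r = await client.send(req, stream=True)

    async def gen():
        try:
            async for chunk in r.aiter_raw():
                yield chunk  # inspect or process each chunk here before yielding
        finally:
            # Close the response even if the stream ends early (e.g., the client disconnects)
            await r.aclose()

    return StreamingResponse(gen())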
If you don't want to use a streaming response, but would rather have httpx read the whole response for you up front (which loads the response data into the server's RAM; hence, you should make sure there is enough memory available to accommodate the data), you could use the following. Note that r.json() should only be used when the response data are in JSON format; otherwise, you could return a PlainTextResponse or a custom Response directly, as demonstrated below.
from fastapi import Response
from fastapi.responses import PlainTextResponse


@app.get('/')
async def home(request: Request):
    client = request.state.client
    req = client.build_request('GET', 'https://www.example.com')
    r = await client.send(req)  # without stream=True, the body is read into memory
    # The header may carry parameters (e.g., 'application/json; charset=utf-8')
    # or be missing altogether, so compare only the media type itself
    content_type = r.headers.get('content-type', '').split(';')[0].strip().lower()
    if content_type == 'application/json':
        return r.json()
    elif content_type == 'text/plain':
        return PlainTextResponse(content=r.text)
    else:
        return Response(content=r.content)
Using the async API of httpx means that you have to define your endpoints with async def; otherwise, you would have to use the standard synchronous API (for def vs async def, see this answer). As described in this GitHub discussion:
"Yes. HTTPX is intended to be thread-safe, and yes, a single client-instance across all threads will do better in terms of connection pooling, than using an instance-per-thread."
You can also control the connection pool size using the limits keyword argument on the Client (see Pool limit configuration). For example:
limits = httpx.Limits(max_keepalive_connections=5, max_connections=10)
client = httpx.Client(limits=limits)