Change the connection pool size for Python's "requests" module when using threading

(edit: Perhaps I am wrong in what this error means. Is this indicating that the connection pool at my CLIENT is full? or a connection pool at the SERVER is full and this is the error my client is being given?)

I am attempting to make a large number of HTTP requests concurrently using the Python threading and requests modules. I am seeing this warning in the logs:

WARNING:requests.packages.urllib3.connectionpool:HttpConnectionPool is full, discarding connection:

What can I do to increase the size of the connection pool for requests?

Myramyrah answered 27/8, 2013 at 12:55 Comment(0)

This should do the trick:

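import requests.adapters

session = requests.Session()
adapter = requests.adapters.HTTPAdapter(pool_connections=100, pool_maxsize=100)
# An adapter only handles URLs that match its mount prefix, so mount both schemes.
session.mount('http://', adapter)
session.mount('https://', adapter)
# requests needs an absolute URL; example.com is a placeholder host.
response = session.get("http://example.com/mypage")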
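The configured pool size can also be checked without sending a request. The following is a minimal sketch, assuming a reasonably recent requests release; `make_session`, `configured_maxsize`, and the example.com URL are illustrative names that are not part of requests itself.

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(pool_maxsize=10, pool_connections=10):
    """Return a Session whose http:// and https:// adapters use the given pool sizes."""
    session = requests.Session()
    adapter = HTTPAdapter(pool_connections=pool_connections, pool_maxsize=pool_maxsize)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def configured_maxsize(session, url):
    """Read the per-host pool size that the adapter chosen for `url` passes to urllib3."""
    return session.get_adapter(url).poolmanager.connection_pool_kw["maxsize"]


session = make_session(pool_maxsize=32)
print(configured_maxsize(session, "https://example.com/"))
```

Reading the value through `get_adapter()` avoids relying on the adapter's private attributes.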
Rooke answered 17/9, 2013 at 9:21 Comment(9)
That worked after replacing http with https. Also, I think pool_connections is unnecessary. - Panpipe
Does each session have its own connection pool, or do multiple sessions share a connection pool? - Panpipe
@Panpipe it's probably possible to share one by mounting the same adapter instance on multiple sessions, but it's probably not a good idea. - Acidfast
How can I check the size of the current pool before increasing it? - Wince
@JohnStrood check out session.adapters['https://']._pool_maxsize and session.adapters['https://']._pool_connections. It seems both of them are equal to 10 by default. - Hum
Note that (pool_connections=100, pool_maxsize=100) are very high values. You should tailor them to your actual scenario, considering the number of different hosts you're connecting to and how many worker threads you're using. - Matthaeus
@JohnStrood: a more compliant way to check it for a given URL, without relying on any "private" attributes, would be sess.get_adapter(url).poolmanager.connection_pool_kw['maxsize']. - Matthaeus
@Rooke - nice one - how would I use this with webdriver/selenium? Is there something like webdriver=webdriver.Session()? - Stollings
@JohnStrood - could you give an example of how you would implement that? - Clippard

Note: Use this solution only if you cannot control the construction of the connection pool (as described in @Jahaja's answer).

The problem is that urllib3 creates the pools on demand. It calls the constructor of the urllib3.connectionpool.HTTPConnectionPool class without extra parameters. The classes are registered in urllib3.poolmanager.pool_classes_by_scheme. The trick is to replace the classes with your own classes that have different default parameters:

def patch_http_connection_pool(**constructor_kwargs):
    """
    Override the default parameters of the
    HTTPConnectionPool constructor.
    For example, to increase the pool size to fix problems
    with "HttpConnectionPool is full, discarding connection",
    call this function with maxsize=16 (or whatever size
    you want to give to the connection pool).
    """
    from urllib3 import connectionpool, poolmanager

    class MyHTTPConnectionPool(connectionpool.HTTPConnectionPool):
        def __init__(self, *args, **kwargs):
            # The patched defaults take precedence over what the caller passed.
            kwargs.update(constructor_kwargs)
            super(MyHTTPConnectionPool, self).__init__(*args, **kwargs)
    poolmanager.pool_classes_by_scheme['http'] = MyHTTPConnectionPool

Then you can call it to set new default parameters. Make sure it is called before any connection is made.

patch_http_connection_pool(maxsize=16)

If you use https connections you can create a similar function:

def patch_https_connection_pool(**constructor_kwargs):
    """
    Override the default parameters of the
    HTTPSConnectionPool constructor.
    For example, to increase the pool size to fix problems
    with "HttpsConnectionPool is full, discarding connection",
    call this function with maxsize=16 (or whatever size
    you want to give to the connection pool).
    """
    from urllib3 import connectionpool, poolmanager

    class MyHTTPSConnectionPool(connectionpool.HTTPSConnectionPool):
        def __init__(self, *args, **kwargs):
            # The patched defaults take precedence over what the caller passed.
            kwargs.update(constructor_kwargs)
            super(MyHTTPSConnectionPool, self).__init__(*args, **kwargs)
    poolmanager.pool_classes_by_scheme['https'] = MyHTTPSConnectionPool
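The two functions differ only in the scheme, so they can be folded into one. What follows is a hedged variation on the approach above, not the answer author's code: `patch_connection_pool` is a hypothetical helper, it assumes urllib3 is importable as a top-level package, and the final check reads the pool's internal queue (`pool.pool.maxsize`), which is an implementation detail rather than a documented interface.

```python
from urllib3 import poolmanager


def patch_connection_pool(scheme, **constructor_kwargs):
    """Wrap the pool class registered for `scheme` so it always gets these kwargs.

    Returns the previously registered class so the patch can be undone.
    """
    original_cls = poolmanager.pool_classes_by_scheme[scheme]

    class PatchedConnectionPool(original_cls):
        def __init__(self, *args, **kwargs):
            # Patched values override whatever the caller passed in.
            kwargs.update(constructor_kwargs)
            super().__init__(*args, **kwargs)

    poolmanager.pool_classes_by_scheme[scheme] = PatchedConnectionPool
    return original_cls


# Patch both schemes before any connection is made.
originals = {s: patch_connection_pool(s, maxsize=16) for s in ("http", "https")}

# Pools created from now on use the new default; no request is sent here.
pool = poolmanager.PoolManager().connection_from_url("http://example.com/")
print(pool.pool.maxsize)
```

Keeping the returned classes makes it possible to restore the original registry entries later, which matters because the patch affects every urllib3 user in the process.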
Glyceric answered 7/3, 2014 at 15:6 Comment(9)
Requests has a built-in API for supplying ConnectionPool constructor params; patching the constructor is unnecessary. (See @Jahaja's answer.) - Soulier
It depends on the context. If you have control over creating the HTTPAdapter, using the constructor is the correct solution. But there are cases where the connection pool is initialised somewhere deeply buried in some framework or library. In those cases you can patch the library or patch the connection pool constructor as I have described above. - Glyceric
I added a clarification to my solution. - Glyceric
I suppose that's a valuable reference, but honestly it's answering a different question. :) The original question pertains specifically to how to change this in Requests, not a different hypothetical library. - Soulier
Yes, it may be the answer to a different question, but this is the question that I found when I searched for something like: HttpConnectionPool is full, discarding connection python. But the solution did not help me, because my connection pool is created by some library (in my case pyes). - Glyceric
@Glyceric did you ever find a way to silence this warning in the case of not having control of the code/library? - Hattiehatton
@shazow, firstly, ConnectionPool is just a base class; the only thing you can do is subclass it, and it does not accept pool_maxsize or any other parameter (only host and port). Secondly, the original question was specifically about the requests/urllib3 libraries, because they are the best Pythonic solution for handling HTTP, so I don't see any reason not to answer specifically in the context of those libs. - Sleeping
@Glyceric I was wondering how you could obtain the size of the existing connection pool before increasing it? - Wince
Any way to also increase the number of pools used? It's 10 by default... github.com/urllib3/urllib3/blob/… - Vulva

Jahaja's answer already gives the recommended solution to your problem, but it does not answer what is going on or, as you asked, what this error means.

Some very detailed information about this is in the official urllib3 documentation; urllib3 is the package requests uses under the hood to actually perform its requests. Here are the relevant parts for your question, with a few notes of my own added and the code examples omitted, since requests has a different API:

The PoolManager class automatically handles creating ConnectionPool instances for each host as needed. By default, it will keep a maximum of 10 ConnectionPool instances [Note: That's pool_connections in requests.adapters.HTTPAdapter(), and it has the same default value of 10]. If you’re making requests to many different hosts it might improve performance to increase this number

However, keep in mind that this does increase memory and socket consumption.

Similarly, the ConnectionPool class keeps a pool of individual HTTPConnection instances. These connections are used during an individual request and returned to the pool when the request is complete. By default only one connection will be saved for re-use [Note: That's pool_maxsize in HTTPAdapter(), and requests changes the default value from 1 to 10]. If you are making many requests to the same host simultaneously it might improve performance to increase this number

The behavior of the pooling for ConnectionPool is different from PoolManager. By default, if a new request is made and there is no free connection in the pool then a new connection will be created. However, this connection will not be saved if more than maxsize connections exist. This means that maxsize does not determine the maximum number of connections that can be open to a particular host, just the maximum number of connections to keep in the pool. However, if you specify block=True [Note: Available as pool_block in HTTPAdapter()] then there can be at most maxsize connections open to a particular host

Given that, here's what happened in your case:

  • All pools mentioned are CLIENT pools. You (or requests) have no control over any server connection pools.
  • That warning is about HttpConnectionPool, i.e., the pool of connections kept for a single host, so you could increase pool_maxsize to match the number of workers/threads you're using to get rid of the warning.
  • Note that requests is already opening as many simultaneous connections as you ask for, regardless of pool_maxsize. If you have 100 threads, it will open 100 connections. But with the default value only 10 of them will be kept in the pool for later reuse, and 90 will be discarded after completing the request.
  • Thus, a larger pool_maxsize increases performance to a single host by reusing connections, not by increasing concurrency.
  • If you're dealing with multiple hosts, then you might change pool_connections instead. The default is 10 already, so if all your requests are to the same target host, increasing it will not have any effect on performance (but it will increase the resources used, as noted in the documentation quoted above).
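The points above can be sketched as a threaded fetch in which pool_maxsize follows the worker count. This is only an illustration under stated assumptions: `make_threaded_session` and `fetch_all` are hypothetical helper names, and sharing one Session across threads is common practice rather than a documented guarantee of requests.

```python
from concurrent.futures import ThreadPoolExecutor

import requests
from requests.adapters import HTTPAdapter


def make_threaded_session(workers, block=False):
    """Session whose per-host pool keeps one connection per worker thread.

    With block=True, threads wait for a free connection instead of opening
    extra ones, so at most `workers` connections are open per host.
    """
    session = requests.Session()
    adapter = HTTPAdapter(pool_maxsize=workers, pool_block=block)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def fetch_all(urls, workers=20, timeout=10):
    """Fetch `urls` concurrently through one shared session; return status codes in order."""
    with make_threaded_session(workers) as session:
        with ThreadPoolExecutor(max_workers=workers) as executor:
            responses = executor.map(lambda url: session.get(url, timeout=timeout), urls)
            return [response.status_code for response in responses]
```

Because the pool holds as many connections as there are workers, every connection can be returned to the pool after its request, so none should be discarded when all URLs point at one host.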
Matthaeus answered 17/3, 2021 at 11:42 Comment(0)

In case anyone needs to do this with Python Zeep and wants to save a bit of time figuring it out, here is a quick recipe:

from zeep import Client
from requests import adapters as request_adapters

soap = "http://example.com/BLA/sdwl.wsdl"
wsdl_path = "http://example.com/PATH/TO_WSLD?wsdl"
bind = "Binding"
client = Client(wsdl_path)  # Create Client

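# switch adapter (10 is already the requests default; raise the values as needed)
session = client.transport.session
adapter = request_adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10)
# mount adapter for both schemes (the example URLs above use http://)
session.mount('http://', adapter)
session.mount('https://', adapter)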
binding = '{%s}%s' % (soap, bind)

# Create Service
service = client.create_service(binding, wsdl_path.split('?')[0])

Basically, the adapter should be mounted before creating the service.

The answer is actually taken from a closed issue in the python-zeep repo; for reference, I'll add it here.

Incurious answered 13/1, 2022 at 16:29 Comment(0)
