In the requests library, how can I avoid the "HttpConnectionPool is full, discarding connection" warning?

I'm using the Python requests library with sessions:

def _get_session(self):
    if not self.session:
        self.session = requests.Session()
    return self.session

And sometimes I'm getting this warning in my logs:

[2014/May/12 14:40:04 WARNING ] HttpConnectionPool is full, discarding connection: www.ebi.ac.uk

My question is: why is this a warning and not an exception?

This is the code responsible for this (from http://pydoc.net/Python/requests/0.8.5/requests.packages.urllib3.connectionpool/):

def _put_conn(self, conn):
    try:
        self.pool.put(conn, block=False)
    except Full:
        # This should never happen if self.block == True
        log.warning("HttpConnectionPool is full, discarding connection: %s"
                    % self.host)

Why is this exception caught here? If it were re-raised, I could handle it in my code by creating a new session and deleting the old one.

If it's only a warning, does it mean it doesn't affect my results in any way? Can I ignore it? If not, how can I handle this situation?

Vacuity answered 13/5, 2014 at 13:36 Comment(7)
Did you try setting self.block to True? - Glendon
Do I really want my requests to block? Maybe this warning will disappear, but are there any other consequences? There is some reason this is not True by default, right? - Vacuity
:param block: If set to True, no more than **maxsize** connections will be used at a time. When no free connections are available, the call will block until a connection has been released. - Glendon
So, when there are no free connections, you either have to have attempted connections block, or to discard them. Your choice. - Glendon
But does that mean that if it's set to False, more than maxsize connections will be used, so I'm safe anyway? - Vacuity
No. If it's set to False, any attempted connections past maxsize are simply discarded on the spot (as shown in your log). - Glendon
@roippi: wrong. If block==False, any attempted connections past maxsize will be performed normally. The only difference is that those extra connections will not be kept in the pool afterwards, hence "discarded". That's why this is just a warning, not an exception. All connections are made. - Senna

From the Requests docs at http://docs.python-requests.org/en/latest/api/:

 class requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10, max_retries=0, pool_block=False)

The built-in HTTP Adapter for urllib3.

Provides a general-case interface for Requests sessions to contact HTTP and HTTPS urls by implementing the Transport Adapter interface. This class will usually be created by the Session class under the covers.

Parameters:

  • pool_connections – The number of urllib3 connection pools to cache.
  • pool_maxsize – The maximum number of connections to save in the pool.
  • max_retries (int) – The maximum number of retries each connection should attempt. Note, this applies only to failed connections and timeouts, never to requests where the server returns a response.
  • pool_block – Whether the connection pool should block for connections.

A little below that comes an example:

import requests
s = requests.Session()
a = requests.adapters.HTTPAdapter(max_retries=3)
s.mount('http://', a)

Try this:

a = requests.adapters.HTTPAdapter(pool_connections=N, pool_maxsize=M)

where N and M are suitable for your program, and mount the adapter on your session with s.mount() as in the example above.
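As a minimal sketch of that suggestion, the helper below builds a session with a larger pool and mounts the adapter for both URL schemes. The name build_session and the numbers used are placeholders rather than recommendations. It also assumes you prefer pool_block=True, so that requests beyond pool_maxsize to the same host wait for a free connection instead of opening extra ones that are later discarded.

```python
import requests
from requests.adapters import HTTPAdapter


def build_session(pool_connections=10, pool_maxsize=10, pool_block=True):
    """Return a Session whose HTTP and HTTPS traffic share one adapter.

    build_session is a hypothetical helper; the defaults are placeholders.
    """
    session = requests.Session()
    adapter = HTTPAdapter(
        pool_connections=pool_connections,  # number of per-host pools to cache
        pool_maxsize=pool_maxsize,          # connections kept in each pool
        pool_block=pool_block,              # wait for a free connection when True
    )
    # Mount for both schemes; otherwise https:// URLs keep the default adapter.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


session = build_session(pool_maxsize=20)
```

With pool_block=False (the library default) the same code still works; surplus connections are then opened on demand and closed after use, which is when the warning is logged.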

Tinctorial answered 13/5, 2014 at 13:49 Comment(6)
In my case, the ideal value for M is infinity :) If I use any specific value of M, then I need to count how many requests have been made in this session and then discard it, right? - Vacuity
Are you firing connections in parallel or in sequence? How about explicitly calling close() on Response objects? - Tinctorial
Then you have to choose how to handle reaching the pool limit: either increase the limit or wait before making new requests. Another approach is to look for code that keeps unnecessary Response objects around and close them explicitly. - Tinctorial
I guess waiting makes more sense here. OK, thanks a lot! - Vacuity
I am wondering, is there a formula to calculate the M value? - Pillage
@jerryleooo: a good starting value is the number of worker threads you're using. - Senna
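The rule of thumb from the last comment (start with M equal to the number of worker threads) could be sketched roughly as follows. WORKERS, make_session and fetch_all are hypothetical names, the timeout is an arbitrary choice, and sharing one Session between threads is an assumption made here for brevity, not something this thread establishes as safe.

```python
from concurrent.futures import ThreadPoolExecutor

import requests
from requests.adapters import HTTPAdapter

WORKERS = 8  # hypothetical thread count; pick what suits your program


def make_session(workers):
    """Build a session whose pool can hold one connection per worker thread."""
    session = requests.Session()
    adapter = HTTPAdapter(pool_maxsize=workers)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def fetch_all(urls, workers=WORKERS):
    """Fetch every URL with a thread pool and return the status codes in order."""
    session = make_session(workers)

    def fetch(url):
        # At most `workers` requests are in flight, so every connection
        # fits back into the pool when its request finishes.
        return session.get(url, timeout=10).status_code

    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(fetch, urls))
```

Because the number of concurrent requests never exceeds pool_maxsize, no connection should have to be discarded when it is returned to the pool.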

I'd like to clarify some stuff here.

What the pool_maxsize argument does is limit the number of TCP connections that can be stored in the connection pool simultaneously. Normally (with the default pool_block=False), when you want to execute an HTTP request, requests will try to take a TCP connection from its connection pool. If there are no available connections, requests will create a new TCP connection, and when it is done making the HTTP request, it will try to put the connection back in the pool (it does not remember whether the connection was taken from the pool or not).

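The "HttpConnectionPool is full" warning logged by the requests code is just an example of a common Python pattern usually paraphrased as "it is easier to ask for forgiveness than permission": the code simply tries to put the connection back and handles the Full exception if there is no room. It has nothing to do with dropping TCP connections that are in use; the request has already completed, and only the surplus connection is not kept for reuse.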
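To make this concrete, here is a toy model of the behaviour described above. It is a simplified stand-in written for illustration, not urllib3's actual implementation: ToyPool and its counters are invented names, and connections are represented by plain strings.

```python
import queue


class ToyPool:
    """Simplified stand-in for a non-blocking connection pool."""

    def __init__(self, maxsize):
        self.pool = queue.LifoQueue(maxsize)
        self.created = 0
        self.discarded = 0

    def get_conn(self):
        # Reuse a pooled connection if one exists, otherwise create a new one.
        try:
            return self.pool.get(block=False)
        except queue.Empty:
            self.created += 1
            return f"conn-{self.created}"

    def put_conn(self, conn):
        # Ask forgiveness, not permission: try to store, handle Full afterwards.
        try:
            self.pool.put(conn, block=False)
        except queue.Full:
            self.discarded += 1  # this is where the real code logs the warning


pool = ToyPool(maxsize=1)
in_flight = [pool.get_conn() for _ in range(3)]  # three overlapping requests
for conn in in_flight:
    pool.put_conn(conn)  # each request has already finished at this point

print(pool.created, pool.discarded, pool.pool.qsize())
```

All three simulated requests get a connection; only the two connections that do not fit back into the pool of size one are thrown away, which is the situation the warning reports.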

Damicke answered 18/4, 2019 at 13:1 Comment(0)
