Is the Session object from Python's Requests library thread safe?
Python's popular Requests library is said to be thread-safe on its home page, but no further details are given. If I call requests.session(), can I then safely pass this object to multiple threads like so:

import threading
import requests

# target and thread_count are defined elsewhere
session = requests.session()
for i in range(thread_count):
    threading.Thread(
        target=target,
        args=(session,),
        kwargs={}
    ).start()

and make requests using the same connection pool in multiple threads?

If so, is this the recommended approach, or should each thread be given its own connection pool? (Assuming the total size of all the individual connection pools summed to the size of what would be one big connection pool, like the one above.) What are the pros and cons of each approach?

Dino answered 12/8, 2013 at 13:19 Comment(4)
Did you figure out which is better? I'm currently running into nearly the same question. I was thinking of a new session for each thread so as not to bottleneck all requests in a single connection pool. - Aragats
@Marcel Wilson Not exactly. For one of my projects, where I was using a session object to request the same URL over and over again, I passed the same session object to all of the threads. The application does seem to work, but I am still not sure which approach is better. Note, though, that my problem was not bottlenecking the connection pools; it was opening too many connections and sending too many requests at a time. - Dino
Requests is built on top of urllib3. The thread-safety of requests is largely due to the thread-safety of urllib3, the documentation for which discusses thread safety in greater detail. - Lesser
@dg123 I ended up creating a session in the for loop. Each thread gets its own connection pool. - Aragats
F
34

After reviewing the source of requests.session, I'm going to say the session object might be thread-safe, depending on the implementation of CookieJar being used.

Session.prepare_request reads from self.cookies, and Session.send calls extract_cookies_to_jar(self.cookies, ...), and that calls jar.extract_cookies(...) (jar being self.cookies in this case).

The source for Python 2.7's cookielib acquires a lock (threading.RLock) while it updates the jar, so it appears to be thread-safe. On the other hand, the documentation for cookielib says nothing about thread-safety, so maybe this feature should not be depended on?

UPDATE

If your threads mutate any attribute of the session object (headers, proxies, stream, etc.), call its mount method, or use the session in a with statement, then sharing it is not thread-safe.
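
A minimal sketch of that distinction, assuming a shared session whose attributes are set once before any thread starts. Per-request arguments are merged into the prepared request without modifying the session itself. The URL and header names are made up for illustration, and prepare_request is used so that nothing is sent over the network.

```python
import requests

session = requests.Session()
# Configure shared state once, before handing the session to any thread.
session.headers["X-Shared"] = "base"


def build(url, extra_headers):
    # Per-request headers are merged with the session headers for this
    # request only; session.headers itself is left untouched.
    request = requests.Request("GET", url, headers=extra_headers)
    return session.prepare_request(request)


prepared = build("https://example.invalid/", {"X-Thread": "1"})
```

Writing to session.headers from inside a worker thread, by contrast, changes state that every other thread reads.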

Fictitious answered 8/12, 2013 at 19:4 Comment(1)
Yet, I assume if you do my_session.get(url, headers={"something": "something"}) this should not be an issue. - Imaginative
F
34

https://github.com/psf/requests/issues/1871 implies that Session is not thread-safe, and that at least one maintainer recommends one Session per thread.

I just opened https://github.com/psf/requests/issues/2766 to clarify the documentation.
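
One way to follow the one-Session-per-thread recommendation is to keep the session in thread-local storage. This is only a sketch of that idea, not code from the linked issues, and get_session is a hypothetical helper name.

```python
import threading

import requests

_thread_local = threading.local()


def get_session():
    # Lazily create one Session per thread and reuse it on later calls
    # made from that same thread.
    session = getattr(_thread_local, "session", None)
    if session is None:
        session = requests.Session()
        _thread_local.session = session
    return session
```

Each worker calls get_session() instead of receiving a shared object, so no two threads touch the same cookie jar or adapters. The cost is one connection pool per thread.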

Flashbulb answered 10/9, 2015 at 13:50 Comment(2)
It looks like this depends on urllib3 being thread safe, which I don't believe it is, based on github.com/urllib3/urllib3/issues/1252 - Southwards
Looks like that issue (#2766) got closed a couple of weeks ago. My interpretation of the close message and preceding discussion is that all known sources of threading issues have been fixed (notably: the redirect cache is gone, urllib3 has fixed its thread-safety issues, and the cookie jar is thread safe), and at this point there are no remaining reasons to believe that Session is not thread safe. That said, there doesn't appear to have been a careful audit of the code for thread safety either. - Thelma
S
4

I faced the same question and went to the source code to find a solution that suited me. In my opinion, the Session class has several problems.

  1. It initializes the default HTTPAdapter in the constructor and leaks it if you mount another one to 'http' or 'https'.
  2. The HTTPAdapter implementation maintains the connection pool. I don't think that is something to create on every Session instantiation.
  3. Session closes its HTTPAdapter, so you can't reuse the connection pool between different Session instances.
  4. The Session class doesn't seem to be thread safe according to various discussions.
  5. HTTPAdapter internally uses urllib3.PoolManager. I didn't find any obvious thread-safety problem in the source code, so I would rather trust the documentation, which says that urllib3 is thread safe.

Given the above, I didn't find anything better than overriding the Session class:

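from collections import OrderedDict

from requests import Session
from requests.adapters import HTTPAdapter
from requests.cookies import cookiejar_from_dict
from requests.hooks import default_hooks
from requests.models import DEFAULT_REDIRECT_LIMIT
from requests.utils import default_headers


class HttpSession(Session):
    def __init__(self, adapter: HTTPAdapter):
        # Deliberately skips Session.__init__ so that no default
        # HTTPAdapter (and connection pool) is created per session.
        self.headers = default_headers()
        self.auth = None
        self.proxies = {}
        self.hooks = default_hooks()
        self.params = {}
        self.stream = False
        self.verify = True
        self.cert = None
        self.max_redirects = DEFAULT_REDIRECT_LIMIT
        self.trust_env = True
        self.cookies = cookiejar_from_dict({})
        self.adapters = OrderedDict()
        self.mount('https://', adapter)
        self.mount('http://', adapter)

    def close(self) -> None:
        # No-op: the shared adapter is closed by its owner, not by the session.
        pass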

And creating the connection factory like:

from urllib3.util.retry import Retry

# DEFAULT_CONNECTION_POOL_MAX_SIZE and DEFAULT_RETRY_POLICY are
# application-level constants defined elsewhere.


class HttpSessionFactory:
    def __init__(self,
                 pool_max_size: int = DEFAULT_CONNECTION_POOL_MAX_SIZE,
                 retry: Retry = DEFAULT_RETRY_POLICY):
        self.__http_adapter = HTTPAdapter(pool_maxsize=pool_max_size, max_retries=retry)

    def session(self) -> Session:
        return HttpSession(self.__http_adapter)

    def close(self):
        self.__http_adapter.close()

Finally, somewhere in the code I can write:

with self.__session_factory.session() as session:
    response = session.get(request_url)

All my session instances then reuse the same connection pool, and when the application stops I can close the HttpSessionFactory. Hope this helps somebody.

Schwa answered 7/9, 2021 at 21:32 Comment(3)
2. I agree with you, but I guess the requests library has to make some stricter assumptions about use cases beyond ours. 3. Only if you do close the Session, which is unnecessary precisely for reusable HTTP adapters. Do not use Sessions as context managers just because of tradition! - Walleyed
Overall, your snippet does not address Session thread-unsafety or avoid thread-blocking, which would mainly come, AFAIU, from the cookie jar and the redirection cache. It is simply about reusing the connection pool across threads. This is nice, but it is not about thread-safety. Also, you might be interested in this simpler approach to the same idea, a global adapter instead of a session factory: https://mcmap.net/q/275752/-reusing-connections-in-django-with-python-requests - Walleyed
> 4. Session class doesn't seem to be thread safe according to various discussions. Which discussions? Could you link some? - Eton
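
The "global adapter" idea mentioned in the comments could be sketched roughly as follows. This is an illustration of the general approach rather than the code from the linked answer; SHARED_ADAPTER, new_session and the pool size of 10 are assumptions made for the example.

```python
import requests
from requests.adapters import HTTPAdapter

# One adapter, and therefore one connection pool, for the whole process.
SHARED_ADAPTER = HTTPAdapter(pool_maxsize=10)  # illustrative pool size


def new_session():
    # Each caller gets its own Session (own cookies and headers), while
    # connections come from the shared adapter's pool.
    session = requests.Session()
    session.mount("https://", SHARED_ADAPTER)
    session.mount("http://", SHARED_ADAPTER)
    return session
```

With this layout, sessions should not be closed or used in a with statement, because Session.close() closes every mounted adapter, including the shared one.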
