socket ResourceWarning using urllib in Python 3

I am using urllib.request.urlopen() to GET from a web service I'm trying to test.

This returns an HTTPResponse object, which I then read() to get the response body.

But I always see a ResourceWarning about an unclosed socket, raised from socket.py.

Here's the relevant function:

import json
from urllib.request import Request, urlopen

def get_from_webservice(url):
    """ GET from the webservice  """
    # HEADERS is defined elsewhere in the test module
    req = Request(url, method="GET", headers=HEADERS)
    with urlopen(req) as rsp:
        body = rsp.read().decode('utf-8')
        return json.loads(body)

Here's the warning as it appears in the program's output:

$ ./test/test_webservices.py
/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socket.py:359: ResourceWarning: unclosed <socket.socket object, fd=5, family=30, type=1, proto=6>
self._sock = None
.s
----------------------------------------------------------------------
Ran 2 tests in 0.010s

OK (skipped=1)

If there's anything I can do to the HTTPResponse (or the Request?) to make it close its socket cleanly, I would really like to know, because this code is for my unit tests; I don't like ignoring warnings anywhere, but especially not there.

Lipscomb answered 18/2, 2013 at 14:35 Comment(1)
I can't reproduce it on Python 3.3.1. Could you test it on the latest Python version? There were a couple of bugs related to closing the socket (ResourceWarning on timeout) and to the "Connection: close" response header (which shows there are different code paths depending on the header). – Schlimazel

I don't know if this is the answer, but it is part of the way to an answer.

If I add the header "Connection: close" to the response from my web service, the HTTPResponse object seems to clean itself up properly without a warning.

And in fact, the HTTP spec (RFC 2616, http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html) says:

HTTP/1.1 applications that do not support persistent connections MUST include the "close" connection option in every message.

So the problem was on the server end (i.e. my fault!). In the event that you don't have control over the headers coming from the server, I don't know what you can do.
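
For what it's worth, here is a minimal sketch of what that looks like on the server side, assuming a toy http.server-based test server (the handler class name and port are made up for illustration):

from http.server import BaseHTTPRequestHandler, HTTPServer

class ClosingHandler(BaseHTTPRequestHandler):
    """ Toy handler that tells the client not to keep the connection open. """

    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        # The header discussed above: without it, the client's HTTPResponse
        # may leave the underlying socket in a keep-alive state.
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), ClosingHandler).serve_forever()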

Lipscomb answered 23/5, 2013 at 11:4 Comment(0)

I had the same problem with urllib3, and I just added a context manager to close the connection automatically:

import urllib3

def get(addr, headers):
    """ This function closes the connection pool after one HTTP request. """
    with urllib3.PoolManager() as conn:
        res = conn.request('GET', addr, headers=headers)
        if res.status == 200:
            return res.data
        else:
            raise ConnectionError(res.reason)

Note that urllib3 is designed to have a pool of connections and to keep connections alive for you. This can significantly speed up your application if it needs to make a series of requests, e.g. a few calls to a backend API.
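
As a rough illustration of that point (a sketch, not part of the original answer; the URLs and function name are placeholders), reusing a single PoolManager lets consecutive requests share a keep-alive connection instead of opening a new socket each time:

import urllib3

# One shared pool for the whole application; connections to the same host
# are kept alive and reused across requests.
http = urllib3.PoolManager()

def get_json(addr, headers=None):
    res = http.request('GET', addr, headers=headers)
    if res.status == 200:
        return res.data
    raise ConnectionError(res.reason)

# Both calls can reuse the same underlying connection to example.com.
first = get_json('https://example.com/api/one')
second = get_json('https://example.com/api/two')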

Please read the urllib3 documentation on connection pools here: https://urllib3.readthedocs.io/en/1.5/pools.html

P.S. You could also use the requests library, which is not part of the Python standard library (as of 2019) but is very powerful and simple to use: http://docs.python-requests.org/en/master/
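
For completeness, a rough equivalent of the question's function using requests might look like the sketch below (error handling and the HEADERS constant are assumptions carried over from the question):

import requests

def get_from_webservice(url):
    """ GET from the webservice; requests manages its own connection pooling. """
    with requests.Session() as session:
        rsp = session.get(url, headers=HEADERS)  # HEADERS as in the question
        rsp.raise_for_status()
        return rsp.json()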

Werewolf answered 14/2, 2019 at 10:45 Comment(0)
