I'm new to Python and reading someone else's code: should urllib.urlopen() be followed by urllib.close()? Otherwise, one would leak connections, correct?
The close method must be called on the result of urllib.urlopen, not on the urllib module itself as you seem to be thinking (you mention urllib.close -- which doesn't exist).
The best approach: instead of x = urllib.urlopen(u) etc, use:
    import contextlib
    import urllib

    with contextlib.closing(urllib.urlopen(u)) as x:
        pass  # ...use x at will here...
The with statement, and the closing context manager, will ensure proper closure even in the presence of exceptions.
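To make the "even in the presence of exceptions" part concrete, here is a minimal sketch. FakeResponse and read_then_fail are hypothetical names made up for illustration; FakeResponse only stands in for whatever urlopen returns, so the snippet runs without a network. Note that on Python 3 the real function lives at urllib.request.urlopen rather than urllib.urlopen.

```python
import contextlib


class FakeResponse:
    """Hypothetical stand-in for the object returned by urlopen."""

    def __init__(self):
        self.closed = False

    def read(self):
        return b"payload"

    def close(self):
        self.closed = True


def read_then_fail(resp):
    # closing() calls resp.close() when the block exits, however it exits.
    with contextlib.closing(resp) as x:
        x.read()
        raise RuntimeError("simulated failure while using x")


resp = FakeResponse()
try:
    read_then_fail(resp)
except RuntimeError:
    pass  # the error still propagates; close() has already run by now

print(resp.closed)
```

The exception is not swallowed by the context manager; it simply guarantees that close() runs before the exception leaves the with block.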
contextlib.closing in this (ahem) context? – Hep
urllib.urlopen doesn't exist at all. – Miry
Like @Peter says, out-of-scope opened URLs will become eligible for garbage collection.
However, also note that in CPython URLopener defines:
    def __del__(self):
        self.close()
This means that when the reference count for that instance reaches zero, its __del__ method will be called, and thus its close method will be called as well. The most "normal" way for the reference count to reach zero is to simply let the instance go out of scope, but there's nothing strictly stopping you from an explicit del x early (note, however, that del x doesn't call __del__ directly; it just decrements the reference count by one).
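A small sketch of that reference-counting behaviour, assuming CPython (other implementations may run __del__ later or not at a predictable time). Resource is a hypothetical class that mimics the __del__/close pairing; it is not urllib's own code.

```python
class Resource:
    """Hypothetical stand-in that closes itself in __del__."""

    def __init__(self, log):
        self.log = log

    def close(self):
        self.log.append("closed")

    def __del__(self):
        self.close()


log = []
x = Resource(log)
y = x                         # second reference to the same instance

del x                         # drops one reference; __del__ does not run yet
after_first_del = list(log)

del y                         # last reference gone; CPython runs __del__ now
after_second_del = list(log)

print(after_first_del, after_second_del)
```

So del only removes a name; the cleanup happens when the last reference disappears.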
It's certainly good style to explicitly close your resources -- especially when your application runs the risk of using too much of said resources -- but Python will automatically clean up for you if you don't do anything funny like maintaining (circular?) references to instances that you don't need any more.
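Here is a rough illustration of the "circular references" caveat, again assuming CPython 3.4 or later. Node is a hypothetical class, and the cyclic collector is switched off only to make the timing visible; normally it would run at some unpredictable later point.

```python
import gc


class Node:
    """Hypothetical object that refers to itself, forming a reference cycle."""

    def __init__(self, log):
        self.log = log
        self.me = self        # circular reference

    def __del__(self):
        self.log.append("closed")


log = []
gc.collect()                  # start from a clean slate
gc.disable()                  # keep the cyclic collector from running on its own
try:
    n = Node(log)
    del n                     # refcount never reaches zero because of the cycle
    closed_before_collect = list(log)

    gc.collect()              # the cyclic collector finds and finalizes the object
    closed_after_collect = list(log)
finally:
    gc.enable()

print(closed_before_collect, closed_after_collect)
```

With a cycle in place, going out of scope is not enough; cleanup waits for the garbage collector, which is one more reason to close explicitly.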
gc.collect() call, or a close(), cleaned things up]. – Candless
Strictly speaking, this is true. But in practice, once (if) the object returned by urlopen goes out of scope, the connection will be closed by the automatic garbage collector.
gc.disable can disable the GC in most Python implementations. – Birthright
You basically do need to explicitly close your connection when using IronPython. The automatic closing on going out of scope relies on the garbage collection. I ran into a situation where the garbage collection did not run for so long that Windows ran out of sockets. I was polling a webserver at high frequency (i.e. as high as IronPython and the connection would allow, ~7Hz). I could see the "established connections" (i.e. sockets in use) go up and up on PerfMon. The solution was to call gc.collect() after every call to urlopen.
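As an alternative to forcing gc.collect(), a polling loop can close each connection itself. The sketch below is only an illustration: FakeConnection, poll, and the example URL are hypothetical, and the fake class just counts open connections instead of using real sockets.

```python
import contextlib


class FakeConnection:
    """Hypothetical stand-in for a urlopen result; counts open connections."""

    open_count = 0

    def __init__(self, url):
        self.url = url
        self.closed = False
        FakeConnection.open_count += 1

    def read(self):
        return b"status"

    def close(self):
        if not self.closed:
            self.closed = True
            FakeConnection.open_count -= 1


def poll(open_url, url, times):
    """Poll url repeatedly, closing each connection before the next request.

    Returns the highest number of simultaneously open connections seen.
    """
    peak = 0
    for _ in range(times):
        with contextlib.closing(open_url(url)) as conn:
            conn.read()
            peak = max(peak, FakeConnection.open_count)
    return peak


peak_open = poll(FakeConnection, "http://example.invalid/status", 50)
print(peak_open, FakeConnection.open_count)
```

Because every connection is closed before the next one is opened, the count of open connections stays flat no matter when (or whether) the garbage collector runs.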
The urllib.request module uses HTTP/1.1 and includes a Connection:close header in its HTTP requests.
This is from the official docs for urllib.request.
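One way to check the header yourself on Python 3 is to point urllib.request at a throwaway local server and record what arrives. This is a sketch under assumptions: RecordingHandler, fetch_once, and seen_headers are made-up names, the server binds to 127.0.0.1 on a free port, and proxies are disabled so the request stays local.

```python
import http.server
import threading
import urllib.request

seen_headers = []  # Connection header values observed by the local server


class RecordingHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the Connection header sent by the client.
        seen_headers.append(self.headers.get("Connection"))
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    server = http.server.HTTPServer(("127.0.0.1", 0), RecordingHandler)
    server.timeout = 5  # do not wait forever if no request arrives
    thread = threading.Thread(target=server.handle_request)
    thread.start()
    try:
        # An empty ProxyHandler bypasses any proxy set in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        with opener.open(url, timeout=5) as response:
            return response.read()
    finally:
        thread.join()
        server.server_close()


body = fetch_once()
print(body, seen_headers)
```

The response object is used as a context manager here, so it is closed as soon as the body has been read.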
data = urllib2.urlopen('url').read() – Unpeg