I've got the following code to run a continuous loop to fetch some content from a website:
from http.cookiejar import CookieJar
from urllib import request

cj = CookieJar()
cp = request.HTTPCookieProcessor(cj)
hh = request.HTTPHandler()
opener = request.build_opener(cp, hh)

while True:
    # build url
    req = request.Request(url=url)
    p = opener.open(req)
    c = p.read()
    # process c
    p.close()
    # check for abort condition, or continue
The contents are read correctly, but for some reason the TCP connections won't close. I'm watching the active connection count in a dd-wrt router's interface, and it goes up steadily. If the script continues to run, it exhausts the router's 4096-connection limit. When this happens, the script simply enters a waiting state (the router won't allow new connections, but the timeout hasn't hit yet). After a couple of minutes, those connections are closed and the script can resume.
I was able to observe the state of those hanging connections on the router. They all share the same state: TIME_WAIT.
I'm expecting this script to use no more than one TCP connection at a time. What am I doing wrong?
I'm using Python 3.4.2 on Mac OS X 10.10.
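For reference, a minimal sketch that reproduces the connection count without the router. It assumes a throwaway local HTTP server and compares the urllib loop above with a single reused http.client.HTTPConnection. CountingHandler, ThreadedServer, fetch_with_urllib, fetch_with_http_client and run_demo are hypothetical names made up for this illustration, not part of the real script. The comparison relies on urllib.request adding a "Connection: close" header to each request, so every opener.open() call uses a fresh TCP connection.

```python
import http.client
import socketserver
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that counts accepted TCP connections."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections alive
    connections = 0

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


class ThreadedServer(socketserver.ThreadingMixIn, HTTPServer):
    daemon_threads = True


def fetch_with_urllib(url, n):
    # Empty ProxyHandler: ignore any proxy configured in the environment.
    opener = request.build_opener(request.ProxyHandler({}))
    for _ in range(n):
        with opener.open(url) as resp:
            resp.read()


def fetch_with_http_client(host, port, n):
    # One connection object, reused for every request (HTTP keep-alive).
    conn = http.client.HTTPConnection(host, port)
    try:
        for _ in range(n):
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before reusing conn
    finally:
        conn.close()


def run_demo(n=5):
    """Return (connections used by urllib, connections used by http.client)."""
    server = ThreadedServer(("127.0.0.1", 0), CountingHandler)
    host, port = server.server_address
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        CountingHandler.connections = 0
        fetch_with_urllib("http://%s:%d/" % (host, port), n)
        urllib_count = CountingHandler.connections

        CountingHandler.connections = 0
        fetch_with_http_client(host, port, n)
        client_count = CountingHandler.connections
    finally:
        server.shutdown()
        server.server_close()
    return urllib_count, client_count


if __name__ == "__main__":
    print(run_demo())
```

With n=5 this should report five connections for the urllib loop and one for the reused HTTPConnection. Whether a real website keeps the connection open between requests depends on that server, so the single-connection result here is only what the local test server does.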