Given a standard urllib response object, retrieved like so:

req = urllib.urlopen('http://example.com')

once I read its contents via req.read(), the object is exhausted and cannot be read again. Unlike normal file-like objects, however, the response object does not have a seek method, for what I am sure are excellent reasons.
However, in my case I have a function, and I want it to make certain determinations about a request and then return that request "unharmed" so that it can be read again.
I understand that one option is to re-request it, but I'd like to avoid making multiple HTTP requests for the same URL and content.
The only other alternative I can think of is to have the function return a tuple of the extracted content and the request object, with the understanding that anything that calls this function will have to get the content in this way.
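That buffering approach could be sketched like so, assuming the body is read once into an io.BytesIO, which is seekable and file-like. The function name inspect_response and the FakeResponse stand-in below are illustrative only (note the buffer preserves the body but not headers or status, which stay on the original object):

```python
import io

def inspect_response(resp):
    """Consume resp.read() once, then return (body, seekable buffer)."""
    body = resp.read()           # this exhausts the original response
    # ... make whatever determinations about `body` here ...
    return body, io.BytesIO(body)

# Stand-in for a urlopen response, so the sketch is runnable offline:
class FakeResponse:
    def __init__(self, data):
        self._data = data
    def read(self):
        return self._data

body, buf = inspect_response(FakeResponse(b"hello"))
print(buf.read())   # the buffer can be read...
buf.seek(0)         # ...and, unlike the original response, rewound
print(buf.read())
```

Callers that expect a plain file-like object can be handed just the buffer, since it supports read() and seek().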
Is that my only option?
- Also note that the urllib.urlopen() function has been removed in Python 3 in favor of urllib.request.urlopen(). – Integument
- urllib2.urlopen is the same. – Salience