Under the hood, requests uses urllib3 to do most of the HTTP heavy lifting. Used properly, the two should behave mostly the same unless you need more advanced configuration.
Except, in your particular example they're not the same:
In the urllib3 example, you're re-using connections whereas in the requests example you're not re-using connections. Here's how you can tell:
>>> import requests
>>> requests.packages.urllib3.add_stderr_logger()
2016-04-29 11:43:42,086 DEBUG Added a stderr logging handler to logger: requests.packages.urllib3
>>> requests.get('https://www.google.com/')
2016-04-29 11:45:59,043 INFO Starting new HTTPS connection (1): www.google.com
2016-04-29 11:45:59,158 DEBUG "GET / HTTP/1.1" 200 None
>>> requests.get('https://www.google.com/')
2016-04-29 11:45:59,815 INFO Starting new HTTPS connection (1): www.google.com
2016-04-29 11:45:59,925 DEBUG "GET / HTTP/1.1" 200 None
To start re-using connections like in a urllib3 PoolManager, you need to make a requests session.
>>> session = requests.session()
>>> session.get('https://www.google.com/')
2016-04-29 11:46:49,649 INFO Starting new HTTPS connection (1): www.google.com
2016-04-29 11:46:49,771 DEBUG "GET / HTTP/1.1" 200 None
>>> session.get('https://www.google.com/')
2016-04-29 11:46:50,548 DEBUG "GET / HTTP/1.1" 200 None
Now it's equivalent to what you were doing with http = PoolManager(). One more note: urllib3 is a lower-level, more explicit library, so you explicitly create a pool and you'll explicitly need to specify your SSL certificate location, for example. It's an extra line or two of work, but also a fair bit more control if that's what you're looking for.
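If you'd rather verify the connection reuse without reading log output, here is a minimal sketch that counts TCP connections on a throwaway local server. Everything in it (CountingHandler, count_connections, the fetch_* helpers) is a hypothetical name made up for this illustration, not part of either library. It assumes requests and urllib3 are installed, and it opens a short-lived Session per call to stand in for requests.get(), which does the same thing internally.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests
import urllib3


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: counts TCP connections, not requests."""

    protocol_version = "HTTP/1.1"  # needed so the server allows keep-alive
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted socket, so this counts connections.
        with CountingHandler.lock:
            CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default per-request logging


def count_connections(fetch, n=3):
    """Call fetch(url) n times against a fresh local server and
    return how many TCP connections the server accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    try:
        for _ in range(n):
            fetch(url)
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


def fetch_without_session(url):
    # A throwaway session per request, like a bare requests.get() call.
    with requests.Session() as s:
        s.trust_env = False  # ignore proxy env vars; talk to localhost directly
        return s.get(url).content


shared_session = requests.Session()
shared_session.trust_env = False


def fetch_with_session(url):
    # One long-lived session, so its connection pool can be reused.
    return shared_session.get(url).content


pool = urllib3.PoolManager()


def fetch_with_poolmanager(url):
    # One long-lived PoolManager, as in the urllib3 example.
    return pool.request("GET", url).data


if __name__ == "__main__":
    print("no session: ", count_connections(fetch_without_session))
    print("session:    ", count_connections(fetch_with_session))
    print("PoolManager:", count_connections(fetch_with_poolmanager))
```

The count for the shared session and for the PoolManager should come out the same, while the per-call variant opens a new connection for every request.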
All said and done, the comparison becomes:
1) Using urllib3:
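import urllib3, certifi
from bs4 import BeautifulSoup
# url is the page address from the question
http = urllib3.PoolManager(ca_certs=certifi.where())
# .data holds the already-downloaded response body as bytes
html = http.request('GET', url).data
soup = BeautifulSoup(html, "html5lib")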
2) Using requests:
import requests
from bs4 import BeautifulSoup
# url is the page address from the question
session = requests.session()
html = session.get(url).content
soup = BeautifulSoup(html, "html5lib")
requests module uses (and bundles / vendorizes urllib3) under the hood - but it provides a slightly higher-level and simpler API on top of it. – Quest
requests? – Weathertight
requests. It just makes HTTP very pleasant to deal with, and if there's something you can't do with requests that you can with plain urllib3, I haven't come across it yet. But that's just my opinion. – Quest