This is based on another question on this site: "What's the best way to download a file using urllib3?" However, I cannot comment there, so I am asking a new question:
How to download a (larger) file with urllib3?
I tried to use the same code that works with urllib2 (from "Download file from web in Python 3"), but it fails with urllib3:
    http = urllib3.PoolManager()
    with http.request('GET', url) as r, open(path, 'wb') as out_file:
        #shutil.copyfileobj(r.data, out_file) # this writes a zero file
        shutil.copyfileobj(r.data, out_file)
This fails with: 'bytes' object has no attribute 'read'.
I then tried to use the code from that question, but it gets stuck in an infinite loop, because data is always '0':
    http = urllib3.PoolManager()
    r = http.request('GET', url)
    with open(path, 'wb') as out:
        while True:
            data = r.read(4096)
            if data is None:
                break
            out.write(data)
    r.release_conn()
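For reference, the loop above can fail for two reasons: with the default `preload_content=True` the body has already been consumed before `read()` is called, and `read()` signals end-of-stream with an empty bytes object (`b''`), not `None`. A minimal corrected sketch, wrapped in a hypothetical `download` helper (the function name and chunk size are my own choices, not from the original):

    import urllib3

    def download(url, path, chunk_size=4096):
        """Stream a URL to a file in fixed-size chunks."""
        http = urllib3.PoolManager()
        # preload_content=False leaves the body on the socket so read() streams it
        r = http.request('GET', url, preload_content=False)
        try:
            with open(path, 'wb') as out:
                while True:
                    data = r.read(chunk_size)
                    if not data:  # end of stream is b'', not None
                        break
                    out.write(data)
        finally:
            r.release_conn()
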
However, if I read everything in memory, the file gets downloaded correctly:
    http = urllib3.PoolManager()
    r = http.request('GET', url)
    with open(path, 'wb') as out:
        out.write(r.data)
I do not want to do this, as I might potentially download very large files. It is unfortunate that the urllib3 documentation does not cover best practice on this topic.
(Also, please do not suggest requests or urllib2, because they are not flexible enough when it comes to self-signed certificates.)
preload_content is not that well documented. – Stenson
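Building on that hint: with `preload_content=False` the urllib3 response object itself is file-like, so the original `shutil.copyfileobj` attempt works if you pass `r` rather than `r.data` (which is already-downloaded bytes). A sketch under that assumption, using a hypothetical `download_file` helper:

    import shutil
    import urllib3

    def download_file(url, path):
        """Copy a streamed response straight to disk."""
        http = urllib3.PoolManager()
        # With preload_content=False the response is file-like, so
        # copyfileobj can read from it; pass r, not r.data.
        with http.request('GET', url, preload_content=False) as r, \
                open(path, 'wb') as out_file:
            shutil.copyfileobj(r, out_file)

The `with` block also releases the connection back to the pool when the response is closed, so no explicit `release_conn()` call is needed here.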