What command to use instead of urllib.request.urlretrieve?

Asked 22/2, 2013 at 23:42 Answered 21/8, 2021 at 4:42

Solved python python-3.x python-requests urllib

I'm currently writing a script that downloads a file from a URL

import urllib.request
urllib.request.urlretrieve(my_url, 'my_filename')

The docs urllib.request.urlretrieve state:

The following functions and classes are ported from the Python 2 module urllib (as opposed to urllib2). They might become deprecated at some point in the future.

Therefore I would like to avoid it so I don't have to rewrite this code in the near future.

I'm unable to find another interface like download(url, filename) in standard libraries. If urlretrieve is considered a legacy interface in Python 3, what is the replacement?

Barnabe answered 22/2, 2013 at 23:42 Comment(1)

11 years later and it is still there. – Deduce 23/3 at 18:2

Deprecated is one thing, might become deprecated at some point in the future is another. If it suits your needs, I'd continuing using urlretrieve.

That said, you can use shutil.copyfileobj:

from urllib.request import urlopen
from shutil import copyfileobj

with urlopen(my_url) as in_stream, open('my_filename', 'wb') as out_file:
    copyfileobj(in_stream, out_file)

Gildagildas answered 23/2, 2013 at 0:22 Comment(5)

Thanks a lot. I had no idea the shutil library even existed :) – Barnabe 24/2, 2013 at 10:4

Wow, this worked beautifully. I was looking for the same thing, but for small pieces of audio, and couldn't figure out how to save it to a file. Didn't find urlretrieve for a while, and I didn't realize that urlopen would give me the bytes -- I thought I should look for a method that can save the file, not go into the file and read/save the bits. I guess this is standard practice with programming so it should make sense, but I've never worked with I/O before. Thank you, it's way simpler than I feared it might be. – Vivacious 31/3, 2016 at 9:46

Thanks for copyfileobj! The problem with urlretrieve is that it has no way to determine server failure (502/404/whatever). – Kathlenekathlin 18/1, 2017 at 15:39

I'm using the reporthook in urlretrieve(..) to update a progressbar, but that argument doesn't exist in urlopen(..). How can I regularly update my progressbar if I switch to urlopen(..)? – Maladjustment 6/7, 2019 at 11:57

@y0prst: "no way" is wrong. If the server fails, you get an exception: the exact same exception (HTTPError) you would get if you'd used urlopen() directly -- no surprise here: urlretrieve() is a just convenience wrapper around urlopen(): if it works for you, there is no need to reimplement it yourself. – Lipocaic 9/11, 2020 at 16:26

requests is really nice for this. There are a few dependencies though to install it. Here is an example.

import requests
r = requests.get('imgurl')
with open('pic.jpg','wb') as f:
  f.write(r.content)

Min answered 22/2, 2013 at 23:51 Comment(3)

Doesn't this store the response in memory though? Is there a requests alternative to urlretrieve that streams to disk? – Ellswerth 1/12, 2019 at 23:55

@KyleBarron: Yes, it seems so. I suppose to properly emulate urlretrireve using requests library, one would need to use streaming as shown here. – Infallibilism 12/1, 2020 at 22:54

It is not nice for this; it reads files into memory. Not good if the files are large. – Illumine 20/6 at 19:50

Another solution without the use of shutil and no other external libraries like requests.

import urllib.request

image_url = "https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon.png"
response = urllib.request.urlopen(image_url)
image = response.read()

with open("image.png", "wb") as file:
    file.write(image)

Shults answered 28/1, 2018 at 4:41 Comment(3)

Hi and welcome to SO. Code-only answers generally considered as less useful. Please edit your answer to add some context and explain your solution. – Xenogenesis 28/1, 2018 at 4:51

Looks perfect to me: any negative comment? – Gluey 22/2 at 9:41

Negative comment: same as the one on the non-streaming requests.get - it assumes that the entire response content can fit in memory. – Hoyle 8/5 at 19:4

Not sure if this is what you're looking for, or if there's a "better" way, but this is what I added to the top of my script after the libraries, to make my script compatible with Python 2/3.

# Python version compatibility
if version.major == 3:
    from urllib.error import HTTPError
    from urllib.request import urlopen, urlretrieve

elif version.major == 2:
    from urllib2 import HTTPError, urlopen

    def urlretrieve(url, data):
        url_data = urlopen(url)
        with open(data, "wb") as local_file:
            local_file.write(url_data.read())
else:
    raise ValueError('No valid Python interpreter found.')

It at least seems like a handy trick, and I hope this might help someone.

Best!

Honorary answered 21/8, 2021 at 4:42 Comment(0)

Recommended topics

Hot tags