What command to use instead of urllib.request.urlretrieve?
B

4

41

I'm currently writing a script that downloads a file from a URL:

import urllib.request
urllib.request.urlretrieve(my_url, 'my_filename')

The docs for urllib.request.urlretrieve state:

The following functions and classes are ported from the Python 2 module urllib (as opposed to urllib2). They might become deprecated at some point in the future.

Therefore I would like to avoid it so I don't have to rewrite this code in the near future.

I'm unable to find another interface like download(url, filename) in standard libraries. If urlretrieve is considered a legacy interface in Python 3, what is the replacement?

Barnabe answered 22/2, 2013 at 23:42 Comment(1)
11 years later and it is still there.Deduce
G
35

"Deprecated" is one thing; "might become deprecated at some point in the future" is another. If it suits your needs, I'd continue using urlretrieve.

That said, you can use shutil.copyfileobj:

from urllib.request import urlopen
from shutil import copyfileobj

with urlopen(my_url) as in_stream, open('my_filename', 'wb') as out_file:
    copyfileobj(in_stream, out_file)
Gildagildas answered 23/2, 2013 at 0:22 Comment(5)
Thanks a lot. I had no idea the shutil library even existed :)Barnabe
Wow, this worked beautifully. I was looking for the same thing, but for small pieces of audio, and couldn't figure out how to save it to a file. Didn't find urlretrieve for a while, and I didn't realize that urlopen would give me the bytes -- I thought I should look for a method that can save the file, not go into the file and read/save the bits. I guess this is standard practice with programming so it should make sense, but I've never worked with I/O before. Thank you, it's way simpler than I feared it might be.Vivacious
Thanks for copyfileobj! The problem with urlretrieve is that it has no way to determine server failure (502/404/whatever).Kathlenekathlin
I'm using the reporthook in urlretrieve(..) to update a progressbar, but that argument doesn't exist in urlopen(..). How can I regularly update my progressbar if I switch to urlopen(..)?Maladjustment
@y0prst: "no way" is wrong. If the server fails, you get an exception: the exact same exception (HTTPError) you would get if you'd used urlopen() directly. No surprise here: urlretrieve() is just a convenience wrapper around urlopen(). If it works for you, there is no need to reimplement it yourself.Lipocaic
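
On the progress bar question in the comments: one possible approach is to read the response in chunks and call a hook after each one. The sketch below is only an illustration, not what urlretrieve does internally; the function name download_with_progress, the chunk_size parameter, and the default of 8192 bytes are assumptions made for the example. The hook takes the same three arguments as urlretrieve's reporthook (block count, block size, total size), with -1 as the total when the server sends no Content-Length header.

```python
from urllib.request import urlopen


def download_with_progress(url, filename, reporthook=None, chunk_size=8192):
    """Stream url to filename, calling reporthook after each chunk.

    reporthook(block_count, block_size, total_size) mirrors the argument
    shape of urlretrieve's hook; total_size is -1 if the size is unknown.
    """
    with urlopen(url) as response, open(filename, "wb") as out_file:
        # Content-Length may be missing (e.g. chunked transfer encoding).
        length = response.headers.get("Content-Length")
        total_size = int(length) if length is not None else -1
        block_count = 0
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break  # end of the response body
            out_file.write(chunk)
            block_count += 1
            if reporthook is not None:
                reporthook(block_count, chunk_size, total_size)
```

A caller could pass something like lambda n, bs, total: print(n * bs, "of", total) as the hook. As with urlopen() itself, an HTTP error status raises urllib.error.HTTPError before any data is written.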
M
26

requests is really nice for this, although it is a third-party package that has to be installed along with a few dependencies. Here is an example.

import requests
r = requests.get('imgurl')
with open('pic.jpg','wb') as f:
  f.write(r.content)
Min answered 22/2, 2013 at 23:51 Comment(3)
Doesn't this store the response in memory though? Is there a requests alternative to urlretrieve that streams to disk?Ellswerth
@KyleBarron: Yes, it seems so. I suppose that to properly emulate urlretrieve using the requests library, one would need to use streaming as shown here.Infallibilism
It is not nice for this; it reads files into memory. Not good if the files are large.Illumine
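
For the memory concern raised in these comments, requests can stream the body instead of loading it all at once. This is a minimal sketch, assuming a reasonably recent requests release in which the response can be used as a context manager; the function name download_streaming, the chunk size, and the 30-second timeout are illustrative choices, not anything from the answer above.

```python
import requests  # third-party: pip install requests


def download_streaming(url, filename, chunk_size=8192):
    """Write the body of url to filename without holding it all in memory."""
    # stream=True defers fetching the body until it is iterated over.
    with requests.get(url, stream=True, timeout=30) as response:
        # Raise on 4xx/5xx rather than saving an error page to disk.
        response.raise_for_status()
        with open(filename, "wb") as out_file:
            for chunk in response.iter_content(chunk_size=chunk_size):
                out_file.write(chunk)
```

Only one chunk is held in memory at a time, so the size of the file no longer matters much.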
S
4

Another solution that uses neither shutil nor an external library such as requests:

import urllib.request

image_url = "https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon.png"
response = urllib.request.urlopen(image_url)
image = response.read()

with open("image.png", "wb") as file:
    file.write(image)
Shults answered 28/1, 2018 at 4:41 Comment(3)
Hi and welcome to SO. Code-only answers are generally considered less useful. Please edit your answer to add some context and explain your solution.Xenogenesis
Looks perfect to me: any negative comment?Gluey
Negative comment: same as the one on the non-streaming requests.get - it assumes that the entire response content can fit in memory.Hoyle
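
If the whole-response-in-memory limitation matters, the same shutil-free idea can be written to read a fixed amount at a time. This is a rough sketch using only the standard library; the name download_in_chunks and the 64 KiB default are assumptions for illustration.

```python
import urllib.request
from functools import partial


def download_in_chunks(url, filename, chunk_size=64 * 1024):
    """Copy url to filename one chunk at a time; return the bytes written."""
    written = 0
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out_file:
        # iter(callable, sentinel) keeps calling read() until it returns b"".
        for chunk in iter(partial(response.read, chunk_size), b""):
            out_file.write(chunk)
            written += len(chunk)
    return written
```

This is roughly what shutil.copyfileobj does internally, so it mainly helps when something extra needs to happen per chunk.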
H
0

Not sure if this is what you're looking for, or if there's a "better" way, but this is what I added to the top of my script after the libraries, to make my script compatible with Python 2/3.

# Python version compatibility
import sys

if sys.version_info.major == 3:
    from urllib.error import HTTPError
    from urllib.request import urlopen, urlretrieve

elif sys.version_info.major == 2:
    from urllib2 import HTTPError, urlopen

    def urlretrieve(url, filename):
        # Minimal stand-in: reads the whole response into memory before writing.
        url_data = urlopen(url)
        with open(filename, "wb") as local_file:
            local_file.write(url_data.read())
else:
    raise ValueError('No valid Python interpreter found.')

It at least seems like a handy trick, and I hope this might help someone.

Best!

Honorary answered 21/8, 2021 at 4:42 Comment(0)
