How do I download a file using urllib.request in Python 3?

Asked 6/4, 2013 at 1:25 Answered 21/8, 2017 at 21:39

I'm experimenting with urllib.request in Python 3 and am wondering how to save a file fetched from the internet to a file on the local machine. I tried this:

g = urllib.request.urlopen('http://media-mcw.cursecdn.com/3/3f/Beta.png')
with open('test.png', 'b+w') as f:
    f.write(g)

But I got this error:

TypeError: 'HTTPResponse' does not support the buffer interface

What am I doing wrong?

NOTE: I have seen this question, but it's related to Python 2's urllib2 which was overhauled in Python 3.

Equilibrant answered 6/4, 2013 at 1:25 Comment(1)

possible duplicate of Download file from web in Python 3 – Democratic 27/7, 2015 at 22:41

Change

f.write(g)

to

f.write(g.read())
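For completeness, here is a minimal sketch of the whole fix in one place. The function name download and its return value are my own choices for illustration, not something from the question. The point is that urlopen() returns a response object, and only read() gives you the bytes that write() expects.

```python
import urllib.request


def download(url, dest):
    """Fetch url and write the response body to dest.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    # urlopen() returns a response object, not bytes, so read() is needed
    # to get the body before it can be written to a binary file.
    with urllib.request.urlopen(url) as response:
        data = response.read()
    # 'wb' opens the destination for binary writing.
    with open(dest, 'wb') as f:
        return f.write(data)
```

With the URL from the question, the call would be download('http://media-mcw.cursecdn.com/3/3f/Beta.png', 'test.png').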

Wiatt answered 6/4, 2013 at 1:32 Comment(0)

An easier way, I think (and you can do it in two lines), is to use:

import urllib.request
urllib.request.urlretrieve('http://media-mcw.cursecdn.com/3/3f/Beta.png', 'test.png')

As for the method you used: when you call g = urllib.request.urlopen('http://media-mcw.cursecdn.com/3/3f/Beta.png') you only open the URL and get a response object back. You must call g.read(), g.readlines() or g.readline() to read its contents.

It's just like reading a normal file (except for the syntax) and can be treated in a very similar way.
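Because the response behaves like a file object, it can also be copied to disk in chunks rather than read into memory all at once. The following is only a sketch under that assumption; download_streaming is a hypothetical name, and shutil.copyfileobj is a standard library helper that was not part of the original answer.

```python
import shutil
import urllib.request


def download_streaming(url, dest):
    """Copy the response body to dest in chunks (hypothetical helper)."""
    # The response exposes read(), so copyfileobj can treat it as a source
    # file and copy it to the destination without loading it all at once.
    with urllib.request.urlopen(url) as response, open(dest, 'wb') as f:
        shutil.copyfileobj(response, f)
```

This may be preferable for large files, since only one chunk is held in memory at a time.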

Bicapsular answered 21/8, 2017 at 21:39 Comment(8)

The PEP20 would have you use Request from urllib.request but yours would have a line less of code. Information about PEP20 for Request. You can use open() chained to file.write(url.read()) like you mentioned. – Whitney 14/2, 2018 at 2:36

@Whitney Are you sure? The link says "Open the URL url, which can be either a string or a Request object." Here I specified a string, so I don't think Request is required in this case. – Bicapsular 14/2, 2018 at 8:46

That worked on debian9 using python3.5. I don't use 2.7 too much. – Whitney 13/3, 2018 at 4:27

This doesn't work if you have to get round the 403: Forbidden issue using https://mcmap.net/q/167716/-urllib2-httperror-http-error-403-forbidden – Chirurgeon 29/4, 2020 at 10:56

@Sevenearths That's true. However, that's a different issue. Out of all the files I have used Python to download/read, only a handful have ever given me a 403 error. I don't think this is a big enough reason to avoid urlretrieve(). Obviously, if that issue is encountered, then what you have linked is the way forward. – Bicapsular 29/4, 2020 at 11:54

Interesting how experiences differ. While writing my app, the first URL I tried was https://medium.com/@tomaspueyo/coronavirus-the-hammer-and-the-dance-be9337092b56 and it gave me the 403: Forbidden. I wonder if it's just a Medium-related issue. – Chirurgeon 29/4, 2020 at 12:46

@Sevenearths 403 is a Forbidden error. This usually happens when a website (server) attempts to block a bot, or when you try to access a webpage with incorrect login/cert information (usually cookie related in my experience, like passing outdated information, or similar). Seeing as the solution you linked uses a user agent, it strongly looks like that site attempts to block bots (which makes sense since it's a news site). A user agent tricks the server into thinking it's a legitimate browser. – Bicapsular 29/4, 2020 at 13:36

@Sevenearths Personally I usually use dedicated APIs (and this sort of thing never comes up, as they expect bots), which is probably why I don't encounter the problem much. – Bicapsular 29/4, 2020 at 13:36
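For readers who hit the 403 case discussed in these comments, here is a rough sketch of the user-agent approach. It assumes the server rejects the request only because of the default urllib User-Agent header, which may not be true for every site. The names build_request, download_with_user_agent and DEFAULT_USER_AGENT are hypothetical, and the header value is just an illustrative placeholder.

```python
import urllib.request

# Hypothetical placeholder value; a fuller browser-like string may be needed.
DEFAULT_USER_AGENT = 'Mozilla/5.0'


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Wrap url in a Request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def download_with_user_agent(url, dest, user_agent=DEFAULT_USER_AGENT):
    """Fetch url with the custom header and write the body to dest."""
    # urlopen() accepts either a URL string or a Request object.
    with urllib.request.urlopen(build_request(url, user_agent)) as response:
        data = response.read()
    with open(dest, 'wb') as f:
        f.write(data)
```

Note that urlretrieve() does not take a headers argument, which is why this sketch goes through Request and urlopen() instead.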
