Asked 14/5, 2015 at 4:28 Answered 7/11, 2022 at 1:46

127

I have a problem saving an image from a URL in Python, using either a urllib2 request or urllib.urlretrieve. The image URL is valid, and I can download the image manually through a browser. However, when I download it with Python, the resulting file cannot be opened. I use Preview on Mac OS to view the image. Thank you!

UPDATE:

The code is as follows:

def downloadImage(self):
    request = urllib2.Request(self.url)
    pic = urllib2.urlopen(request)
    print "downloading: " + self.url
    print self.fileName
    filePath = localSaveRoot + self.catalog  + self.fileName + Picture.postfix
    # urllib.urlretrieve(self.url, filePath)
    with open(filePath, 'wb') as localFile:
        localFile.write(pic.read())

The image URL that I want to download is http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg

This URL is valid and I can save the image through the browser, but the Python code downloads a file that cannot be opened. Preview says "It may be damaged or use a file format that Preview doesn't recognize." I compared the image downloaded by Python with the one downloaded manually through the browser: the former is several bytes smaller. So the file seems to be incomplete, but I don't know why Python cannot download it completely.

Woollyheaded answered 14/5, 2015 at 4:28 Comment(4)

Why can't it be opened? What error do you get? What does file <filename> tell you? Did the file download correctly or were you blocked by User-Agent or Cookie restrictions or similar? – Johannajohannah 14/5, 2015 at 4:31

Include the python code you are trying in the question please – Hartford 14/5, 2015 at 4:32

Sorry for the confusion. I have provided more details. Thanks a lot. I wonder whether the HTTP request sent by Python differs from the one sent by a browser, so that Python does not receive the complete image file. – Woollyheaded 14/5, 2015 at 6:50

It seems that requests is a much better module than urllib and urllib2 – Woollyheaded 14/5, 2015 at 8:15
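
Regarding the diagnostic questions in the comments above (what kind of file was actually saved, and whether the server returned something other than the image), one option is to look at the first bytes of the saved file. The following is a minimal sketch, not a complete format detector; the helper name sniff_payload and the short signature table are illustrative assumptions.

```python
from typing import Optional

# Leading bytes ("magic numbers") of a few common image formats.
_SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
]


def sniff_payload(data: bytes) -> Optional[str]:
    """Guess what a downloaded payload is from its first bytes (illustrative helper)."""
    for magic, name in _SIGNATURES:
        if data.startswith(magic):
            return name
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    # An HTML body usually means an error or block page was saved instead of the image.
    head = data[:256].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return None
```

Reading the first few hundred bytes of the saved file in 'rb' mode and passing them to sniff_payload is enough. A result of "html" suggests the server sent an error or block page, and a result that differs from the file extension suggests the file was saved under the wrong extension.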

Sample code that works for me on Windows:

import requests

with open('pic1.jpg', 'wb') as handle:
    response = requests.get(pic_url, stream=True)

    if not response.ok:
        print(response)

    for block in response.iter_content(1024):
        if not block:
            break

        handle.write(block)
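
A possible variant of the same streaming idea is sketched below. It checks the HTTP status before anything is written, so a failed request does not leave an error page saved under an image name. The function names write_chunks and download_image, the 30-second timeout, and the 8192-byte chunk size are assumptions made for illustration; the chunk size is a tuning knob and not a required value.

```python
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], dest: str) -> int:
    """Write an iterable of byte chunks to dest; return the number of bytes written."""
    written = 0
    with open(dest, "wb") as handle:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                handle.write(chunk)
                written += len(chunk)
    return written


def download_image(url: str, dest: str, chunk_size: int = 8192) -> int:
    """Stream url to dest, raising on HTTP errors (hypothetical helper)."""
    import requests  # third-party; imported here so write_chunks works without it

    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()  # raise instead of saving an error page
        return write_chunks(response.iter_content(chunk_size=chunk_size), dest)
```

Calling download_image(pic_url, 'pic1.jpg') would then either write the file or raise requests.HTTPError.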

Poolroom answered 14/5, 2015 at 4:34 Comment(6)

That's perfect! Thank you so much! I don't know why requests module could complete that while urllib and urllib2 cannot do that, but anyways. – Woollyheaded 14/5, 2015 at 7:24

It does not work for the following URL; any idea how to fix it? genome.jp/pathway/ko02024+K07173 – Keys 17/10, 2021 at 20:3

@Keys That's not an image – Ordinarily 4/12, 2021 at 22:49

This saves the image to a folder, but when I open the image it says that windows does not support the file format, despite it being a simple jpg. Anyone who knows why? – Acalia 13/4, 2022 at 9:38

@Acalia this can occur when you try to save a non-jpg as a jpg – Psilomelane 27/10, 2022 at 16:26

How did you decide on what content size to use? i.e., what's special about 1024, or is an arbitrary value? – Nightcap 19/5, 2023 at 7:2

221

import requests

img_data = requests.get(image_url).content
with open('image_name.jpg', 'wb') as handler:
    handler.write(img_data)

Mohawk answered 14/6, 2016 at 20:30 Comment(8)

@vlad what if we are not aware of the image extension from the URL but we know it is an image? – Tanishatanitansy 2/4, 2018 at 3:2

@MonaJalal you don't have to specify an extension, as long as you have valid qualified URL address. – Mohawk 2/4, 2018 at 11:14

pip install requests if you don't have – Quiver 15/1, 2021 at 10:53

Using '.content' after requests.get() is the key to save an image – Sodom 24/6, 2021 at 23:24

It does not work for the following URL; any idea how to fix it? genome.jp/pathway/ko02024+K07173 – Keys 17/10, 2021 at 20:3

@VladBezden - This saves the image to a folder, but when I open the image it says that windows does not support the file format, despite it being a simple jpg. Do you know why? – Acalia 13/4, 2022 at 9:37

When downloading a webp, the files are corrupted somehow. Using ffprobe, I am told missing RIFF tag and Could not find codec parameters for stream 0 (Video: webp, none): – Colpin 6/8, 2022 at 4:4

Note that this downloads the whole image to memory first and then writes it to a file. If you want to stream the data directly to a file use e.g. this answer – Sugary 4/12, 2022 at 12:17

This is a simple way to download and save an image from the internet using the urllib.request package.

Pass the image URL (where the image is downloaded from) and the local path (where the image is saved, including a file name ending in .jpg or .png). Here I used "local-filename.jpg"; replace it with your own name.

Python 3

import urllib.request
imgURL = "http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg"

urllib.request.urlretrieve(imgURL, "D:/abc/image/local-filename.jpg")

You can download multiple images as well if you have all the image URLs: iterate over the URLs in a for loop and call urlretrieve for each one.
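
A loop over several URLs could look roughly like the sketch below. The function name download_all and the rule for choosing file names (last path segment of the URL, with a numbered fallback) are assumptions for illustration and not part of urllib.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_all(urls, target_dir):
    """Download every URL into target_dir; return the list of saved paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        # Use the last path segment as the file name, with a numbered fallback.
        name = os.path.basename(urlparse(url).path) or f"image_{index}"
        dest = os.path.join(target_dir, name)
        urllib.request.urlretrieve(url, dest)
        saved.append(dest)
    return saved
```

Note that two URLs ending in the same file name would overwrite each other in this sketch.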

Anyways answered 1/1, 2020 at 4:30 Comment(2)

I tried this but I get an error: HTTPError: Forbidden. Do you know why this is? I'm using this URL: assets.ellosgroup.com/i/ellos/ell_1682670-01_Fs. – Acalia 13/4, 2022 at 9:46

@Parseval, adding this code fixed it for me (a User-Agent was needed): `import urllib.request as urlopen; opener = urlopen.build_opener(); opener.addheaders = [('User-Agent', 'Chrome')]; urlopen.install_opener(opener)` – Richelieu 20/4, 2023 at 18:4

Python code snippet to download a file from a URL and save it under its original name:

import requests

url = 'http://google.com/favicon.ico'
filename = url.split('/')[-1]
r = requests.get(url, allow_redirects=True)
with open(filename, 'wb') as f:  # the with block closes the file after writing
    f.write(r.content)

Xochitlxp answered 22/5, 2018 at 12:35 Comment(0)

import random
import urllib.request

def download_image(url):
    name = random.randrange(1,100)
    fullname = str(name)+".jpg"
    urllib.request.urlretrieve(url,fullname)     
download_image("http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg")

Upolu answered 9/9, 2018 at 14:17 Comment(2)

Welcome to Stackoverflow and thanks for your contribution! Could you add an explanation to your answer what the code does and why it works? Thanks! – Squamous 9/9, 2018 at 14:40

How do I add the headers for url in urlretrieve? I had a problem with images opening in the browser but not through code using urlretrive. I have tried urlopen but I don't know how to download the image using urlopen. – Transitive 27/3, 2019 at 14:38
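
On the question of headers: urlretrieve itself takes no headers argument, but urllib.request.Request does, and the response returned by urlopen can be copied to a file. The sketch below assumes the server only needs a browser-like User-Agent; the function name fetch_with_headers and the default header value are illustrative.

```python
import shutil
import urllib.request


def fetch_with_headers(url, dest, user_agent="Mozilla/5.0"):
    """Download url to dest, sending a custom User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response, open(dest, "wb") as out:
        # Copy the response body to the file in chunks.
        shutil.copyfileobj(response, out)
    return dest
```

Unlike install_opener, this sets the header for one request only and does not change global state.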

You can pick any arbitrary image from Google Images, copy the URL, and use the following approach to download the image. Note that the extension isn't always included in the URL, as some of the other answers seem to assume. You can automatically detect the correct extension using imghdr, which is included with Python 3.9 (the module was deprecated in Python 3.11 and removed in 3.13).

import requests, imghdr

gif_url = 'https://media.tenor.com/images/eff22afc2220e9df92a7aa2f53948f9f/tenor.gif'
img_url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQwXRq7zbWry0MyqWq1Rbq12g_oL-uOoxo4Yw&usqp=CAU'
for url, save_basename in [
    (gif_url, 'gif_download_test'),
    (img_url, 'img_download_test')
]:
    response = requests.get(url)
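    response.raise_for_status()  # raises requests.HTTPError on a 4xx/5xx status
    # imghdr.what returns None for unrecognized data, so fall back to a generic extension
    extension = imghdr.what(file=None, h=response.content) or 'bin'
    save_path = f"{save_basename}.{extension}"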
    with open(save_path, 'wb') as f:
        f.write(response.content)
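
Where imghdr is not available, or as a cross-check, the extension can also be derived from the Content-Type header the server sends. This is only a sketch: it trusts the server's header, and the helper name extension_for and the ".bin" fallback are assumptions for illustration.

```python
import mimetypes


def extension_for(content_type, default=".bin"):
    """Map a Content-Type header value to a file extension (illustrative helper)."""
    if not content_type:
        return default
    # Drop parameters such as "; charset=..." and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if not mime.startswith("image/"):
        return default  # not an image, e.g. an HTML error page
    return mimetypes.guess_extension(mime) or default
```

With requests, the header value would come from response.headers.get('Content-Type').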

Swanky answered 5/7, 2022 at 10:7 Comment(3)

This seems like the most upvoted answer, except with extra steps, doesn't it? – Digitalism 6/7, 2022 at 11:49

The extra step is there to determine the correct file extension. The most upvoted answer doesn't do this. People were asking why the most upvoted answer doesn't work for all image urls. It's because you can't always assume that the image is jpg. You can't save a jpg image as png, and you can't save a png image as jpg. This is a problem when you don't know the correct extension beforehand. As an example, try downloading this image and see what happens: encrypted-tbn0.gstatic.com/… – Swanky 12/7, 2022 at 5:24

Fair point, that makes sense. Thanks! – Digitalism 12/7, 2022 at 5:38

On Linux, you can also use the wget command:

import os
url1 = 'YOUR_URL_WHATEVER'
os.system('wget {}'.format(url1))

Intravenous answered 22/2, 2019 at 7:52 Comment(2)

That gives me an empty image for the following URL: genome.jp/pathway/ko02024+K07173 Any idea how to fix this? – Keys 17/10, 2021 at 19:51

@Keys That's because the url you provided doesn't belong to an image. Try it with url1 = 'https://www.genome.jp/tmp/mark_pathway1641220140108369/ko02024.png' in this case – Finance 3/1, 2022 at 14:32

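If you are wondering how to get the image extension, you can take it from the path component of the image URL:

import os
import requests
from urllib.parse import urlparse

# splitext on the URL path is more robust than counting dots in the whole URL
img_ext = os.path.splitext(urlparse(img_url).path)[1]  # e.g. '.jpg' for www.bigbasket.com/patanjali-atta.jpg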
img_data = requests.get(img_url).content
with open(img_name + img_ext, 'wb') as handler:
    handler.write(img_data)

Oppose answered 4/12, 2019 at 7:3 Comment(0)

Download and save an image to a directory:

import os
import requests

# Browser-like headers; some servers reject requests without a User-Agent
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0",
           "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
           "Accept-Language": "en-US,en;q=0.9"
           }

img_data = requests.get(url=image_url, headers=headers).content
# create_dir() is the directory helper defined further down in this answer
with open(os.path.join(create_dir(), 'image_name.png'), 'wb') as handler:
    handler.write(img_data)

To create the directory:

def create_dir():
    # Directory
    dir_ = "CountryFlags"
    # Parent Directory path
    parent_dir = os.path.dirname(os.path.realpath(__file__))
    # Path
    path = os.path.join(parent_dir, dir_)
    os.makedirs(path, exist_ok=True)  # does not fail if the directory already exists
    return path

Adessive answered 2/10, 2021 at 19:44 Comment(0)

If you want to stick to two lines:

import os
import requests
with open(os.path.join(dir_path, url[0]), 'wb') as f:
    f.write(requests.get(new_url).content)

Millstream answered 7/11, 2022 at 1:46 Comment(1)

What are you importing? os does not exist – Comedy 19/11, 2023 at 17:6
