python save image from url
W

10

127

I have a problem saving an image from a URL in Python, using either a urllib2 request or urllib.urlretrieve. The image URL is valid, and I can download it manually in a browser. However, when I download the image with Python, the resulting file cannot be opened. I use Preview on Mac OS to view the image. Thank you!

UPDATE:

The code is as follows:

def downloadImage(self):
    request = urllib2.Request(self.url)
    pic = urllib2.urlopen(request)
    print "downloading: " + self.url
    print self.fileName
    filePath = localSaveRoot + self.catalog  + self.fileName + Picture.postfix
    # urllib.urlretrieve(self.url, filePath)
    with open(filePath, 'wb') as localFile:
        localFile.write(pic.read())

The image URL that I want to download is http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg

This URL is valid and I can save the image through the browser, but the Python code downloads a file that cannot be opened. Preview says "It may be damaged or use a file format that Preview doesn't recognize." I compared the image downloaded by Python with the one downloaded manually through the browser. The former is several bytes smaller, so the file seems to be incomplete, but I don't know why Python cannot download it completely.

Woollyheaded answered 14/5, 2015 at 4:28 Comment(4)
Why can't it be opened? What error do you get? What does file <filename> tell you? Did the file download correctly or were you blocked by User-Agent or Cookie restrictions or similar?Johannajohannah
Include the python code you are trying in the question pleaseHartford
Sorry for the confusion. I have provided more details. Thanks a lot. I wonder whether it is because the HTTP request made by Python differs from the one made by a browser, so Python cannot fetch the complete image file.Woollyheaded
It seems that requests is a much better module than urllib and urllib2Woollyheaded
P
97

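Sample code that works for me on Windows:

import requests

# pic_url is the URL of the image to download.
with open('pic1.jpg', 'wb') as handle:
    # stream=True defers the body download so it can be read in blocks.
    response = requests.get(pic_url, stream=True)

    if not response.ok:
        print(response)

    # Write the body to the file in 1024-byte blocks.
    for block in response.iter_content(1024):
        if not block:
            break

        handle.write(block)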
Poolroom answered 14/5, 2015 at 4:34 Comment(6)
That's perfect! Thank you so much! I don't know why requests module could complete that while urllib and urllib2 cannot do that, but anyways.Woollyheaded
It does not work for the following URL; any idea how to fix it? genome.jp/pathway/ko02024+K07173Keys
@Keys That's not an imageOrdinarily
This saves the image to a folder, but when I open the image it says that windows does not support the file format, despite it being a simple jpg. Anyone who knows why?Acalia
@Acalia this can occur when you try to save a non-jpg as a jpgPsilomelane
How did you decide on what content size to use? I.e., what's special about 1024, or is it an arbitrary value?Nightcap
M
221
import requests

img_data = requests.get(image_url).content
with open('image_name.jpg', 'wb') as handler:
    handler.write(img_data)
Mohawk answered 14/6, 2016 at 20:30 Comment(8)
@vlad what if we are not aware of the image extension from the URL but we know it is an image?Tanishatanitansy
@MonaJalal you don't have to specify an extension, as long as you have valid qualified URL address.Mohawk
Run pip install requests if you don't have it installed.Quiver
Using '.content' after requests.get() is the key to saving an image.Sodom
It does not work for the following URL; any idea how to fix it? genome.jp/pathway/ko02024+K07173Keys
@VladBezden - This saves the image to a folder, but when I open the image it says that windows does not support the file format, despite it being a simple jpg. Do you know why?Acalia
When downloading a webp, the files are corrupted somehow. Using ffprobe, I am told missing RIFF tag and Could not find codec parameters for stream 0 (Video: webp, none):Colpin
Note that this downloads the whole image to memory first and then writes it to a file. If you want to stream the data directly to a file use e.g. this answerSugary
A
38

This is the simplest way to download and save an image from the internet, using the urllib.request package.

Simply pass the image URL (the location to download the image from) and the local path (where to save the downloaded image, including an image name ending in .jpg or .png). Here I used "local-filename.jpg"; replace it with your own name.

Python 3

import urllib.request
imgURL = "http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg"

urllib.request.urlretrieve(imgURL, "D:/abc/image/local-filename.jpg")

You can download multiple images as well if you have all the image URLs. Just pass those image URLs in a for loop, and the code downloads each image in turn.
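The loop described above could look roughly like the following. This is only a sketch: the helper name download_all and the rule of naming each file after the last segment of its URL path are assumptions for illustration, not part of the original answer.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_all(urls, target_dir):
    """Download each URL into target_dir and return the saved paths.

    Hypothetical helper; file names come from the last URL path segment.
    """
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        # Fall back to a numbered name when the URL path has no file name.
        name = os.path.basename(urlparse(url).path) or f"image_{index}"
        path = os.path.join(target_dir, name)
        urllib.request.urlretrieve(url, path)
        saved.append(path)
    return saved
```

Two URLs that end in the same file name would overwrite each other here, so a real script may want to make the names unique.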

Anyways answered 1/1, 2020 at 4:30 Comment(2)
I tried this but I get an error: HTTPError: Forbidden. Do you know why this is? I'm using this URL: assets.ellosgroup.com/i/ellos/ell_1682670-01_Fs.Acalia
@Parseval, adding this code fixed it for me (a User-Agent was needed): import urllib.request as urlopen; opener = urlopen.build_opener(); opener.addheaders = [('User-Agent', 'Chrome')]; urlopen.install_opener(opener)Richelieu
X
19

Python code snippet to download a file from a URL and save it under its own name:

import requests

url = 'http://google.com/favicon.ico'
# Use the last path segment of the URL as the local file name.
filename = url.split('/')[-1]
r = requests.get(url, allow_redirects=True)
with open(filename, 'wb') as f:
    f.write(r.content)
Xochitlxp answered 22/5, 2018 at 12:35 Comment(0)
U
8
import random
import urllib.request

def download_image(url):
    name = random.randrange(1,100)
    fullname = str(name)+".jpg"
    urllib.request.urlretrieve(url,fullname)     
download_image("http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg")
Upolu answered 9/9, 2018 at 14:17 Comment(2)
Welcome to Stackoverflow and thanks for your contribution! Could you add an explanation to your answer what the code does and why it works? Thanks!Squamous
How do I add headers for the URL in urlretrieve? I had a problem with images that open in the browser but not through code using urlretrieve. I have tried urlopen, but I don't know how to download the image with it.Transitive
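Regarding the question about headers: urlretrieve itself takes no headers argument, but urllib.request.Request does, and the response returned by urlopen can be copied straight to a file. Below is a minimal sketch assuming Python 3; the function name download_with_headers and the default User-Agent value are hypothetical, and whether a given site accepts that value is not guaranteed.

```python
import shutil
import urllib.request


def download_with_headers(url, dest_path, user_agent="Mozilla/5.0"):
    """Save the resource at url to dest_path, sending a User-Agent header.

    Hypothetical helper; the default User-Agent string is only an example.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        # Copy in chunks so the whole image never has to sit in memory.
        shutil.copyfileobj(response, out)
    return dest_path
```

If the server still refuses the request, urlopen raises urllib.error.HTTPError, which at least makes the failure visible instead of leaving a broken file behind.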
S
3

You can pick any arbitrary image from Google Images, copy the URL, and use the following approach to download the image. Note that the extension isn't always included in the URL, as some of the other answers seem to assume. You can automatically detect the correct extension using imghdr, which is included with Python 3.9 (the module was deprecated in Python 3.11 and removed in Python 3.13).

import requests, imghdr

gif_url = 'https://media.tenor.com/images/eff22afc2220e9df92a7aa2f53948f9f/tenor.gif'
img_url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQwXRq7zbWry0MyqWq1Rbq12g_oL-uOoxo4Yw&usqp=CAU'
for url, save_basename in [
    (gif_url, 'gif_download_test'),
    (img_url, 'img_download_test')
]:
    response = requests.get(url)
    if response.status_code != 200:
        raise RuntimeError(f"Download failed: HTTP {response.status_code}")
    extension = imghdr.what(file=None, h=response.content)
    save_path = f"{save_basename}.{extension}"
    with open(save_path, 'wb') as f:
        f.write(response.content)
Swanky answered 5/7, 2022 at 10:7 Comment(3)
This seems like the most upvoted answer, except with extra steps, doesn't it?Digitalism
The extra step is there to determine the correct file extension. The most upvoted answer doesn't do this. People were asking why the most upvoted answer doesn't work for all image urls. It's because you can't always assume that the image is jpg. You can't save a jpg image as png, and you can't save a png image as jpg. This is a problem when you don't know the correct extension beforehand. As an example, try downloading this image and see what happens: encrypted-tbn0.gstatic.com/…Swanky
Fair point, that makes sense. Thanks!Digitalism
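Since imghdr is no longer available from Python 3.13 onward, the same idea can be approximated by checking the first bytes of the downloaded data against well-known file signatures. The sketch below is not the answer's code: the function name sniff_image_extension is hypothetical, and it only covers JPEG, PNG, GIF and WebP.

```python
def sniff_image_extension(data):
    """Guess a file extension from the leading bytes of image data.

    Hypothetical helper covering only JPEG, PNG, GIF and WebP.
    Returns None when no known signature matches.
    """
    if data.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

A None result is also a useful diagnostic: it usually means the server returned something other than an image, such as an HTML error page.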
I
2

On Linux, you can use the wget command:

import subprocess
url1 = 'YOUR_URL_WHATEVER'
# Pass the URL as a separate argument so the shell never interprets it.
subprocess.run(['wget', url1], check=True)
Intravenous answered 22/2, 2019 at 7:52 Comment(2)
That gives me an empty image for the following URL: genome.jp/pathway/ko02024+K07173 Any idea how to fix this?Keys
@Keys That's because the url you provided doesn't belong to an image. Try it with url1 = 'https://www.genome.jp/tmp/mark_pathway1641220140108369/ko02024.png' in this caseFinance
O
1

If you are wondering how to get the image extension, you can try the rsplit method of the string on the image URL:

import requests

# Take the text after the last dot, e.g. 'jpg' for www.bigbasket.com/patanjali-atta.jpg
img_ext = '.' + str(img_url).rsplit('.', 1)[-1]
img_data = requests.get(img_url).content
with open(img_name + img_ext, 'wb') as handler:
    handler.write(img_data)
Oppose answered 4/12, 2019 at 7:3 Comment(0)
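Splitting on dots breaks down when the URL carries a query string or has no extension at all. A somewhat more robust sketch, assuming only the standard library, parses the URL first and looks at the path alone; the function name extension_from_url is hypothetical.

```python
import os
from urllib.parse import urlparse


def extension_from_url(url):
    """Return the lowercased extension of the URL's path, or '' if none.

    Hypothetical helper; the query string and fragment are ignored.
    """
    path = urlparse(url).path
    return os.path.splitext(path)[1].lower()
```

An empty result means the URL gives no hint, in which case the extension has to come from the downloaded data or the response headers instead.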
A
1

Download and save an image to a directory:

import requests

headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0",
           "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
           "Accept-Language": "en-US,en;q=0.9"
           }

img_data = requests.get(url=image_url, headers=headers).content
with open(create_dir() + "/" + 'image_name' + '.png', 'wb') as handler:
    handler.write(img_data)

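To create the directory:

import os

def create_dir():
    # Directory name
    dir_ = "CountryFlags"
    # Parent directory: the folder containing this script
    parent_dir = os.path.dirname(os.path.realpath(__file__))
    path = os.path.join(parent_dir, dir_)
    # exist_ok=True avoids an error when the directory already exists
    os.makedirs(path, exist_ok=True)
    return path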
Adessive answered 2/10, 2021 at 19:44 Comment(0)
M
0

If you want to stick to two lines:

with open(os.path.join(dir_path, url[0]), 'wb') as f:
    f.write(requests.get(new_url).content)
Millstream answered 7/11, 2022 at 1:46 Comment(1)
What are you importing? os does not exist.Comedy

© 2022 - 2024 — McMap. All rights reserved.