Urlretrieve and User-Agent? - Python
Asked Answered
B

5

18

I'm using urlretrieve from the urllib module.

I cannot seem to find how to add a User-Agent description to my requests.


Is it possible with urlretrieve? or do I need to use another method?

Bullnose answered 2/3, 2010 at 16:8 Comment(0)
P
9

First, set version:

urllib.URLopener.version = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36 SE 2.X MetaSr 1.0'

Then:

filename, headers = urllib.urlretrieve(url)
Pronty answered 22/8, 2015 at 10:37 Comment(0)
Q
6

I know this issue had been there for 7 years. And I reached this issue by trying to figure out how to change the User-Agent while using urlretrieve function.

To anyone who reached this issue by no luck, here is how I did:

    # proxy = ProxyHandler({'http': 'http://192.168.1.31:8888'})
    proxy = ProxyHandler({})
    opener = build_opener(proxy)
    opener.addheaders = [('User-Agent','Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.1 Safari/603.1.30')]
    install_opener(opener)

    result = urlretrieve(url=file_url, filename=file_name)

The reason I added proxy is to monitor the traffic in Charles, and here is the traffic I got:

See the User-Agent

Quotation answered 19/4, 2017 at 16:39 Comment(0)
W
5

You can use URLopener or FancyURLopener classes. The 'version' argument specifies the user agent of the opener object.

opener = FancyURLopener({}) 
opener.version = 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.69 Safari/537.36'
opener.retrieve('http://example.com', 'index.html')
Womanly answered 14/8, 2011 at 19:41 Comment(0)
A
4

Something like this not using urllib tho, worked for me for an scraper

import requests

imageURL='http://image.jpg'
headers={'user-agent': 'Mozilla/5.0'}
r=requests.get(imageURL, headers=headers)
with open('image.jpg', 'wb') as f:
    f.write(r.content)
Associative answered 11/2, 2021 at 11:48 Comment(0)
S
2

I don't think it's possible with urlretrieve - at least not easily. I would propose to create an urllib2.Request object and pass the required headers to it. See

http://docs.python.org/library/urllib2.html#urllib2.urlopen

for examples.

Scleritis answered 2/3, 2010 at 16:14 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.