Using an HTTP PROXY - Python [duplicate]

I'm familiar with the fact that I should set the HTTP_PROXY environment variable to the proxy address.
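
A quick way to check which proxy settings Python actually picks up from the environment is urllib.getproxies (a standard-library function; the address it prints is whatever your own variables contain):

import urllib

# Prints the mapping read from HTTP_PROXY and related variables,
# e.g. {'http': 'http://proxy.example.com:8080'}
print urllib.getproxies()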

urllib generally works fine; the problem is dealing with urllib2.

>>> urllib2.urlopen("http://www.google.com").read()

raises

urllib2.URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>

or

urllib2.URLError: <urlopen error [Errno 11004] getaddrinfo failed>

Extra info:

urllib.urlopen(....) works fine! It is just urllib2 that is playing tricks...

I tried @Fenikso's answer, but now I'm getting this error:

URLError: <urlopen error [Errno 10060] A connection attempt failed because the 
connected party did not properly respond after a period of time, or established
connection failed because connected host has failed to respond>      

Any ideas?

Antique answered 11/4, 2011 at 10:53 Comment(3)
Can you post the actual whole sample code which gives you the error? – Teetotalism
@Fenikso: this: urllib2.urlopen("http://www.google.com").read() – Antique
So you have the proxy server set in the HTTP_PROXY environment variable? Are you sure that server accepts the connection? – Teetotalism

You can do it even without the HTTP_PROXY environment variable. Try this sample:

import urllib2

proxy_support = urllib2.ProxyHandler({"http":"http://61.233.25.166:80"})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)

html = urllib2.urlopen("http://www.google.com").read()
print html
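
Note that urllib2.install_opener makes this opener the default for every subsequent urllib2.urlopen call in the process; if you only want the proxy for some requests, call opener.open(url) directly instead.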

In your case it really seems that the proxy server is refusing the connection.


Something more to try:

import urllib2

#proxy = "61.233.25.166:80"
proxy = "YOUR_PROXY_GOES_HERE"

proxies = {"http":"http://%s" % proxy}
url = "http://www.google.com/search?q=test"
headers={'User-agent' : 'Mozilla/5.0'}

proxy_support = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1))
urllib2.install_opener(opener)

req = urllib2.Request(url, None, headers)
html = urllib2.urlopen(req).read()
print html
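
With debuglevel=1, the HTTPHandler prints the outgoing request and the response status line and headers to stdout, so you can see whether the request actually reaches the proxy and what the proxy answers.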

Edit 2014: This seems to be a popular question/answer. However, today I would use the third-party requests module instead.

For one request just do:

import requests

r = requests.get("http://www.google.com", 
                 proxies={"http": "http://61.233.25.166:80"})
print(r.text)

For multiple requests, use a Session object so you do not have to add the proxies parameter to every request:

import requests

s = requests.Session()
s.proxies = {"http": "http://61.233.25.166:80"}

r = s.get("http://www.google.com")
print(r.text)
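
One thing worth knowing about the proxies dict: its keys are URL schemes, and requests only routes a request through the proxy whose key matches the scheme of the URL being fetched. An https URL therefore bypasses a dict that only defines "http", which can make a bogus proxy entry look like it "works". A sketch that covers both schemes (same placeholder address as above; the proxy must support CONNECT for https):

import requests

s = requests.Session()
s.proxies = {
    "http": "http://61.233.25.166:80",   # used for http:// URLs
    "https": "http://61.233.25.166:80",  # used for https:// URLs
}
r = s.get("https://www.google.com")
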
Teetotalism answered 11/4, 2011 at 11:26 Comment(11)
Thanks for the reply! :) Now I'm getting URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>... urllib works perfectly though. – Antique
@Antique - Works fine on my system. Do you have any proxy you have to use for internet access? – Teetotalism
@Antique - Also, what type of proxy do you use? – Teetotalism
@Fenikso: I do have to use an HTTP proxy for internet access, and it is the same one I use for all my software. It is the same proxy I have set in the HTTP_PROXY variable. – Antique
@Antique - Try setting another user-agent and switch debug mode on. I have updated my answer. – Teetotalism
@Antique - So was it the proxy refusing the connection because of the user-agent? – Teetotalism
I thought this was working for me, but when I tried putting random information in proxies, data was still retrieved each time (as long as https was used) – Adiaphorism
Is there a proxy for FTP? – Goolsby
@Fenikso, what is the "http": "http://61.233.25.166:80" entry in the proxies argument? Is it going to be my IP address? – Cr
@voo_doo it is the address and port of the proxy you want to use. – Teetotalism
Did you IP-bind the proxy, or is the proxy allowing your access? – Kotz

I recommend you just use the requests module.

It is much easier than the built-in HTTP clients: http://docs.python-requests.org/en/latest/index.html

Sample usage:

import requests

r = requests.get('http://www.thepage.com', proxies={"http": "http://myproxy:3129"})
thedata = r.content
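
requests also takes a per-call timeout parameter (in seconds), which is how you would bound the kind of hang described in the question; a sketch with the same placeholder proxy:

import requests

# timeout is in seconds; requests.exceptions.Timeout is raised when it elapses
r = requests.get('http://www.thepage.com',
                 proxies={"http": "http://myproxy:3129"},
                 timeout=10)
thedata = r.content
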
Toy answered 3/12, 2011 at 23:8 Comment(3)
How do you set the timeout? – Breannabreanne
Wonderful. This works with both https and http, whereas urllib only works with http for me with python3. – Safire
I thought this was working for me, but when I tried putting in random proxy information, data was still retrieved each time (as long as https was used) – Adiaphorism

Just wanted to mention that you may also have to set the https_proxy OS environment variable in case https URLs need to be accessed. In my case it was not obvious, and I tried for hours to discover this.

My use case: Win 7, jython-standalone-2.5.3.jar, setuptools installation via ez_setup.py
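
A minimal sketch of setting both variables from inside Python before any URL is opened (the address is a placeholder; on Windows the same can be done with set in the shell):

import os

# Placeholder proxy address - replace with your real proxy.
# http_proxy covers http:// URLs, https_proxy covers https:// URLs.
os.environ["http_proxy"] = "http://proxy.example.com:8080"
os.environ["https_proxy"] = "http://proxy.example.com:8080"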

Benzoate answered 25/11, 2013 at 9:23 Comment(0)

Python 3:

import urllib.request

url = "http://www.google.com"  # any target URL
htmlsource = urllib.request.FancyURLopener({"http": "http://127.0.0.1:8080"}).open(url).read().decode("utf-8")
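
As the comment below points out, FancyURLopener is deprecated; a sketch of the same fetch with the newer urlopen-style API, assuming the same local proxy:

import urllib.request

# Build an opener that routes http traffic through the proxy, then fetch.
proxy_support = urllib.request.ProxyHandler({"http": "http://127.0.0.1:8080"})
opener = urllib.request.build_opener(proxy_support)
htmlsource = opener.open("http://www.google.com").read().decode("utf-8")
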
Ralline answered 26/6, 2013 at 13:54 Comment(1)
From the traceback: DeprecationWarning: FancyURLopener style of invoking requests is deprecated. Use newer urlopen functions/methods. – Flaming

I encountered this on a Jython client. The server was only talking TLS while the client was using an SSL context:

javax.net.ssl.SSLContext.getInstance("SSL")

Once the client was switched to TLS, things started working.
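
In Jython that is a one-line change to the same Java API; a sketch (the resulting context would then back the client's socket factory):

from javax.net.ssl import SSLContext

# Request a TLS context instead of the legacy "SSL" protocol
ctx = SSLContext.getInstance("TLS")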

Laplante answered 10/11, 2014 at 9:38 Comment(0)
