How can I use a SOCKS 4/5 proxy with urllib2?

3

48

How can I use a SOCKS 4/5 proxy with urllib2 to download a web page?

Menam answered 23/2, 2010 at 11:55 Comment(1)
Related for Tor: #1096879 - Felicitous
68

You can use the SocksiPy module. Copy the file "socks.py" to your Python's lib/site-packages directory and you're ready to go.

You must set up socks before importing urllib2. (Alternatively, install the PySocks fork with pip install PySocks.)

For example:

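import socks
import socket

# Route every new socket through the SOCKS5 proxy at 127.0.0.1:8080.
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 8080)
socket.socket = socks.socksocket

# Import urllib2 only after socket.socket has been replaced.
import urllib2
print urllib2.urlopen('http://www.google.com').read()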
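
For readers on Python 3, where urllib2 became urllib.request, the same idea might look like the sketch below. It assumes the PySocks package is installed; the helper names use_socks5_proxy and fetch are made up for illustration. As in the snippet above, the socket class is replaced process-wide, so every later connection is affected.

```python
import socket
import urllib.request


def use_socks5_proxy(host, port):
    """Route new sockets through a SOCKS5 proxy (assumes PySocks is installed)."""
    import socks  # third-party; imported here so the sketch loads without it

    socks.set_default_proxy(socks.SOCKS5, host, port)
    # Replace the socket class process-wide, as in the Python 2 snippet above.
    socket.socket = socks.socksocket


def fetch(url):
    """Download a page; uses the proxy once use_socks5_proxy() has been called."""
    with urllib.request.urlopen(url) as response:
        return response.read()
```

Typical use would be use_socks5_proxy("127.0.0.1", 8080) followed by fetch("http://www.google.com").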

You can also try the pycurl library or tsocks.

Erythrism answered 26/2, 2010 at 3:6 Comment(6)
One issue with that: the DNS lookup by urllib doesn't seem to go through the proxy (even with the rdns option and the SOCKS4 type). - Hut
Just want to note that SocksiPy on SourceForge has some nasty bugs. At minimum, use the fork at code.google.com/p/socksipy-branch instead. Since the project appears abandoned, IMO someone should take that branch, change the name, and write a blog post so people don't continue to use this buggy (and IMO not wonderfully written) lib. - Dupin
I know this is old, but what is wrong with the original SocksiPy? What bugs does it have? - Amphiaster
Can't download SocksiPy anymore from your link. - Selfesteem
@Hut there is another answer here, https://mcmap.net/q/341052/-dns-over-proxy , which also makes the host name lookups go over the SOCKS proxy. - Selfappointed
Looks like the latest fork of SocksiPy is now here: github.com/Anorov/PySocks - Bainbrudge
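
On the DNS point raised in the comments above: at the protocol level, a SOCKS5 CONNECT request (RFC 1928) carries either an already-resolved IP address or the bare host name, and only in the second case does the proxy do the lookup. The sketch below just builds the request bytes to show the difference. The function name and the remote_dns flag are made up for illustration, and whether a given library and urllib combination actually sends the host name is something to verify separately.

```python
import socket
import struct


def socks5_connect_request(host, port, remote_dns=True):
    """Build the bytes of a SOCKS5 CONNECT request (RFC 1928, section 4)."""
    header = b"\x05\x01\x00"  # VER=5, CMD=CONNECT, RSV=0
    if remote_dns:
        name = host.encode("idna")
        # ATYP=3: length-prefixed domain name, resolved by the proxy.
        addr = b"\x03" + bytes([len(name)]) + name
    else:
        # ATYP=1: IPv4 address, resolved on the client before it reaches the proxy.
        addr = b"\x01" + socket.inet_aton(socket.gethostbyname(host))
    # The destination port is two bytes in network byte order.
    return header + addr + struct.pack(">H", port)
```

If the client resolves the name itself and only hands the proxy an IP address, the DNS query never passes through the proxy, which matches the behaviour described in the first comment.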
21

Adding an alternative to pan's answer for when you need to use many different proxies at the same time.

In that case you need to create an opener, as you would with an HTTP proxy. Code for this is available on GitHub: https://gist.github.com/869791

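import socks, urllib2
from socksipyhandler import SocksiPyHandler  # assumes the gist is saved as socksipyhandler.py
opener = urllib2.build_opener(SocksiPyHandler(socks.PROXY_TYPE_SOCKS4, 'localhost', 9999))
print opener.open('http://www.whatismyip.com/automation/n09230945.asp').read()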
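
To give a rough idea of how such a handler can work, here is a minimal Python 3 sketch (urllib.request instead of urllib2). It is not the code from the gist: the names FactoryConnection, FactoryHandler and socks_factory are invented for illustration, it covers plain HTTP only, and the PySocks part assumes that package is installed. The point is that each opener carries its own way of creating sockets, so nothing is patched globally.

```python
import http.client
import urllib.request


class FactoryConnection(http.client.HTTPConnection):
    """HTTPConnection that gets its socket from a caller-supplied factory."""

    def __init__(self, socket_factory, *args, **kwargs):
        self._socket_factory = socket_factory
        super().__init__(*args, **kwargs)

    def connect(self):
        # The factory must return a socket already connected to (host, port),
        # for example one tunnelled through a SOCKS proxy.
        self.sock = self._socket_factory(self.host, self.port, self.timeout)


class FactoryHandler(urllib.request.HTTPHandler):
    """Handler for one opener; different openers can use different factories."""

    def __init__(self, socket_factory):
        super().__init__()
        self._socket_factory = socket_factory

    def http_open(self, req):
        def build(host, **kwargs):
            return FactoryConnection(self._socket_factory, host, **kwargs)
        return self.do_open(build, req)


def socks_factory(proxy_type, proxy_host, proxy_port):
    """Return a socket factory bound to one proxy (assumes PySocks is installed)."""
    import socks  # third-party; imported lazily so the sketch loads without it

    def factory(host, port, timeout):
        sock = socks.socksocket()
        sock.set_proxy(proxy_type, proxy_host, proxy_port)
        # urllib passes a sentinel object when no timeout was requested.
        if isinstance(timeout, (int, float)):
            sock.settimeout(timeout)
        sock.connect((host, port))
        return sock

    return factory
```

With PySocks available, an opener per proxy could then be built as urllib.request.build_opener(FactoryHandler(socks_factory(socks.SOCKS4, "localhost", 9999))), with a second opener pointing at a different proxy.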
Allness answered 11/11, 2011 at 22:44 Comment(2)
Hey, I was using the code from GitHub. Unfortunately, the authentication doesn't work. I passed the right username and password to socksipyhandler.py, but I get the error (3, 'unknown username or invalid password'). I can confirm that my username and password work, since my cURL command works with the same credentials. - Generally
Never mind, figured out the issue: there was a typo in socks.py =). BTW, great work. Thanks a ton! - Generally
4

Since SOCKS is a socket-level proxy, you have to replace the socket object used by urllib2. Please take a look at this solution. If monkey patching is not good enough for you, you can subclass or copy and modify the code from the urllib2 standard library.
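
If the global nature of the monkey patch is the main worry, one possible middle ground is to apply it only for the duration of a block. The sketch below (Python 3) uses a hypothetical helper, patched_socket, and a stand-in class, TracingSocket, in place of a real SOCKS-aware class such as socks.socksocket, so it runs without any third-party package.

```python
import contextlib
import socket


@contextlib.contextmanager
def patched_socket(replacement):
    """Temporarily make socket.socket point at `replacement`."""
    original = socket.socket
    socket.socket = replacement
    try:
        yield
    finally:
        # Always restore, so unrelated code gets the normal socket back.
        socket.socket = original


class TracingSocket(socket.socket):
    """Hypothetical stand-in for a SOCKS-aware socket class."""

    created = 0  # counts how many sockets were built through this class

    def __init__(self, *args, **kwargs):
        TracingSocket.created += 1
        super().__init__(*args, **kwargs)


with patched_socket(TracingSocket):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # a TracingSocket
    s.close()
```

Note that the patch still affects other threads while the block is active, and code that did "from socket import socket" before the patch keeps its own reference to the original class.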

Counterplot answered 23/2, 2010 at 13:50 Comment(0)
