Python urllib and urllib2 not opening localhost URLs?

In Python I can use urllib2 (and urllib) to open external URLs such as Google. However, I am hitting issues when opening localhost URLs. I have a python SimpleHTTPServer running on port 8280 which I can browse to successfully using http://localhost:8280/.

python -m SimpleHTTPServer 8280

It's also worth noting that I'm running Ubuntu, with CNTLM handling authentication to our corporate web proxy. wget doesn't work with localhost either, so I don't think this is a urllib issue!

Test Script (test_urllib2.py):

import urllib2

print "Opening Google..."
google = urllib2.urlopen("http://www.google.com/")
print google.read(100)
print "Google opened."

print "Opening localhost..."
localhost = urllib2.urlopen("http://localhost:8280/")
print localhost.read(100)
print "localhost opened."

Output:

$ ./test_urllib2.py 
Opening Google...
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><
Google opened.
Opening localhost...
Traceback (most recent call last):
  File "./test_urllib2.py", line 10, in <module>
    localhost = urllib2.urlopen("http://localhost:8280/")
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 397, in open
    response = meth(req, response)
  File "/usr/lib/python2.6/urllib2.py", line 510, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.6/urllib2.py", line 429, in error
    result = self._call_chain(*args)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 605, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1161, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1134, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 986, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 355, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine

SOLUTION: The problem was indeed caused by using CNTLM behind our corporate web proxy (I can't be sure exactly why). The solution was to pass an empty ProxyHandler, so that urllib2 uses no proxy at all:

proxy_support = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy_support)
print opener.open("http://localhost:8280/").read(100)

Thanks to loki2302 for pointing me here.
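
For readers on Python 3, where urllib2 was folded into urllib.request, the same idea can be sketched as below. This is only an illustration of the fix above, not a tested replacement for it; the helper name open_without_proxy and the timeout value are assumptions of this sketch.

```python
import urllib.request


def open_without_proxy(url, timeout=5):
    """Open url while ignoring any proxy configured in the environment.

    Hypothetical helper: the name and the default timeout are choices
    made for this sketch, not part of the original solution.
    """
    # An empty mapping tells ProxyHandler to use no proxies, instead of
    # picking up http_proxy / https_proxy from the environment.
    no_proxy_handler = urllib.request.ProxyHandler({})
    opener = urllib.request.build_opener(no_proxy_handler)
    return opener.open(url, timeout=timeout)
```

With the SimpleHTTPServer from the question running, open_without_proxy("http://localhost:8280/").read(100) should return the first 100 bytes of the page as a bytes object.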

Frederigo answered 10/8, 2011 at 14:12 Comment(3)
don't use except: without an Exception and please show us the exception raised by urllib2.urlopen. - Gresham
#202015 - Nutting
The BadStatusLine exception suggests a malformed response header from the server. Could you have a peek and see what's being returned? - Bridesmaid

Check whether the problem is really in opening localhost, or whether JBoss gives an invalid response (that the browser somehow works around):

  1. try using http://127.0.0.1:8280/ instead of "localhost:8280" (if that works, it's a DNS problem)
  2. use curl or wget to test JBoss works: wget http://localhost:8280/
  3. you can try running a simple Python HTTP server to test against something other than JBoss:

    python -m SimpleHTTPServer 8280
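
The first check in the list above can also be done at the socket level, which takes HTTP proxy settings out of the picture entirely. A minimal sketch in Python 3, assuming the hypothetical helper names can_connect and compare_hosts:

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a direct TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the name and connects directly,
        # so http_proxy and similar settings play no part in the result.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def compare_hosts(port, hosts=("localhost", "127.0.0.1")):
    """Map each host name to whether a direct connection to port works."""
    return {host: can_connect(host, port) for host in hosts}
```

If compare_hosts(8280) reports True for both names while urllib2 and wget still fail, the server and name resolution are probably fine and something between the client and the server, such as a proxy, is the more likely cause.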
    
Fonseca answered 11/8, 2011 at 11:18 Comment(4)
Excellent idea. Wget doesn't work! I'm using Ubuntu which has CNTLM running to handle authentication to our corporate web proxy, so this must be the root of the problem. I've updated my question accordingly. Any ideas? - Frederigo
Sounds like you've got a proxy set that also gets used for localhost/127.0.0.1. Depending on how it's set (I don't know about CNTLM), it may be possible to make an exception for localhost. - Spaulding
Also, the link by loki2302 in the question comments might be useful. It contains a recipe on how to ignore the proxy settings, so (unless you have a transparent proxy or it's forced in some other way) it may help you. - Spaulding
Accepting this as answer - I got it sorted, I used the link by loki2302 to find the solution. I'll edit my question with the solution for future reference. Thanks again Senko. - Frederigo
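
To see whether a proxy is being picked up for localhost, as suggested in the comments above, one option is to ask urllib what it reads from the environment. The sketch below uses Python 3's urllib.request.getproxies and urllib.request.proxy_bypass; the helper name proxy_decision is an assumption of this sketch, and nothing here is specific to CNTLM.

```python
import urllib.request
from urllib.parse import urlsplit


def proxy_decision(url):
    """Return (proxy, bypassed) for url, based on urllib's proxy lookup.

    proxy is the proxy URL configured for the URL's scheme, or None.
    bypassed is True when the host matches a proxy exception such as
    the no_proxy environment variable.
    """
    parts = urlsplit(url)
    # getproxies() reads variables such as http_proxy from the environment
    # (and system settings on some platforms).
    proxy = urllib.request.getproxies().get(parts.scheme)
    # proxy_bypass() checks the host against the configured exceptions.
    bypassed = bool(urllib.request.proxy_bypass(parts.netloc))
    return proxy, bypassed
```

If proxy_decision("http://localhost:8280/") returns a proxy URL with bypassed set to False, requests to localhost are likely being sent through the proxy. Setting no_proxy to something like localhost,127.0.0.1 in the environment is one common way to add the exception.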

Try using urllib:

import urllib
localhost = urllib.urlopen("http://localhost:8280/")
print localhost.read(100)
Pisa answered 10/8, 2011 at 14:27 Comment(0)

I also had this problem with my web server, but the root cause was different: the server was single-threaded and could only handle one request at a time. While it was processing one request, it could not answer the second URL that I opened with urllib2 from inside that request.
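
A minimal sketch of that situation in Python 3, assuming Python 3.7 or later for http.server.ThreadingHTTPServer: the /outer handler requests /inner from the same server, which only works because each request gets its own thread. The class name NestedHandler, the function nested_request_demo and the two paths are made up for this illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def _direct_opener():
    # Ignore proxy settings so the demo always talks to the local server.
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))


class NestedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/outer":
            # Call back into the same server while this request is open.
            port = self.server.server_address[1]
            inner_url = "http://127.0.0.1:%d/inner" % port
            inner = _direct_opener().open(inner_url, timeout=5).read()
            body = b"outer saw: " + inner
        else:
            body = b"inner"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def nested_request_demo():
    """Serve on a free local port and fetch /outer, which fetches /inner."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), NestedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/outer" % server.server_address[1]
        return _direct_opener().open(url, timeout=5).read()
    finally:
        server.shutdown()
        server.server_close()
```

Swapping ThreadingHTTPServer for the plain HTTPServer would be expected to make the inner request wait until its timeout, because the only thread is still busy with /outer.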

Weimaraner answered 19/9, 2011 at 14:59 Comment(0)
