HTTPS POST request Python
O

4

9

I want to make a POST request to an HTTPS site that should respond with a .csv file. I have this Python code:

url = 'https://www.site.com/servlet/datadownload'
values = {
  'val1' : '123',
  'val2' : 'abc',
  'val3' : '1b3',
}

data = urllib.urlencode(values)
req = urllib2.Request(url,data)
response = urllib2.urlopen(req)
myfile = open('file.csv', 'wb')
shutil.copyfileobj(response.fp, myfile)
myfile.close()

But I'm getting the error:

BadStatusLine: ''    (in httplib.py)

I've tried the same POST request with the Advanced REST client Chrome extension (screenshot), and that works fine.

What could be the problem and how could I solve it? (Is it because of the HTTPS?)


EDIT, refactored code:

try:
    # conn = httplib.HTTPSConnection(host="www.site.com", port=443)
    # => gives a BadStatusLine: '' error
    conn = httplib.HTTPConnection("www.site.com")
    params  = urllib.urlencode({'val1':'123','val2':'abc','val3':'1b3'})
    conn.request("POST", "/nps/servlet/exportdatadownload", params)
    content = conn.getresponse()
    print content.reason, content.status
    print content.read()
    conn.close()
except:
    import sys
    print sys.exc_info()[:2]

Output:

Found 302

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>302 Found</TITLE>
</HEAD><BODY>
<H1>Found</H1>
The document has moved <A HREF="https://www.site.com/nps/servlet/exportdatadownload">here</A>.<P>
<HR>
<ADDRESS>Oracle-Application-Server-10g/10.1.3.5.0 Oracle-HTTP-Server Server at mp-www1.mrco.be Port 7778</ADDRESS>
</BODY></HTML>

What am I doing wrong?

Oaks answered 17/1, 2013 at 17:55 Comment(4)
What version of Python are you using? I would check this answer to see if httplib is working OK with HTTPS. I can't try out your code right now, but another piece of advice would be to use a friendlier library for your requests, called... requests. - Trip
What do you get if you use https_handler = urllib2.HTTPSHandler(1); opener = urllib2.build_opener(https_handler); response = opener.open(req) in place of response = urllib2.urlopen(req)? You should still get the error, but that turns on debugging for the HTTPS response, so the response will be printed and you can use it to track down what isn't working. If for some odd reason another handler is being used, try the same thing with urllib2.HTTPHandler(1) or whatever handler is relevant. - Knighthead
I noticed that you are using urllib and urllib2 at the same time. Is that intentional? - Subaquatic
You should post the site. - Awake
B
3

The BadStatusLine: '' (in httplib.py) suggests that something else is going on here. This can happen when the server sends no reply at all and just closes the connection.

As you mentioned that you're using an SSL connection, this might be particularly interesting to debug (with curl -v URL if you want). If you find that curl -2 URL (which forces the use of SSLv2) seems to work while curl -3 URL (SSLv3) doesn't, you may want to take a look at issue #13636 and possibly #11220 on the Python bug tracker. Depending on your Python version and a possibly misconfigured web server, this might be causing the problem: the SSL defaults changed in v2.7.3.
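To see the "no reply at all" case in isolation, here is a minimal sketch that uses only the Python 3 standard library (http.client is the Python 3 name for httplib). The local server, its port, and the helper name serve_empty_reply are made up for the illustration and have nothing to do with the real site: the server simply ends its side of the connection without ever sending a status line.

```python
import http.client
import socket
import threading


def serve_empty_reply(listener):
    """Accept one connection and end our side of it without any reply."""
    conn, _ = listener.accept()
    with conn:
        conn.shutdown(socket.SHUT_WR)  # signal EOF: no status line is sent
        while conn.recv(4096):         # drain the request until the client closes
            pass


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
thread = threading.Thread(target=serve_empty_reply, args=(listener,))
thread.start()

client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
error = None
try:
    client.request("POST", "/servlet/datadownload", "val1=123")
    client.getresponse()  # tries to read a status line, finds none
except http.client.BadStatusLine as exc:
    error = exc
finally:
    client.close()
    thread.join()
    listener.close()

print(type(error).__name__, "is a BadStatusLine:",
      isinstance(error, http.client.BadStatusLine))
```

On Python 3.5 and later the exception raised here should be RemoteDisconnected, which is a subclass of BadStatusLine. If the real server behaves like this stand-in, the problem lies in the connection or TLS setup rather than in the form data.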

Brucine answered 10/3, 2013 at 23:44 Comment(0)
E
14

Is there a reason you've got to use urllib? Requests is simpler, better in almost every way, and abstracts away some of the cruft that makes urllib hard to work with.

As an example, I'd rework your example as something like:

import requests
resp = requests.post(url, data=values, allow_redirects=True)

At this point, the response from the server is available in resp.text, and you can do what you'd like with it. If requests wasn't able to POST properly (because you need a custom SSL certificate, for example), it should give you a nice error message that tells you why.

Even if you can't do this in your production environment, do this in a local shell to see what error messages you get from requests, and use that to debug urllib.

Estimable answered 4/3, 2013 at 23:51 Comment(2)
The same error: BadStatusLine: ConnectionError: HTTPSConnectionPool(host='www.site.com', port=443): Max retries exceeded with url: /nps/servlet/exportdatadownload/ (Caused by <class 'httplib.BadStatusLine'>: '') When I browse to https://www.site.com/nps/servlet/exportdatadownload?val1=123&val2=abc&val3=1b3, the Excel file is downloaded automatically, but still no success with a Python script... - Oaks
BadStatusLine means that the server sent back an HTTP status that Python doesn't understand (and it understands all the "normal" ones). From a command line, can you do a curl -I https://site.com (with whatever the real URL is there) and paste the results? If you don't have curl, you can also use hurl.it (in which case I'm just interested in the first paragraph of the response). - Estimable
L
1
import httplib, urllib

# _certfile: path to your client certificate file (not defined in this snippet)
conn = httplib.HTTPSConnection(host='www.site.com', port=443, cert_file=_certfile)
params = urllib.urlencode({'cmd': 'token', 'device_id_st': 'AAAA-BBBB-CCCC',
                           'token_id_st': 'DDDD-EEEE_FFFF', 'product_id': 'Unit Test',
                           'product_ver': '1.6.3'})
# The request path needs a leading slash
conn.request("POST", "/servlet/datadownload", params)
response = conn.getresponse()
content = response.read()
# print response.status, response.reason
conn.close()
Latta answered 17/1, 2013 at 20:55 Comment(5)
I've tried your code, but adapted the first line to just httplib.HTTPSConnection('www.site.com'). When I print content.status I get Found 302. And printing the content itself, I get HTML code with The document has moved <A HREF="https://www.site.com/servlet/exportdatadownload">here</A>.<P> But how do I get the found file? - Oaks
I've edited my question with more information and with your code. - Oaks
Try the URL https://google.com; it feels like you have some sort of server/destination issue. - Latta
httplib.HTTPSConnection(host="www.google.com", port=443) gives a Not Found 404 output and httplib.HTTPConnection("www.google.com") gives Service Unavailable 503. - Oaks
That's good. There is no /servlet/datadownload URL on Google's website, hence the error. Now I am confident your server is the issue. Try to read something simple, like a static HTML page (that you can access via a browser). - Latta
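On the "how do I get the found file?" question in the comments: a 302 carries the next URL in its Location header, and httplib does not follow it for you. The sketch below shows the idea with the Python 3 standard library against a throwaway local server. The /old and /new paths, the CSV content, and the helper post_following_redirects are all invented for the illustration. It also assumes the redirect stays on the same host and that the POST should be repeated at the new location. For the real site, where Location points at an https:// URL, the second hop would need an HTTPSConnection instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlencode, urlsplit

CSV_BODY = b"val1,val2,val3\n123,abc,1b3\n"  # made-up payload


class RedirectingHandler(BaseHTTPRequestHandler):
    """Stand-in server: POST /old answers 302, any other path returns a CSV."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the form body
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/csv")
            self.send_header("Content-Length", str(len(CSV_BODY)))
            self.end_headers()
            self.wfile.write(CSV_BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def post_following_redirects(host, port, path, params, max_hops=5):
    """POST the form, re-posting to the Location path on each redirect."""
    body = urlencode(params)
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    for _ in range(max_hops):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("POST", path, body, headers)
            response = conn.getresponse()
            payload = response.read()
            if response.status not in (301, 302, 303, 307, 308):
                return response.status, payload
            # Simplification: keep only the path, assume same host and scheme.
            path = urlsplit(response.getheader("Location")).path
        finally:
            conn.close()
    raise RuntimeError("too many redirects")


server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    status, payload = post_following_redirects(
        "127.0.0.1", server.server_port, "/old",
        {"val1": "123", "val2": "abc", "val3": "1b3"})
finally:
    server.shutdown()
    server.server_close()

print(status)
print(payload.decode("ascii"))
```

Note that browsers and most HTTP libraries turn a POST into a GET when they follow a 302; re-sending the POST here is a deliberate choice for this sketch, since the servlet in the question expects form data.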
L
0

The server may not like the missing headers, particularly user-agent and content-type. The Chrome image shows what is used for these. Maybe try adding the headers:

import httplib, urllib

host = 'www.site.com'
url = '/servlet/datadownload'

values = {
  'val1' : '123',
  'val2' : 'abc',
  'val3' : '1b3',
}

headers = {
    'User-Agent': 'python',
    'Content-Type': 'application/x-www-form-urlencoded',
}

values = urllib.urlencode(values)

conn = httplib.HTTPSConnection(host)
conn.request("POST", url, values, headers)
response = conn.getresponse()

data = response.read()

print 'Response: ', response.status, response.reason
print 'Data:'
print data

This is untested code, and you may want to experiment by adding other header values to match your screenshot. Hope it helps.
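For anyone on Python 3, where httplib, urllib and urllib2 were reorganized into http.client, urllib.parse and urllib.request, the same request can be sketched as below. This only builds the request and inspects it without sending anything, since www.site.com is the placeholder from the question. Passing req to urllib.request.urlopen would perform the actual POST.

```python
from urllib.parse import urlencode
from urllib.request import Request

url = "https://www.site.com/servlet/datadownload"  # placeholder from the question
values = {"val1": "123", "val2": "abc", "val3": "1b3"}
headers = {
    "User-Agent": "python",
    "Content-Type": "application/x-www-form-urlencoded",
}

# Python 3 requires the POST body to be bytes, not str.
data = urlencode(values).encode("ascii")

# Supplying data makes this a POST request.
req = Request(url, data=data, headers=headers)

print(req.get_method())
print(req.data)
print(req.header_items())
```

Request stores header names in capitalized form (for example User-agent), so that is the spelling to use with req.get_header.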

Leann answered 9/3, 2013 at 22:36 Comment(0)
