urllib2 basic authentication oddities
I'm slamming my head against the wall with this one. I've tried every example and read everything I can find online about HTTP basic authentication with urllib2, but I cannot figure out what is causing my specific error.

Adding to the frustration, the code works for one page but not for another. Logging into www.mysite.com/adm goes absolutely smoothly; it authenticates with no problem. Yet if I change the address to 'http://mysite.com/adm/items.php?n=201105&c=200' I receive this error:

<h4 align="center" class="teal">Add/Edit Items</h4>
<p><strong>Client:</strong> </p><p><strong>Event:</strong> </p><p class="error">Not enough information to complete this task</p>

<p class="error">This is a fatal error so I am exiting now.</p>

Searching Google has led to zero information on this error.

The adm page is a frameset page; I'm not sure if that's relevant at all.

Here is the current code:

import urllib2, urllib
import sys

import re
import base64
from urlparse import urlparse

theurl = 'http://xxxxxmedia.com/adm/items.php?n=201105&c=200'
username = 'XXXX'
password = 'XXXX'

passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, theurl,username,password)

authhandler = urllib2.HTTPBasicAuthHandler(passman)

opener = urllib2.build_opener(authhandler)

urllib2.install_opener(opener)

pagehandle = urllib2.urlopen(theurl)

url = 'http://xxxxxxxmedia.com/adm/items.php?n=201105&c=200'
values = {'AvAudioCD': 1,
          'AvAudioCDDiscount': 00, 'AvAudioCDPrice': 50,
          'ProductName': 'python test', 'frmSubmit': 'Submit' }

#opener2 = urllib2.build_opener(urllib2.HTTPCookieProcessor())
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)

This is just one of the many versions I've tried. I've followed every example from Urllib2 Missing Manual but still receive the same error.

Can anyone point to what I'm doing wrong?

Glider answered 3/2, 2011 at 5:46 Comment(5)
It looks like your code is working, but not the site you're connecting to. Does it work in a browser? Since the page contains frames, have you looked at its source?Baelbeer
Yeah, it works in the browser. I've checked out its source with Firebug. The admin site goes to an HTML page with this style of code: <frameset cols="25%,75%"> <frame src="frame_a.htm" /> <frame src="frame_b.htm" /> </frameset>Glider
@jd Yeah, it works in the browser. I've checked out its source with Firebug. I'm not entirely sure what to look for. I did notice that I can authenticate with Python on every page except those that have parameters in the address, i.e. ..dia.com/adm/items.php?n=201105&c=200'.Glider
The http response must have the header "WWW-Authenticate". See this answer.Springs
I also found the passman stuff didn't work. Adding the base64 user/pass header as per this answer https://mcmap.net/q/169717/-python-urllib2-basic-authentication did work for me. Accessing a Jenkins URL like http://<jenkins:port>/job/<jobname>/lastCompletedBuild/testReport/api/python Palaearctic

About a year ago, I went through the same process and documented how I solved the problem, covering both the direct, simple way to authenticate and the standard one. Choose whichever you deem fit.

HTTP Authentication in Python

There is a detailed explanation in the urllib2 "Missing Manual".

Polydactyl answered 3/2, 2011 at 7:39 Comment(1)
So, after much wiresharking it turns out I'm actually still not authenticating. I've tried all the examples in your linked post. I've downloaded the exact script from the Voidspace website, yet while watching Python with Wireshark, I still get this error: <h1>Authorization Required</h1> <p>This server could not verify that you are authorized to access the document requested. Either you supplied the wrong credentials (e.g., bad password), or your browser doesn't understand how to supply the credentials required.</p> Any ideas?Glider

I ran into a similar problem today. I was using basic authentication on the website I am developing and I couldn't authenticate any users.

Here are a few things you can use to debug your problem:

  1. I used slumber.in and httplib2 for testing purposes. I ran both from an IPython shell to see what responses I was receiving.
  2. Slumber actually uses httplib2 beneath the covers, so they behaved similarly. I used tcpdump and later tcpflow (which shows information in a much more readable form) to see what was really being sent and received. If you want a GUI, see Wireshark or alternatives.
  3. I tested my website with curl, and when I used curl with my username/password it worked correctly and showed the requested page. But slumber and httplib2 were still not working.
  4. I tested my website and browserspy.dk to see what the differences were. The important thing is that browserspy's website works for basic authentication and my website did not, so I could compare the two. I read in a lot of places that you need to send HTTP 401 Unauthorized so that the browser or the tool you are using can send the username/password you provided. But what I didn't know was that you also need the WWW-Authenticate field in the header. So this was the missing piece.
  5. What made this whole situation odd was that while testing I would see httplib2 send basic authentication headers with most of the requests (tcpflow would show that). It turns out that the library does not send the username/password on the first request. If "Status 401" AND "WWW-Authenticate" are in the response, then the credentials are sent on the second request and on all requests to this domain from then on.
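The behaviour described in points 4 and 5 can be reproduced locally. Below is a minimal sketch, assuming Python 3 (where urllib2 became urllib.request) and a throwaway test server on 127.0.0.1; the names make_handler and fetch and the user/secret credentials are made up for illustration. It records which Authorization header the server saw on each request, once with a proper challenge and once with a bare 401.

```python
import base64
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Header value the (hypothetical) test server accepts for user/secret.
EXPECTED = "Basic " + base64.b64encode(b"user:secret").decode("ascii")


def make_handler(send_challenge, log):
    """Build a request handler that logs each Authorization header it sees."""
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            auth = self.headers.get("Authorization")
            log.append(auth)
            if auth == EXPECTED:
                body = b"ok"
                self.send_response(200)
            else:
                body = b"denied"
                self.send_response(401)
                if send_challenge:
                    # Without this header the client never retries.
                    self.send_header("WWW-Authenticate", 'Basic realm="demo"')
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the demo quiet

    return Handler


def fetch(send_challenge):
    """Request the test page with a password manager; return (status, log)."""
    log = []
    server = HTTPServer(("127.0.0.1", 0), make_handler(send_challenge, log))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]

    passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passman.add_password(None, url, "user", "secret")
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore any proxy settings
        urllib.request.HTTPBasicAuthHandler(passman),
    )
    try:
        with opener.open(url) as response:
            status = response.getcode()
    except urllib.error.HTTPError as err:
        status = err.code
    finally:
        server.shutdown()
        server.server_close()
    return status, log


if __name__ == "__main__":
    print(fetch(True))   # 401 with WWW-Authenticate
    print(fetch(False))  # 401 without WWW-Authenticate
```

With the challenge present, the server should see one request without credentials followed by one with them. With the bare 401, it should see a single unauthenticated request and the client gives up.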

So to sum up, your application may be correct, but you might not be returning the standard headers and status code that the client needs before it sends credentials. Use your debug tools to find which is which. Also, there's a debug mode for httplib2: just set httplib2.debuglevel=1 so that debug information is printed to standard output. This is much more helpful than using tcpdump because it works at a higher level.
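For urllib2 itself there is a similar switch on the HTTP handlers. A minimal sketch, assuming Python 3's urllib.request (in Python 2 the same classes live in urllib2); build_debug_opener is a made-up helper name:

```python
import urllib.request


def build_debug_opener():
    """Return an opener whose HTTP(S) handlers print the exchange to stdout."""
    return urllib.request.build_opener(
        urllib.request.HTTPHandler(debuglevel=1),
        urllib.request.HTTPSHandler(debuglevel=1),
    )
```

Calling build_debug_opener().open(url) should then print the request that was sent and the response headers that came back, which is usually enough to see whether an Authorization header went out and whether the server answered with WWW-Authenticate.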

Hope this helps someone.

Beffrey answered 13/10, 2011 at 13:43 Comment(1)
If this is true I think you may have answered the questions that I and so many have been trying to understand. This is exactly what the missing urllib2 handbook is saying here: Included in the response headers will be a 'WWW-authenticate' header. but I wasn't getting it. Thanks for spelling it all out in plain English. In my case I was trying the Github v2 api, which sends back 401, but it never sends back www-authenticate so Python urllib2 never sends the login.Springs

From the HTML you posted, I still think that you authenticate successfully but encounter an error afterwards, in the processing of your POST request. I tried your URL and, when authentication fails, I get a standard 401 page.

In any case, I suggest you run your code again and perform the same operation manually in Firefox, this time with Wireshark capturing the exchange. You can grab the full text of the HTTP request and response in both cases and compare the differences. In most cases that will lead you to the source of the error you get.
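Once both requests are captured as text, the comparison can be done mechanically. Here is a rough sketch in Python 3; parse_headers and diff_headers are hypothetical helpers, and the two captures are invented sample data, not taken from the site in the question.

```python
def parse_headers(raw_message):
    """Parse the header block of a raw HTTP message into a dict.

    Header names are lower-cased; the request line and the body are ignored.
    """
    headers = {}
    lines = raw_message.strip().splitlines()
    for line in lines[1:]:  # skip the request line
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def diff_headers(browser_raw, script_raw):
    """Return {header: (browser_value, script_value)} for headers that differ."""
    browser = parse_headers(browser_raw)
    script = parse_headers(script_raw)
    report = {}
    for name in sorted(set(browser) | set(script)):
        if browser.get(name) != script.get(name):
            report[name] = (browser.get(name), script.get(name))
    return report


# Invented captures for illustration only.
BROWSER = """POST /adm/items.php?n=201105&c=200 HTTP/1.1
Host: example.invalid
Authorization: Basic dXNlcjpzZWNyZXQ=
Referer: http://example.invalid/adm/
Content-Type: application/x-www-form-urlencoded

ProductName=python+test"""

SCRIPT = """POST /adm/items.php?n=201105&c=200 HTTP/1.1
Host: example.invalid
Authorization: Basic dXNlcjpzZWNyZXQ=
Content-Type: application/x-www-form-urlencoded

ProductName=python+test"""

if __name__ == "__main__":
    for name, (browser_value, script_value) in diff_headers(BROWSER, SCRIPT).items():
        print(name, "| browser:", browser_value, "| script:", script_value)
```

Headers that only the browser sends (cookies, Referer and so on) are the usual suspects, and the same idea applies to comparing the POST bodies.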

Baelbeer answered 3/2, 2011 at 8:28 Comment(2)
It seems you're right. It was authenticating OK. It's just failing for some other reason. It spits out the error right after it closes the <head> tag. I installed Wireshark and watched for differences between browser requests and Python requests. Honestly, I'm not entirely sure what I'm looking for. But when connecting with Python it did highlight certain frames in red, which I guess means bad TCP. There are about 6 of these in a row: [TCP ZeroWindow] lbc-watchdog > http [ACK] Seq=181. Could this be what's causing the error? And how would I use this info to correct my problem?Glider
In Wireshark, locate one tcp packet that belongs to the right connection (from dst/src addresses and port), then right-click Follow TCP stream: there are your client's HTTP request and the server's response.Baelbeer

I also found the passman stuff doesn't work (sometimes?). Adding the base64 user/pass header as per this answer https://mcmap.net/q/169717/-python-urllib2-basic-authentication did work for me. I am accessing a Jenkins URL like this: http://<jenkins:port>/job/<jobname>/lastCompletedBuild/testReport/api/python

This works for me:

import urllib2
import base64

baseurl="http://jenkinsurl"
username=...
password=...

url="%s/job/jobname/lastCompletedBuild/testReport/api/python" % baseurl

base64string = base64.encodestring('%s:%s' % (username, password)).replace('\n', '')
request = urllib2.Request(url)
request.add_header("Authorization", "Basic %s" % base64string) 
result = urllib2.urlopen(request)
data = result.read()
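The same idea can be wrapped in a small helper. This is only a sketch: basic_auth_header and authed_request are made-up names, and the import fallback is there so it should run on both Python 2 (urllib2) and Python 3 (urllib.request). It uses base64.b64encode, which does not append a newline, so the .replace('\n', '') step is not needed.

```python
import base64

try:
    import urllib2 as urlrequest          # Python 2, as used in this thread
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent


def basic_auth_header(username, password):
    """Return the value for a preemptive 'Authorization: Basic ...' header."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def authed_request(url, username, password, data=None):
    """Build a Request that carries credentials on the very first attempt."""
    request = urlrequest.Request(url, data)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

Pass the result to urlopen as usual. Because the header is attached up front, it does not matter whether the server ever sends a 401 with WWW-Authenticate.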

This doesn't work for me, error 403 each time:

import urllib2

baseurl="http://jenkinsurl"
username=...
password=...

# url was not defined in the snippet as posted; same URL as in the working example
url="%s/job/jobname/lastCompletedBuild/testReport/api/python" % baseurl

##urllib2.HTTPError: HTTP Error 403: Forbidden
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, username,password)
urllib2.install_opener(urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passman)))
req = urllib2.Request(url)
result = urllib2.urlopen(req)
data = result.read()

Palaearctic answered 19/1, 2017 at 15:30 Comment(0)
