urllib2 HTTP Error 400: Bad Request

5

27

I have a piece of code like this:

host = 'http://www.bing.com/search?q=%s&go=&qs=n&sk=&sc=8-13&first=%s' % (query, page)
req = urllib2.Request(host)
req.add_header('User-Agent', User_Agent)
response = urllib2.urlopen(req)

When I enter a query of more than one word, such as "the dog", I get the following error:

    response = urllib2.urlopen(req)
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 513, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 438, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 521, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request

Can anyone point out what I'm doing wrong? Thanks in advance.

Thapsus answered 12/1, 2012 at 18:27 Comment(1)
I've also received "urllib2.HTTPError: HTTP Error 406: Not Acceptable" when attempting to request URLs with whitespace. - Gardening
66

The query "the dog" returns a 400 error because the space in it is not escaped before the string is placed in the URL.

Quote the query before building the URL:

import urllib, urllib2

quoted_query = urllib.quote(query)
host = 'http://www.bing.com/search?q=%s&go=&qs=n&sk=&sc=8-13&first=%s' % (quoted_query, page)
req = urllib2.Request(host)
req.add_header('User-Agent', User_Agent)
response = urllib2.urlopen(req)

With the query quoted, the request succeeds.

That said, I strongly suggest using requests instead of urllib/urllib2/httplib. It is much easier to use, and it handles URL escaping like this for you.

Here is the same request using the Python requests library:

import requests

# requests escapes the values in params when it builds the final URL
results = requests.get("http://www.bing.com/search",
                       params={'q': query, 'first': page},
                       headers={'User-Agent': User_Agent})
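
If installing requests is not an option, the standard library can do the same parameter encoding. The following is a minimal sketch for Python 3, where the function lives in urllib.parse (in Python 2 it is urllib.urlencode). The helper name build_search_url is hypothetical, and only the q and first parameters from the question are included:

```python
from urllib.parse import urlencode


def build_search_url(query, page):
    """Return a Bing search URL with an encoded query string (hypothetical helper)."""
    # urlencode escapes each value (a space becomes '+') and joins the pairs with '&'.
    params = urlencode({"q": query, "first": page})
    return "http://www.bing.com/search?" + params


print(build_search_url("the dog", 1))
# http://www.bing.com/search?q=the+dog&first=1
```

Building the query string from a dict keeps the escaping in one place, so a multi-word query no longer produces a malformed URL.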
Chadd answered 12/1, 2012 at 18:38 Comment(0)
7

You need to use urllib.quote() on your 'query' variable:

query = urllib.quote(query)
host = 'http://www.bing.com/search?q=%s&go=&qs=n&sk=&sc=8-13&first=%s' % (query, page)

This performs the necessary URL escaping, converting the space so that "big dog" becomes "big%20dog".
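
To see the escaping on its own, here is a small sketch using the Python 3 location of the same function, urllib.parse.quote (urllib.quote in Python 2). The sample strings are only illustrations:

```python
from urllib.parse import quote, quote_plus

query = "big dog"

escaped = quote(query)            # space is percent-encoded
escaped_plus = quote_plus(query)  # space becomes '+', the form used in query strings
escaped_amp = quote("cats & dogs")  # reserved characters such as '&' are escaped too

print(escaped)       # big%20dog
print(escaped_plus)  # big+dog
print(escaped_amp)   # cats%20%26%20dogs
```

Either form is acceptable inside a query string; the important part is that no raw space reaches the request line.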

Suffer answered 12/1, 2012 at 18:36 Comment(0)
4

You have to use urllib.quote.

Hu answered 12/1, 2012 at 18:35 Comment(0)
2

Here is an example of how to send a JSON POST request with the urllib.request module in Python 3.6 and above.

import urllib.request
import json
from pprint import pprint

url = "some_url"

values = {
    "first_name": "Vlad",
    "last_name": "Bezden",
    "urls": [
        "https://twitter.com/VladBezden",
        "https://github.com/vlad-bezden",
    ],
}


headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
}

data = json.dumps(values).encode("utf-8")
pprint(data)

try:
    req = urllib.request.Request(url, data, headers)
    with urllib.request.urlopen(req) as f:
        res = f.read()
    pprint(res.decode())
except Exception as e:
    pprint(e)
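
When the server answers with a 400, it can help to catch urllib.error.HTTPError specifically and look at the status code and response body rather than printing a generic exception. The sketch below assumes Python 3; describe_http_error is a hypothetical helper, and the error object is built by hand so the example runs without network access:

```python
import io
import urllib.error


def describe_http_error(err):
    """Summarise an HTTPError: status code, reason, and the start of the body."""
    body = err.read().decode("utf-8", errors="replace")
    return "HTTP %d %s: %s" % (err.code, err.reason, body[:200])


# Hand-built error standing in for a real server response (illustration only).
fake_error = urllib.error.HTTPError(
    "http://example.invalid/search?q=the dog",
    400,
    "Bad Request",
    {},
    io.BytesIO(b"malformed URL"),
)

print(describe_http_error(fake_error))
```

In real code the helper would be called from an `except urllib.error.HTTPError as e:` clause placed before the broader `except Exception`.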
Knife answered 8/8, 2019 at 14:21 Comment(0)
0

I ran into the same problem, and the cause turned out to be that the request method was set incorrectly. When you pass urlencoded data to urllib2.urlopen(), the method should be POST; when you pass no data, it should be GET. You can set the method explicitly as follows.

For a POST request:

opener = urllib2.build_opener()  # assumed: a default opener
request_object = urllib2.Request(url)
method = ("POST", "GET")
request_object.get_method = lambda: method[0]  # force POST
url_handle = opener.open(request_object, data)  # data is the urlencoded request body

For a GET request:

opener = urllib2.build_opener()  # assumed: a default opener
request_object = urllib2.Request(url)
method = ("POST", "GET")
request_object.get_method = lambda: method[1]  # force GET
url_handle = opener.open(request_object)  # no data is passed for GET

This sets the request method to the one the server expects.
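
For comparison, here is a sketch of how the method is chosen in Python 3's urllib.request, where Request accepts a method argument and otherwise infers the method from whether data is present. The URL is a placeholder and is never contacted:

```python
import urllib.request
from urllib.parse import urlencode

url = "http://example.invalid/search"  # placeholder URL for illustration
body = urlencode({"q": "the dog"}).encode("ascii")  # request bodies must be bytes

get_req = urllib.request.Request(url)                           # no data
post_req = urllib.request.Request(url, data=body)               # data present
put_req = urllib.request.Request(url, data=body, method="PUT")  # explicit override

print(get_req.get_method())   # GET
print(post_req.get_method())  # POST
print(put_req.get_method())   # PUT
```

Because the method follows from the presence of data by default, overriding get_method is only needed when the default choice does not match what the server expects.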

Adenoid answered 22/8, 2015 at 5:54 Comment(0)
