Curl and Python Requests (get) reporting different http status code

I have written a Python script to validate URL connectivity from a host. A URL that curl on Linux reports as successful (HTTP 200) is reported as a 403 by the Python (3.6) requests module.

I'm hoping someone can help me understand why the reported HTTP status codes differ.

curl from the Linux command line:

$ curl -ILs https://www.h2o.ai|egrep ^HTTP
HTTP/1.1 200 OK

Python requests module:

>>> import requests
>>> url = 'https://www.h2o.ai'
>>> r = requests.get(url, verify=True, timeout=3)
>>> r.status_code
403
>>> requests.packages.urllib3.disable_warnings()
>>> r = requests.get(url, verify=False, timeout=3)
>>> r.status_code
403
Manolo answered 10/7, 2018 at 14:55 Comment(1)
What headers are you sending? What headers are you receiving? The reason for the 403 is probably explained in more detail in the body. - Struggle

It seems the site serves the 403 response to the default python-requests/<version> User-Agent:

In [98]: requests.head('https://www.h2o.ai', headers={'User-Agent': 'Foo bar'})
Out[98]: <Response [200]>

In [99]: requests.head('https://www.h2o.ai')
Out[99]: <Response [403]>

You can contact the site owner if you want, or just send a different user agent via the User-Agent header (as I did above).
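If the script checks many URLs, setting the header once on a Session may be more convenient than passing headers= on every call. The following is a minimal sketch, not a definitive implementation: the helper name make_session and the user-agent string "connectivity-check/1.0" are made up for illustration, and whether a given site accepts that string is something you would have to test. It only prepares the request (no network traffic) to show which User-Agent would be sent.

```python
import requests


def make_session(user_agent: str) -> requests.Session:
    """Return a Session that sends `user_agent` on every request.

    `make_session` is a hypothetical helper name, not part of requests.
    """
    session = requests.Session()
    # Session-level headers are merged into every request made via the session.
    session.headers.update({"User-Agent": user_agent})
    return session


# "connectivity-check/1.0" is an arbitrary, illustrative user-agent string.
session = make_session("connectivity-check/1.0")

# Prepare (but do not send) a request to inspect the headers it would carry.
prepared = session.prepare_request(requests.Request("GET", "https://www.h2o.ai"))
print(prepared.headers["User-Agent"])
```

With a session like this, session.get(url, timeout=3) would then carry the custom header on each call.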


How I debugged this:

I ran curl with the -v (--verbose) option to check the headers being sent, and then checked the same for requests using response.request.headers (assuming the response is saved as response).

I did not find any significant difference apart from the User-Agent header; hence, changing the User-Agent header worked as I expected.
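The comparison described above can be sketched roughly as follows. This is an illustration under assumptions, not the exact steps I ran: the helper names headers_sent_by_requests and diff_headers are hypothetical, and curl_headers is a hand-written sample of the kind of request headers curl -v prints, with a placeholder version string rather than real output. The Host header is left out on both sides, since requests does not add it at the prepare stage. No network traffic is involved.

```python
import requests


def headers_sent_by_requests(url: str) -> dict:
    """Return the headers requests would send for a plain GET, without sending it."""
    session = requests.Session()
    prepared = session.prepare_request(requests.Request("GET", url))
    return dict(prepared.headers)


def diff_headers(ours: dict, theirs: dict) -> dict:
    """Map each lowercased header name to (ours, theirs) where the values differ."""
    a = {name.lower(): value for name, value in ours.items()}
    b = {name.lower(): value for name, value in theirs.items()}
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Illustrative sample only: copy the real values from your own `curl -v` output.
curl_headers = {"User-Agent": "curl/x.y.z", "Accept": "*/*"}

differences = diff_headers(headers_sent_by_requests("https://www.h2o.ai"), curl_headers)
for name, (ours, theirs) in differences.items():
    print(f"{name}: requests={ours!r} curl={theirs!r}")
```

For a response you have already received, response.request.headers holds the headers that were actually sent, so it can be passed to diff_headers in the same way.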

Tafia answered 10/7, 2018 at 15:0 Comment(2)
This is helpful, thank you. Would you mind sharing a little background on how you reached the conclusion that "User-Agent is being served the 403 response from the site"? E.g. if I were troubleshooting, how would I know that this dummy header was the thing that h2o.ai was expecting? - Manolo
This response and (especially) the update are a thing of beauty. Thank you! - Manolo
