Http Redirection code 3XX in python requests

4

40

I am trying to capture the HTTP 3XX status code (e.g. 302) for a redirecting URL, but I cannot get it because requests reports a 200 status code.

Here is the code:

import requests
r = requests.get('http://goo.gl/NZek5')
print(r.status_code)

I assumed this would return either 301 or 302 because it redirects to another page. I tried a few redirecting URLs (e.g. http://fb.com), but each one returns 200. What should I do to capture the redirection code properly?

Joaquinajoash answered 3/3, 2014 at 14:59 Comment(0)
79

requests handles redirects for you; see the Redirection and History section of the documentation.

Set allow_redirects=False if you don't want requests to handle redirections, or you can inspect the redirection responses contained in the r.history list.

Demo:

>>> import requests
>>> url = 'https://httpbin.org/redirect-to'
>>> params = {"status_code": 301, "url": "https://mcmap.net/q/395856/-http-redirection-code-3xx-in-python-requests"}
>>> r = requests.get(url, params=params)
>>> r.history
[<Response [301]>, <Response [302]>]
>>> r.history[0].status_code
301
>>> r.history[0].headers['Location']
'https://mcmap.net/q/395856/-http-redirection-code-3xx-in-python-requests'
>>> r.url
'https://mcmap.net/q/395856/-http-redirection-code-3xx-in-python-requests'
>>> r = requests.get(url, params=params, allow_redirects=False)
>>> r.status_code
301
>>> r.url
'https://httpbin.org/redirect-to?status_code=301&url=https%3A%2F%2Fstackoverflow.com%2Fq%2F22150023'

So if allow_redirects is True, the redirects are followed and the response returned is the final page at the end of the chain. If allow_redirects is False, the first response is returned, even if it is a redirect.
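
If you want the whole chain in one place, something along these lines might help. It is only a sketch: redirect_chain is a name I made up, and it relies on nothing beyond the status_code, url and history attributes shown in the demo above.

```python
def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response.

    `response` is expected to look like a requests.Response: it needs
    `status_code`, `url` and a `history` list of earlier responses.
    """
    # Redirect responses that were followed, oldest first.
    hops = [(r.status_code, r.url) for r in response.history]
    # The response that was finally returned to the caller.
    hops.append((response.status_code, response.url))
    return hops
```

With a live response this could be called as redirect_chain(requests.get(url)); with allow_redirects=False the history is empty, so the list holds just the single redirect response.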

Keeley answered 3/3, 2014 at 15:1 Comment(8)
When we run the request with allow_redirects=False, that means redirects are not allowed and the request won't go on to the redirect target. So why does it show 301 instead of 200? - Joaquinajoash
@user2789099: Sorry, I am not following you. 301 is the redirect status code. requests always fetches the first URL first; if that returns a 301 redirect and allow_redirects is True, the response is added to the history list and requests makes another GET request to retrieve the new location, and so on. If allow_redirects is False, the first 301 is returned directly. - Keeley
@user2789099: If allow_redirects is True, what is returned is the final response. So the 200 is because requests followed the redirect and fetched the next page too. - Keeley
I stumbled upon this while having the same issue using C#'s HttpWebRequest. All I had to do was set request.AllowAutoRedirect = false; and now I get the 301 I would expect. - Marginate
@Marginate that's... interesting, but C# and Python are quite separate beasts. - Keeley
Yeah, I know, but it's not too surprising that they implemented similar logic in this case: "make a web request; if it gets a 3XX status code, follow the redirect unless told not to". Just thought I would leave it here in case anyone had the same issue as me :-) - Marginate
@Ben: sounds like you were looking for How to get a redirection response then. - Keeley
@Martijn You are quite right! I stopped looking as your answer got me where I needed to go, but hopefully the link will help anyone else who ends up here! - Marginate
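
For anyone who wants to see the loop described in the comments spelled out, here is a rough sketch. It is not how requests is actually implemented (the real library also handles method changes, cookies and more); follow_redirects, fetch and REDIRECT_CODES are names I made up, and fetch is assumed to be any callable that returns an object with status_code and headers.

```python
from urllib.parse import urljoin

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(fetch, url, max_redirects=30):
    """Follow redirects by hand and return (final_response, history).

    `fetch` takes a URL and returns a response-like object, for example
    lambda u: requests.get(u, allow_redirects=False).
    """
    history = []
    response = fetch(url)
    while response.status_code in REDIRECT_CODES and "Location" in response.headers:
        if len(history) >= max_redirects:
            raise RuntimeError("too many redirects")
        # Keep the redirect response, then request the new location.
        history.append(response)
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, response.headers["Location"])
        response = fetch(url)
    return response, history
```

The first response is always fetched; it only ends up in the history if it is a redirect that gets followed, which is why allow_redirects=False hands back the 301 itself.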
11

requests.get allows for an optional keyword argument allow_redirects which defaults to True. Setting allow_redirects to False will disable automatically following redirects, as follows:

In [1]: import requests
In [2]: r = requests.get('http://goo.gl/NZek5', allow_redirects=False)
In [3]: print(r.status_code)
301
Ivers answered 3/3, 2014 at 15:5 Comment(1)
Note that allow_redirects defaults to False for HEAD requests. - Peril
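
Building on that comment: if you do want a HEAD request to follow redirects, you would have to pass allow_redirects=True yourself. A small sketch, where head_status and the client parameter are my own names (client just lets you pass a requests.Session or a stand-in; it defaults to the requests module):

```python
def head_status(url, client=None):
    """Send a HEAD request that follows redirects.

    Returns (final_status_code, [status codes of the redirects followed]).
    """
    if client is None:
        # Imported lazily so a stand-in client can be used without requests.
        import requests
        client = requests
    # HEAD does not follow redirects by default, so ask for it explicitly.
    response = client.head(url, allow_redirects=True)
    return response.status_code, [r.status_code for r in response.history]
```

Leaving allow_redirects out of the head() call would give you the first response only, the same as GET with allow_redirects=False.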
2

This solution will identify the redirect and display the history of redirects, and it will handle common errors. This will ask you for your URL in the console.

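import requests


def init():
    # Prompt for a URL on the console and report its status.
    url = input("Type the URL: ")
    get_status_code_from_request_url(url)


def get_status_code_from_request_url(url, do_restart=True):
    try:
        # Without a timeout, requests can wait indefinitely and the
        # Timeout handler below is unlikely to ever run.
        r = requests.get(url, timeout=10)
        if not r.history:
            print("Status Code: " + str(r.status_code))
        else:
            # Each entry in r.history is a redirect response with its own
            # status code, so report the real code instead of assuming 301.
            print("Redirected. Below are the redirects")
            for i, resp in enumerate(r.history):
                print("  " + str(i) + " - " + str(resp.status_code) + " - URL " + resp.url)
            print("Final Status Code: " + str(r.status_code) + " - URL " + r.url)
        if do_restart:
            init()
    except requests.exceptions.MissingSchema:
        print("You forgot the protocol. http:// or https://")
    except requests.exceptions.Timeout:
        # Checked before ConnectionError because a connect timeout is both.
        print("Sorry, but I couldn't connect. I timed out.")
    except requests.exceptions.ConnectionError:
        print("Sorry, but I couldn't connect. There was a connection problem.")
    except requests.exceptions.TooManyRedirects:
        print("There were too many redirects.  I can't count that high.")


init()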
Wellfed answered 19/3, 2017 at 21:22 Comment(0)
0

Does anyone have a PHP version of this code?

    r = requests.get(url)
    if len(r.history) < 1:
        print("Status Code: " + str(r.status_code))
    else:
        print("Status Code: 301. Below are the redirects")
        h = r.history
        i = 0
        for resp in h:
            print("  " + str(i) + " - URL " + resp.url + " \n")
            i += 1
    if do_restart:
Dashing answered 22/3, 2022 at 8:6 Comment(1)
If you have a new question, please ask it by clicking the Ask Question button. Include a link to this question if it helps provide context. - From Review - Shanon
