Jsoup.connect is throwing 403 for valid login credential / cookie value

The code below used to work but has suddenly started failing with this error:

HTTP error fetching URL. Status=403, URL=[https://www.valueresearchonline.com/funds/26123/motilal-oswal-flexi-cap-fund-regular-plan]

I got the Cookie value below by logging into https://www.valueresearchonline.com/ with ( [email protected] / 0987654321 ).

I don't understand what has suddenly broken.

Jsoup.connect("https://www.valueresearchonline.com/funds/26123/motilal-oswal-flexi-cap-fund-regular-plan")
                            .timeout(15000)
                            .userAgent("Mozilla")
                            .header("Cookie", "PHPSESSID=6d9v48p1i5lpgm7pi75ag0okvq; currency=INR; magnitude=LC; ad=ee5ceff9a39de83a4dfcd9cc96efd7aa04912966; ad=ee5ceff9a39de83a4dfcd9cc96efd7aa04912966; wec=296642702; nobtlgn=714443298; ac=68156306%7C526294102%7C424761678; ac=68156306%7C526294102%7C424761678; _gcl_au=1.1.443352322.1663577511; _gid=GA1.2.307799405.1663577512; _fbp=fb.1.1663577512204.684634655; _clck=fgstbv|1|f50|0; __gads=ID=5ed6c786ebad7ee2-2263761e9bd600b6:T=1663577512:S=ALNI_Mb0p_Gif2EChNCORy7JOdTU7x4kjA; __gpi=UID=000009ce9ba54907:T=1663577512:RT=1663577512:S=ALNI_MbPL5xoHxgY76gU9mzDzJltULL80Q; __cf_bm=iIVI9aabT4vAdAmvQQzQTDDs9z4MPaMB1gv602Vn2rI-1663577514-0-AQ9ZKhXneLwVKm6CKEzLoY2EKcrIlNB82wgEPDw7taV6k/fnqTzp0L5zrpAl0fnkF1dn7Ac1DyNdfOnsgCTjBZx5Y6ia4Pvj2ceyIBfyXcIYpR8JkYTYGHfqPlrncv7k6Q==; alp=VROL; aa=364476%7C230264168%7C651860858; aa=364476%7C230264168%7C651860858; arl=604590238; arl=604590238; PERMA-ALERT=0; pgv=6; _ga=GA1.1.1410692956.1663577512; _ga_N9R425YFBJ=GS1.1.1663577511.1.1.1663577540.31.0.0; _clsk=1prm412|1663577540567|4|1|l.clarity.ms/collect")
                            .method(Connection.Method.GET)
                            .execute();

Example 2 (also failing):

Connection.Response initial1 = Jsoup.connect("https://www.valueresearchonline.com/login/?target=%2fmyaccount")
        .timeout(60000)
        .userAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0")
        .data("username", "[email protected]")
        .data("password", "0987654321")
        .method(Connection.Method.POST)
        .execute();
System.out.println("Call 1.1 " + initial1.statusCode());
System.out.println(initial1.cookies().values());

Adding the login request from the Chrome debug console:

[screenshot of the login request]

UPDATE

I got this working with cookie values hard-coded from the Chrome console, but I'm not sure how to get the cookie values dynamically.

Please see the steps below.

STEP 1 (hard-coded cookie value)

String apiUrl = "https://www.valueresearchonline.com/api/check-user/";

// Connect to the API URL
Connection.Response response1 = Jsoup.connect(apiUrl)
        .method(Connection.Method.GET)
        .timeout(60000)
        .header("User-Agent", "Mozilla/5.0")
        .header("Accept", "application/json")
        // Hard-coded, taken from Chrome after a valid email is entered.
        // Can we get this cookie value dynamically at runtime?
        .header("Cookie", "HARD_CODED_COOKIE_VALUE_FROM_CHROME")
        .data("q", "[email protected]")
        .data("password", "1")
        .ignoreContentType(true)  // Ignore content type to parse non-HTML response
        .execute();

// Get the cookies from the response
Map<String, String> cookies1 = response1.cookies();

STEP 2 (uses the cookies returned by STEP 1)

Connection.Response response2 = Jsoup.connect("https://www.valueresearchonline.com/login/?target=%2f%3f&utm_source=home&utm_medium=vro&utm_campaign=desktop-profile-menu")
        .timeout(60000)
        .userAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0")
        .header("Cookie", getCookiesString(cookies1))
        .header("Host", "www.valueresearchonline.com")
        .data("username", "[email protected]")
        .data("password", "0987654321")
        .method(Connection.Method.POST)
        .ignoreContentType(true)
        .execute();
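
The getCookiesString helper called in STEP 2 is not shown in the post. A minimal sketch of what it might look like, assuming it only needs to join the cookie map into the "name=value; name2=value2" form of a Cookie request header (the class name CookieHeaders is hypothetical):

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical utility class; only the method name getCookiesString comes from the post.
final class CookieHeaders {
    private CookieHeaders() {}

    // Joins cookie name/value pairs into a single Cookie header value,
    // e.g. {PHPSESSID=abc, currency=INR} becomes "PHPSESSID=abc; currency=INR".
    static String getCookiesString(Map<String, String> cookies) {
        return cookies.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("; "));
    }
}
```

The order of the pairs follows the iteration order of the map that is passed in.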

How can I get the STEP 1 cookie values without hardcoding them?

Hartmann answered 4/10, 2023 at 6:15 Comment(10)
Sorry for the delay. In this kind of situation, try doing the same thing in your favourite browser. Check the headers the browser sends and try to reproduce them with Jsoup. - Jung
@Jung sorry, I was working on another project and am now back on this blocking issue. Please see the attached screenshot and let me know what I am missing in the program. - Hartmann
@vikramvi, the problem you are facing is with authentication. First, find out which cookie is responsible for authentication. When you hit the URL, authenticate with your username and password, then store the cookie together with a timestamp. On the next request, check whether that cookie has expired: if the current time is later than the cookie's expiry time, log in again. I tried this and it works fine with the URL you mentioned. Happy coding! - Dunaj
@JayKanara can you please share the working code you tried? I'm confused by the steps you mentioned and can't get them working with Jsoup. Thanks in advance. - Hartmann
@JayKanara I got this partially working; please check the UPDATE section. I'm still not sure how to get the cookie value for the first step dynamically, without hardcoding it. Please clarify. - Hartmann
@Jung kindly clarify per the comment above. - Hartmann
@Hartmann the steps are: 1. Write code to authenticate; you will get cookies in the response. 2. On each later request to the website, check whether the cookie you got from authentication has expired by comparing its expiry time with the current time: it has expired if the current time is equal to or later than the expiry time. 3. If it has expired, authenticate again through the login page to get a new cookie. I am looking forward to this. Happy coding! :) - Dunaj
@JayKanara I tried to get the cookies set when a user visits valueresearchonline.com, but it's not working as expected. I tried with both Jsoup and HttpURLConnection; please check my second question, about HttpURLConnection: stackoverflow.com/questions/77695386/… - Hartmann
@JayKanara even for an unauthenticated request to valueresearchonline.com, I can see a cookie being passed in the headers. I'm not sure how this value is generated or where it comes from. - Hartmann
@Hartmann Simply reproduce with Jsoup what Chrome does. Happy coding! - Jung
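
The expiry check described in the comments above could be sketched roughly as follows. This is only an illustration: the SessionCookie class and the lifetime value are assumptions, not something the site or Jsoup provides. The Map<String, String> returned by response.cookies() in the question carries only names and values, so the lifetime would have to come from elsewhere, for example the Max-Age or Expires attribute of the Set-Cookie response header.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical holder for the authentication cookie and the moment it stops being valid.
final class SessionCookie {
    final String value;
    final Instant expiresAt;

    // obtainedAt: when the login response arrived; lifetime: assumed validity period.
    SessionCookie(String value, Instant obtainedAt, Duration lifetime) {
        this.value = value;
        this.expiresAt = obtainedAt.plus(lifetime);
    }

    // Expired when the current time is equal to or later than the expiry time,
    // which is the rule given in the comments.
    boolean isExpired(Instant now) {
        return !now.isBefore(expiresAt);
    }
}
```

Before each request the caller would test isExpired(Instant.now()) and repeat the login POST when it returns true.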

The Jsoup Connection interface has methods that let you declare cookies directly rather than adding a Cookie header by hand:

    /**
     * Set a cookie to be sent in the request.
     * @param name name of cookie
     * @param value value of cookie
     * @return this Connection, for chaining
     */
    Connection cookie(String name, String value);

    /**
     * Adds each of the supplied cookies to the request.
     * @param cookies map of cookie name {@literal ->} value pairs
     * @return this Connection, for chaining
     */
    Connection cookies(Map<String, String> cookies);

That way exactly one Cookie header is generated and sent with valid data.
I hope this helps.
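
The hand-written Cookie header in the question repeats several names (ad, ac, aa and arl each appear twice). A minimal sketch of turning such a raw header string into a map with one entry per name, assuming a simple split on ";" is enough for these values (the CookieParser class and its method are hypothetical, not part of Jsoup):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: converts "a=1; b=2; a=1" into a map with one entry per name.
final class CookieParser {
    private CookieParser() {}

    static Map<String, String> parseCookieHeader(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : header.split(";")) {
            String trimmed = pair.trim();
            // Split on the first '=' only, since values may themselves end in '='.
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed fragments
            }
            // A repeated name overwrites the earlier entry, so duplicates collapse.
            cookies.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return cookies;
    }
}
```

The resulting map can be handed to the cookies(Map<String, String>) method quoted above. The same method also accepts the map returned by response.cookies() from an earlier request, which avoids copying values out of the browser.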

Valene answered 27/12, 2023 at 7:30 Comment(0)
