How to redirect from HTTP to HTTPS with Beast C++ library?

I am learning the Boost.Beast library. I make a request and get this response:

HTTP/1.1 301 Moved Permanently
Cache-Control: public
Content-Type: text/html; charset=UTF-8
Location: https://www.example.com/target/xxx/

Then I send a second request with this Location field set on it, but I receive a Bad Request response.

How can I do the redirection? Is there an example?

This is my code:

boost::asio::io_service ios;
tcp::resolver resolver{ios};
tcp::socket socket{ios};
auto const lookup = resolver.resolve( tcp::resolver::query(host, port) );
boost::asio::connect(socket, lookup);

// Set up an HTTP GET request message
http::request<http::string_body> req{http::verb::get, target, 11};
req.set(http::field::host, host);
req.set(http::field::user_agent, BOOST_BEAST_VERSION_STRING);

// Send the HTTP request to the remote host
http::write(socket, req);

// This buffer is used for reading and must be persisted
boost::beast::flat_buffer buffer;

// Declare a container to hold the response
http::response<http::dynamic_body> res;

// Receive the HTTP response
http::read(socket, buffer, res);

if( res.base().result_int() == 301 ) {
   req.set(http::field::location, res.base()["Location"]);
   http::write(socket, req);
   boost::beast::flat_buffer buffer1;
   http::read(socket, buffer1, res);
}
std::cout << req << std::endl;
std::cout << res << std::endl;

Thanks

Sheik answered 25/8, 2017 at 11:22 Comment(4)
Are the extra spaces really there? Looks like it should be https:/ /www.domain.com/target/xxx/ - Roque
No, it's a fake URL. I get location field of response and set it into the request. - Sheik
What request? Can you show us actual code? Maybe we can then see things. - Roque
@Roque I've added the request code. - Sheik

When you follow a redirect, you cannot just set a Location header on the existing request: Location is a response header and has no meaning on a request. You also cannot reuse the same socket, except in the rare case where the redirect target is on the same TCP endpoint.

Because the scheme, host name and path may all have changed, you have to parse the Location value into its scheme, host and path parts. Then you must resolve the new host, connect to it, and make sure to send the new host name in the Host header.
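
As a rough illustration of that parsing step, here is a minimal sketch that splits an absolute http or https URL into the parts a new request needs. UrlParts and split_url are hypothetical names, not part of Beast or network::uri, and the sketch assumes no userinfo, no IPv6 literals and no relative Location values; a real URI library handles those cases.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper type (not from Beast or network::uri).
struct UrlParts {
    std::string scheme;  // "http" or "https"
    std::string host;    // e.g. "www.example.com"
    std::string port;    // explicit port, or the scheme default
    std::string target;  // path plus query; "/" when the URL has none
};

// Splits an absolute http(s) URL. Deliberately minimal: no userinfo,
// no IPv6 literals, no percent-decoding, no relative references.
inline UrlParts split_url(std::string const& url) {
    auto const npos = std::string::npos;
    auto const scheme_end = url.find("://");
    if (scheme_end == npos)
        throw std::invalid_argument("not an absolute URL: " + url);

    UrlParts parts;
    parts.scheme = url.substr(0, scheme_end);

    // The authority (host[:port]) runs up to the first '/', '?' or '#'.
    auto const authority_begin = scheme_end + 3;
    auto const authority_end = url.find_first_of("/?#", authority_begin);
    auto const authority_len =
        authority_end == npos ? npos : authority_end - authority_begin;
    std::string const authority = url.substr(authority_begin, authority_len);

    std::string const default_port = parts.scheme == "https" ? "443" : "80";
    auto const colon = authority.rfind(':');
    parts.host = authority.substr(0, colon);
    parts.port = colon == npos ? default_port : authority.substr(colon + 1);
    if (parts.port.empty()) parts.port = default_port;
    if (parts.host.empty())
        throw std::invalid_argument("URL has no host: " + url);

    // The request target is everything after the authority, minus any fragment.
    if (authority_end == npos) {
        parts.target = "/";
    } else {
        auto const fragment = url.find('#', authority_end);
        auto const target_len = fragment == npos ? npos : fragment - authority_end;
        parts.target = url.substr(authority_end, target_len);
        if (parts.target.empty() || parts.target[0] != '/')
            parts.target = "/" + parts.target;
    }
    return parts;
}
```

The host and port fields map onto the two arguments of tcp::resolver::query, host also goes into the Host header, and target is what the request line should carry.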

Here's a sample that requests the Boost License at the "wrong" URL http://boost.org/users/license.html, which promptly redirects to http://www.boost.org/users/license.html.

NOTE I've used network::uri to do the URI parsing for us: https://github.com/reBass/uri

Demo

#include <iostream>
#include <boost/beast.hpp>
#include <boost/beast/http.hpp>
#include <network/uri.hpp>
#include <boost/asio.hpp>
#include <string>

using boost::asio::ip::tcp;
namespace http = boost::beast::http;

struct Requester {
    void do_request(std::string const& url) {
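        network::uri u{url};

        // Resolve the host from the URL. The scheme ("http" or "https") doubles
        // as the service name, so the port follows the scheme (80 or 443).
        auto const lookup = resolver_.resolve( tcp::resolver::query(u.host().to_string(), u.scheme().to_string()) );

        // Open a fresh connection for every request: a redirect usually
        // points at a different endpoint
        tcp::socket socket{ios};
        boost::asio::connect(socket, lookup);

        // Set up an HTTP GET request message; an empty path must be sent as "/"
        auto target = u.path().to_string();
        if (target.empty()) target = "/";
        http::request<http::string_body> req{http::verb::get, target, 11};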
        req.keep_alive(true);

        req.set(http::field::host, u.host().to_string());
        req.set(http::field::user_agent, BOOST_BEAST_VERSION_STRING);

        std::cout << "Target: " << url << "\n";
        std::cout << req << "\n";

        http::write(socket, req);
        boost::beast::flat_buffer buffer;
        http::response<http::dynamic_body> res;
        http::read(socket, buffer, res);

        switch(res.base().result_int()) {
            case 301: 
                std::cout << "Redirecting.....\n";
                do_request(res.base()["Location"].to_string());
                break;
            case 200:
                std::cout << res << "\n";
                break;
            default:
                std::cout << "Unexpected HTTP status " << res.result_int() << "\n";
                break;
        }
    }
  private:
    boost::asio::io_service ios;
    tcp::resolver resolver_{ios};
};

int main() {
    try {
        Requester requester;
        requester.do_request("http://boost.org/users/license.html"); // redirects to http://www.boost.org/...
    } catch(std::exception const& e) {
        std::cerr << "Exception: " << e.what() << "\n";
    }
}

This prints:

Target: http://boost.org/users/license.html
GET /users/license.html HTTP/1.1
Host: boost.org
User-Agent: Boost.Beast/109


Redirecting.....
Target: http://www.boost.org/users/license.html
GET /users/license.html HTTP/1.1
Host: www.boost.org
User-Agent: Boost.Beast/109


HTTP/1.1 200 OK
Date: Sun, 27 Aug 2017 22:25:20 GMT
Server: Apache/2.2.15 (CentOS)
Accept-Ranges: bytes
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

90fd
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
  <title>Boost Software License</title>
  <meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
  <link rel="icon" href="/favicon.ico" type="image/ico" />
  <link rel="stylesheet" type="text/css" href="../style-v2/section-boost.css" />
  <!--[if IE 7]> <style type="text/css"> body { behavior: url(/style-v2/csshover3.htc); } </style> <![endif]-->
</head><!--
Note: Editing website content is documented at:
http://www.boost.org/development/website_updating.html
-->

<body>
    ENTIRE LICENSE BODY SNIPPED
</body>
</html>

0
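
The demo above follows only 301 and recurses without a limit. As a hedged sketch of how the same idea could be made a little more defensive, the loop below follows the common redirect status codes and gives up after a fixed number of hops. Reply, Fetch, is_redirect and follow_redirects are hypothetical names introduced for illustration; fetch stands in for any function that performs one request (for example, a variant of do_request that returns its result), and the Location value is assumed to be an absolute URL.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical result type: only what a redirect loop needs.
struct Reply {
    unsigned status;       // HTTP status code
    std::string location;  // Location header value, empty if absent
    std::string body;      // response body
};

// Any callable that performs exactly one request for the given URL.
using Fetch = std::function<Reply(std::string const& url)>;

// Status codes that ask the client to repeat the request elsewhere.
inline bool is_redirect(unsigned status) {
    return status == 301 || status == 302 || status == 303 ||
           status == 307 || status == 308;
}

// Follows at most max_hops redirects, then throws.
// Assumes every Location value is an absolute URL.
inline Reply follow_redirects(Fetch const& fetch, std::string url, int max_hops = 5) {
    for (int hop = 0; hop <= max_hops; ++hop) {
        Reply reply = fetch(url);
        if (!is_redirect(reply.status))
            return reply;  // final answer, whatever its status
        if (reply.location.empty())
            throw std::runtime_error("redirect without a Location header");
        url = reply.location;  // next request goes to the new URL
    }
    throw std::runtime_error("too many redirects");
}
```

Keeping the loop separate from the transport means the same logic applies whether a given hop ends up on a plain TCP socket or an SSL stream.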
Roque answered 27/8, 2017 at 22:28 Comment(10)
I've left using an SSL socket for HTTPS as an exercise for the reader, as the conceptual problems with what constitutes a HTTP redirect seem to take precedence. - Roque
In my case, I have to use a SSL socket because the resource is available by HTTPS. I understand that it only works if the Web server accept HTTP and HTTPS requests on the same port. - Sheik
Of course not. Webservers never do. Typically they run at port 80 vs. 443. That's why the resolve step uses the URI scheme. What's left is to start an Ssl socket. The samples should help you get started. - Roque
Sorry, I did not explain myself well. If the endpoint is the same for HTTP and HTTPS, I can make the request for the same connection, right? Because the resource is available using both protocols. - Sheik
I didn't answer clearly enough then. "Q. If the endpoint is the same for HTTP and HTTPS" "A. That's impossible". HTTPS connections start with an SSL handshake, which means HTTP is IMPOSSIBLE at that end point. End of story. You cannot reuse the same connection on any standard webserver. Usually, at least the port is different, like I said. - Roque
Okay, that I understood. I misunderstood an example (advanced server flex) that is included in the library where I thought it was possible. - Sheik
Actually, it is completely possible to have both HTTP and HTTP/S on the same port. A couple of the Beast examples show how this is possible, see the "HTTP, flex" server and the "Advanced, flex" server here: boost.org/doc/libs/develop/libs/beast/doc/html/beast/… This is accomplished using the "SSL detector" operation, which is described in the documentation: boost.org/doc/libs/develop/libs/beast/doc/html/beast/using_io/… - Rodmann
@VinnieFalco wow. Another thing learned. I'll look into it. Regardless, OP doesn't appear to be implementing server side. Also they will still have to account for SSL handshake+stream, am I right? - Roque
@VinnieFalco May I suggest you to create a boost-beast tag? - Callous
@Roque Yes, if the program is redirected from a plain port to an SSL port then it will be required to run a different piece of code which uses asio::ssl::stream and performs the SSL handshake before making the HTTP request. - Rodmann
