webextension: Why does the browser add a trailing slash to the requested URL?

When I make a request to http://www.example.com, why do I see http://www.example.com/ in the webRequest.onBeforeRequest listener?

For example:

chrome.webRequest.onBeforeRequest.addListener(
  details => console.log('Sending request to', details.url),
  { urls: ['<all_urls>'] });
fetch('http://www.example.com');

will print

Sending request to http://www.example.com/

That is consistent with the request URL shown in the network request monitor. For example, if I take it and convert it to a curl command, the request looks like this:

curl 'http://www.example.com/' -H 'Accept: */*' -H 'Connection: keep-alive' \
    -H 'Accept-Encoding: gzip, deflate' -H 'Accept-Language: en-US,en;q=0.9' \
    -H 'User-Agent: ...' --compressed

So the request that actually goes out is for http://www.example.com/, not for http://www.example.com. That change must have been made by the browser, not by the server.

The same behavior also occurs when using XMLHttpRequest instead of fetch. In my example, I used Chrome, but on Firefox it is the same.

Questions:

  • Why does the browser change it automatically? It also happens with other URLs. From my understanding, adding a trailing slash will often work, but in general, it is a breaking change.
  • If I want to filter in the onBeforeRequest listener for requests to a specific URL, how can I reliably match it? For instance, just checking whether the URL strings are identical will fail.
  • Are there more URL rewrite rules in the browser to be aware of?
Accrete answered 17/11, 2017 at 14:30 Comment(2)
Some other answers regarding the trailing slash: #2581911, webmasters.stackexchange.com/questions/35643/… Woodwork
@PredatorIWD Thanks for the link. That seems to confirm what I wrote in my answer. Cusec

I think I found it. The browser is just fixing an invalid URL.

To cite from Wikipedia, a URL looks like this:

scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]

The path must begin with a single slash (/) if an authority part was present, and may also if one was not, but must not begin with a double slash. The path is always defined, though the defined path may be empty (zero length), therefore no trailing slash.

http://example.com has an authority part (in this example, the hostname example.com, which follows the scheme http://), but that leaves the path empty. According to the specification, the path must start with a /, so the browser fixes it by replacing the empty path with /.
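
This can be checked with the standard URL class. I am assuming here that fetch and XMLHttpRequest parse their argument by the same rules; the variable names are only for illustration. A minimal sketch:

```javascript
// An empty path after the authority is serialized as "/".
const bare = new URL('http://example.com');
console.log(bare.pathname); // "/"
console.log(bare.href);     // "http://example.com/"

// Other normalizations applied by the same parser:
// scheme and host are lowercased, and the default port is dropped.
const messy = new URL('HTTP://EXAMPLE.com:80');
console.log(messy.href);    // "http://example.com/"
```

So the trailing slash is not the only rewrite to expect: case and default ports in the scheme and host are normalized as well.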

If you use a URL that already has a path, like http://example.com/abc, the browser does not need to modify it.
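
For the matching question, one option is to run both sides through the same parser before comparing, instead of comparing raw strings. The following is only a sketch: `sameUrl` and `target` are hypothetical names, not part of the webRequest API, and the listener is only registered when `chrome.webRequest` is available.

```javascript
// Hypothetical helper: compare two URLs after normalizing both.
function sameUrl(a, b) {
  return new URL(a).href === new URL(b).href;
}

const target = 'http://www.example.com'; // no trailing slash, as typed

console.log(sameUrl(target, 'http://www.example.com/'));                   // true
console.log(sameUrl('http://example.com/abc', 'http://example.com/abc/')); // false

// Inside an extension, the same check can be used in the listener.
if (typeof chrome !== 'undefined' && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    details => {
      if (sameUrl(details.url, target)) {
        console.log('Matched request to', details.url);
      }
    },
    { urls: ['<all_urls>'] });
}
```

A non-empty path such as /abc is left untouched, so /abc and /abc/ still count as different URLs.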

Accrete answered 17/11, 2017 at 17:12 Comment(3)
AFAIR this comes from the HTTP RFCs. Here is the relevant quote: https://mcmap.net/q/281511/-do-web-browsers-always-send-a-trailing-slash-after-a-domain-name Sniff
@Philip, I don't think this is fixing the URL. Kindly look at this question I posted, where the "/" is not added automatically: #76234065 Caller
@SupRaviKumar There is a difference between a technically invalid URL like http://example.com and a valid URL like http://example.com/abc. Only in the first case will the "/" be added (as described in my answer). Your linked question covers the second case. The browser will not change that URL, since it is well-formed and changing it could change its meaning. Cusec
