Are URIs case-insensitive?
D

6

29

When comparing two URIs to decide if they match or not, a client
SHOULD use a case-sensitive octet-by-octet comparison of the entire
URIs, with these exceptions:

I read the sentence above in the HTTP RFC. I think URLs are case-insensitive, but I don't understand what that means.

Demotic answered 26/3, 2013 at 16:0 Comment(2)
Technically, URIs and URLs are not equivalent terms, so this is not a duplicate of this very similar question: #7997419Clite
Does this answer your question? Should URL be case sensitive?Metatherian
S
20

In reality it depends on the web server.

IIS is not case sensitive.

Apache is.

I suspect that the decision regarding IIS is rooted in the fact that the Windows file system is not case sensitive.

IIS still meets that portion of the spec because SHOULD is a recommendation, not a requirement.

Sox answered 26/3, 2013 at 16:1 Comment(10)
SHOULD is a recommendation, not a requirement. It still meets the spec.Sox
SHOULD is allowed by RFC 2616, but not anymore by RFC 7230.Superstratum
@RemyLebeau: RFC 7230's status is PROPOSED STANDARD, which is a bit less definitive than RFC 2616's status DRAFT STANDARD :-) I bet that a great deal of applications built on IIS will break if it starts acting case sensitive. At some point IIS may come with a case-sensitive mode, but I bet that it will be able to be turned off for quite some time. Note that the scheme and host will remain case insensitive.Sox
This answer is not quite complete -- the result depends on the underlying File System, not the web server. Apache on windows (with NTFS/FAT) is not case sensitive; apache on Linux (extx) is.Bucolic
@Beel, no, it depends on the server's configuration/behaviour. Whether the URLs /myPath and /mypath are ultimately resolved to the same file by the web server due to the case-sensitivity of the underlying filesystem is a separate matter. The URLs may not even correspond to filepaths on the server, e.g. if they are API endpoints; or even if they do and the filesystem is case-insensitive, they may be translated by the web server in some way so that the result is different, e.g. /myPath → C:\webroot\myfile1, and /mypath → C:\webroot\myfile2.Affairs
@JivanPal, I probably should have said that the result also depends on the underlying File System. Routes are configured, as are virtual directories, but any part of a URI that maps to a file system construct will reflect the case sens. of the FS. myfile1.html and MYFILE1.HTML would both return c:\webroot\myfile1.html on windows, but not linux even if both served by apache.Bucolic
@Beel, it still depends on how the server is configured. Let's assume you're using IIS on Windows Server with your content on a FAT-32 filesystem (case-insensitive) mounted as drive D:. Even then, you could have a case-sensitive rewrite rule (<match url="..." ignoreCase="false"/>) that causes, say, the response to requests for URIs that are all uppercase to be different than otherwise, e.g. /mypage.html → D:\mypage.html, and /MyPage.Html → D:\mypage.html; but /MYPAGE.HTML → D:\uppercase\mypage.html.Affairs
@jivanpal ... your point only holds for a file system that is case-insensitive. My point was about case-sensitive file systems. Try your experiment on extx and see what you find.Bucolic
@Beel, no, my point is that it depends what filepath the web server resolves the URI to (if it even resolves the URI to a filepath). The filesystem is responsible for resolving a filepath to a particular file, but the web server is responsible for resolving a URI to a filepath in the first place, and the web server may do this in a case-sensitive fashion, irrespective of the case-sensitivity of the filesystem. You remark that "Apache on Windows is not case-sensitive", but this is false...Affairs
... Apache is case-sensitive everywhere; it is the filesystem (or in the case of NTFS, Windows itself) that isn't. You can set up Windows Server with EXT4 drivers, install Apache, set a directory on an EXT4 filesystem as the webroot, and Apache will happily serve the files correctly.Affairs
S
42

RFC 3986 states:

the scheme and host are case-insensitive and therefore should be normalized to lowercase. For example, the URI <HTTP://www.EXAMPLE.com/> is equivalent to <http://www.example.com/>. The other generic syntax components are assumed to be case-sensitive unless specifically defined otherwise by the scheme

RFC 2616 defines the following comparison rule for the HTTP scheme:

When comparing two URIs to decide if they match or not, a client SHOULD use a case-sensitive octet-by-octet comparison of the entire URIs, with these exceptions:

However, RFC 7230 locks it down further by stating:

The scheme and host are case-insensitive and normally provided in lowercase; all other components are compared in a case-sensitive manner.

Those rules typically apply to client side comparisons. There are no rules specifically geared for server side comparisons. Once a server breaks up a URI into its components, it should treat them according to the same rules, but I don't see that enforced in the RFCs. Some web servers, like Apache, do follow the rules. IIS doesn't, for compatibility with Windows' case-insensitive file system.
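
As a rough illustration of the comparison rule quoted above, here is a minimal Python sketch that lowercases only the scheme and host and compares everything else octet-for-octet. The function names `normalize_case` and `same_uri` are made up for this example, and the sketch deliberately ignores other normalization steps (percent-encoding case, default ports, empty paths), so treat it as an assumption-laden sketch rather than a complete comparison routine.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(uri: str) -> str:
    """Lowercase the scheme and host only; leave all other components untouched."""
    parts = urlsplit(uri)
    # Split off any "user:password@" prefix so that only host[:port] is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{at}{hostport.lower()}"
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


def same_uri(a: str, b: str) -> bool:
    """Compare two URIs: scheme and host case-insensitively, the rest case-sensitively."""
    return normalize_case(a) == normalize_case(b)


print(same_uri("HTTP://www.EXAMPLE.com/", "http://www.example.com/"))
print(same_uri("http://example.com/MyPath", "http://example.com/mypath"))
```

The first call compares the two URIs from the RFC 3986 quote and reports them as equivalent; the second differs only in path case, so it is reported as different.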

Superstratum answered 4/10, 2014 at 19:7 Comment(1)
So I guess the question becomes one of providing a good user experience. Should your site or service be case sensitive? I guess that's not a question for SO but for OpinionOverflow.comRento
A
11

The host portion of the URI is not case sensitive:

http://stackoverflow.com
http://StackOverflow.com

Either of the above will get you to this site.

The rest of the URI after the host portion can be case sensitive. It depends on the server.

Anathematize answered 26/3, 2013 at 16:11 Comment(3)
Sure, both URIs will bring you to the same webserver, but isn't it up to the webserver to determine what to do with the URIs? It could conceivably do different things with the different cases.Shevlo
@Colin'tHart - You are correct that the webserver could act differently based on case of the domain. But in general webservers do not do this. If you know of one that does, I'd be interested to hear about it.Anathematize
"It depends on the server." - strictly speaking, it depends on the application running on the webserver. But the URL-path portion of the URL is defined as case-sensitive in the RFCs, so any system that processes URLs (such as Google etc.) must treat /my-url and /MY-URL as two different URLs. If these two URLs return the same content (ie. the application is treating the URL case-insensitively and not canonicalising the request) then you potentially have a duplicate content issue.Etude
P
4

Whether or not URLs are treated as case-sensitive also depends on the web server. For example, Microsoft IIS servers do not treat URLs as case-sensitive.

The following URLs (hosted on a Microsoft IIS server) are both treated as equivalent:

However, Apache servers do treat URLs as case-sensitive, so the following are classed as two different resources:

Technically, Apache is following the standards correctly here, and Microsoft is going against the specification… Oh well – “old habits die hard,” they say!

Palisade answered 4/10, 2014 at 18:30 Comment(1)
Apache was the definitive implementation of an HTTP server, intended to demonstrate the standard. Microsoft has never been good at following standards because their business model has relied on locking people into their products so it's in their interests to subvert the standards to prevent inter-operation.Carafe
M
4

As mentioned in the answer by Remy Lebeau, the rules are set for the client side. In practice, this means that client software should not make arbitrary case modifications to any part of a URI except the parts specifically stated. So when a browser sees, e.g., a relative URL in a page anchor, it should not convert it to lowercase before checking whether it is already in its cache; neither should it send the lowercased URI to the server. Also, it should not decide that two URIs that differ only in case point to the same resource (thus possibly wrongly skipping a transaction and returning a cached result instead).

This means that the client should not make assumptions about how servers treat URIs. The standard does require servers to treat some parts case-insensitively, e.g. the scheme and host. But otherwise, it's up to the server to decide whether two URIs that differ only in case point to the same resource. The standard does not impose any restrictions on servers in this regard; there's no "server should" or "server should not" beyond what is directly prescribed. If a server decides that its URIs are case-insensitive, that's absolutely fine. If they are case-sensitive, that's fine, too.
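
To make the caching point concrete, here is a small Python sketch of a client-side cache that folds case only in the scheme and host when building its lookup key. The class name `ClientCache` and its methods are hypothetical, invented for this example; real browser caches are far more involved.

```python
from urllib.parse import urlsplit, urlunsplit


class ClientCache:
    """Toy response cache keyed on the URI, preserving the case of path and query."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(uri: str) -> str:
        parts = urlsplit(uri)  # urlsplit already lowercases the scheme
        userinfo, at, hostport = parts.netloc.rpartition("@")
        # Only the host is folded to lowercase; path and query keep their case.
        netloc = userinfo + at + hostport.lower()
        return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

    def put(self, uri: str, body: str) -> None:
        self._store[self._key(uri)] = body

    def get(self, uri: str):
        return self._store.get(self._key(uri))


cache = ClientCache()
cache.put("http://Example.com/Index.html", "cached body")
print(cache.get("HTTP://example.com/Index.html"))  # hit: differs only in scheme/host case
print(cache.get("http://example.com/index.html"))  # miss: the path differs in case
```

The second lookup is a miss on purpose: the client has no way of knowing whether the server treats `/Index.html` and `/index.html` as the same resource, so it has to ask.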

Majuscule answered 31/7, 2016 at 10:51 Comment(0)
B
2

For a file-based URI, case sensitivity depends more on the underlying file system than on the web server. Apache will happily return index.html for INDEX.html on Windows (FAT, NTFS) and Mac (HFS), but not on case-sensitive file systems such as those usually used on Linux (extx and so forth).
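
If you want to check how a particular document root behaves, one option is to probe the file system directly. The Python sketch below is only an illustration: the helper name `directory_is_case_sensitive` is made up, and it needs write access to the directory being probed.

```python
import os
import tempfile


def directory_is_case_sensitive(directory: str) -> bool:
    """Return True if file name lookups in `directory` distinguish upper and lower case."""
    # The mixed-case prefix guarantees that the lowercased name differs from the real one.
    with tempfile.NamedTemporaryFile(prefix="CaseProbe_", dir=directory) as probe:
        folder, name = os.path.split(probe.name)
        # If the lowercased name also resolves, the file system ignores case.
        return not os.path.exists(os.path.join(folder, name.lower()))


print(directory_is_case_sensitive(tempfile.gettempdir()))
```

The result depends on the machine it runs on, so no expected output is shown. It also only describes the file system: as the comments on another answer point out, the web server can still map URIs to file paths in its own way.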

Bucolic answered 17/8, 2015 at 1:5 Comment(0)
