Sometimes the spaces get URL encoded to the + sign, and some other times to %20. What is the difference and why should this happen?
+ means a space only in application/x-www-form-urlencoded content, such as the query part of a URL:
http://www.example.com/path/foo+bar/path?query+name=query+value
In this URL, the parameter name is "query name" with a space and the value is "query value" with a space, but the folder name in the path is literally "foo+bar", not "foo bar".
%20 is a valid way to encode a space in either of these contexts. So if you need to URL-encode a string for inclusion in part of a URL, it is always safe to replace spaces with %20 and pluses with %2B. This is what, e.g., encodeURIComponent() does in JavaScript. Unfortunately it's not what urlencode does in PHP (rawurlencode is safer).
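As a rough illustration of the point above, here is a minimal JavaScript sketch that parses the example URL with the standard URL class and encodes a string with encodeURIComponent(). The variable names and the sample string "foo bar+baz" are made up for the example.

```javascript
// Parse the example URL from the answer with the standard URL class.
const url = new URL("http://www.example.com/path/foo+bar/path?query+name=query+value");

// In the path, "+" is a literal plus sign: percent-decoding leaves it alone.
const folder = decodeURIComponent(url.pathname.split("/")[2]);

// In the query, searchParams applies form decoding, so "+" becomes a space.
const [paramName, paramValue] = [...url.searchParams.entries()][0];

// encodeURIComponent() writes spaces as %20 and pluses as %2B,
// which is unambiguous in both the path and the query.
const encoded = encodeURIComponent("foo bar+baz");

console.log(folder);     // foo+bar
console.log(paramName);  // query name
console.log(paramValue); // query value
console.log(encoded);    // foo%20bar%2Bbaz
```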
query+name=query+value parameter from a form with <input name="query name" value="query value">. It will not create query%20name from a form, but it's totally safe to use that instead, e.g. if you're putting a form submission together yourself for an XMLHttpRequest. If you have a URL with a space in it, like <a href="http://www.example.com/foo bar/">, then the browser will encode that to %20 for you to fix your mistake, but that's probably best not relied on. – Outspan
foo bar to foo+bar? – Chatter
encodeURIComponent(s).replace(/%20/g, '+') if you really need + – Outspan
, in the url, because I needed to compare it exactly to database data, your solution rawurlencode() is like magic, where urlencode() failed. – Burberry
So, the answers here are all a bit incomplete. The use of a %20 to encode a space in URLs is explicitly defined in RFC 3986, which defines how a URI is built. There is no mention in this specification of using a + for encoding spaces - if you go solely by this specification, a space must be encoded as %20.
The mention of using + for encoding spaces comes from the various incarnations of the HTML specification - specifically in the section describing content type application/x-www-form-urlencoded. This is used for posting form data.
Now, the HTML 2.0 specification (RFC 1866) explicitly said, in section 8.2.2, that the query part of a GET request's URL string should be encoded as application/x-www-form-urlencoded. This, in theory, suggests that it's legal to use a + in the URL in the query string (after the ?).
But... does it really? Remember, HTML is itself a content specification, and URLs with query strings can be used with content other than HTML. Further, while the later versions of the HTML spec continue to define + as legal in application/x-www-form-urlencoded content, they completely omit the part saying that GET request query strings are defined as that type. There is, in fact, no mention whatsoever about the query string encoding in anything after the HTML 2.0 specification.
Which leaves us with the question - is it valid? Certainly there's a lot of legacy code which supports + in query strings, and a lot of code which generates it as well. So odds are good you won't break anything if you use +. (And, in fact, I did all the research on this recently because I discovered a major site which failed to accept %20 in a GET query as a space. They actually failed to decode any percent-encoded character. So the service you're using may be relevant as well.)
But from a pure reading of the specifications, without the language from the HTML 2.0 specification carried over into later versions, URLs are covered entirely by RFC 3986, which means spaces ought to be converted to %20. And definitely that should be the case if you are requesting anything other than an HTML document.
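The split between the two conventions is visible in JavaScript's built-ins too. A small sketch, assuming a runtime that provides the standard URLSearchParams class (current browsers and Node.js do); the sample data is invented for the example:

```javascript
const data = { q: "a b" };

// Form-style serialization (application/x-www-form-urlencoded): space -> "+".
const formEncoded = new URLSearchParams(data).toString();

// Generic percent-encoding in the RFC 3986 style: space -> "%20".
const percentEncoded = Object.entries(data)
  .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
  .join("&");

// A form-aware decoder accepts both spellings of the space...
const fromPlus = new URLSearchParams(formEncoded).get("q");
const fromPercent = new URLSearchParams(percentEncoded).get("q");

// ...while a plain percent-decoder leaves "+" untouched.
const plainDecoded = decodeURIComponent("a+b");

console.log(formEncoded);              // q=a+b
console.log(percentEncoded);           // q=a%20b
console.log(fromPlus === fromPercent); // true
console.log(plainDecoded);             // a+b
```

So a receiver that only percent-decodes will see a literal plus where a form-aware receiver sees a space, which is why the service on the other end matters.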
%20 (<a href="?q=a b">), but when you send a form, it uses the + sign. You can override that by explicitly using the + sign (<a href="?q=a+b">), or by sending the form using XMLHttpRequest. – Tumbleweed
http://www.example.com/some/path/to/resource?param1=value1
The part before the question mark must use % encoding (so %20 for space); after the question mark you can use either %20 or + for a space. If you need an actual + after the question mark, use %2B.
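To make that rule concrete, here is a minimal sketch of a URL builder in JavaScript. The helper name buildUrl and the sample segments and parameter values are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: builds a URL from path segments and query parameters.
function buildUrl(origin, segments, params) {
  // Before the "?": percent-encode each segment, so a space becomes %20.
  const path = segments.map((s) => encodeURIComponent(s)).join("/");
  // After the "?": %20 is also valid here, and a literal "+" becomes %2B.
  const query = Object.entries(params)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return query ? `${origin}/${path}?${query}` : `${origin}/${path}`;
}

const built = buildUrl("http://www.example.com", ["some path", "to", "resource"], {
  param1: "1 + 1",
});
console.log(built);
// http://www.example.com/some%20path/to/resource?param1=1%20%2B%201

// Round trip: the literal plus survives and the spaces come back.
const parsed = new URL(built);
console.log(parsed.searchParams.get("param1")); // 1 + 1
console.log(decodeURIComponent(parsed.pathname)); // /some path/to/resource
```

Using %20 on both sides of the question mark keeps the builder to a single encoding function; swapping in + for the query part would be a readability choice, not a requirement.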
decodeURIComponent doesn't decode it. – Vivianaviviane
+ is a reserved character it will be preserved by the browser. – Vivianaviviane
+ by default ({ foo: 'bar bar'}.to_query => foo=bar+bar) – Poulin
part of the old application/x-www-form-urlencoded media type that doesn't apply to URLs. But is it known why even in the latest Java (8 as of now) in the class java.net.URLEncoder: The space character " " is converted into a plus sign "+"? And are there other cases where "high rep" software like the Java language enforces anti-standards instead of the actual standard (not browsers, as they support + but also the actual standard)? – Breakup
+. I suppose that's how Google thinks too. I just had a look and they don't URL encode : either: google.se/… I suppose that, too, is to make the URL a bit more readable. : is already in URLs (http:...) so probably fairly safe; most other stuff they seem to URL encode though. – Passible
For compatibility reasons, it's better to always encode spaces as "%20", not as "+".
It was RFC 1866 (the HTML 2.0 specification) that specified that space characters should be encoded as "+" in "application/x-www-form-urlencoded" content-type key-value pairs (see paragraph 8.2.1, subparagraph 1). This way of encoding form data is also given in later HTML specifications; look for the relevant paragraphs about application/x-www-form-urlencoded.
Here is an example of a URL string where RFC 1866 allows encoding spaces as pluses: "http://example.com/over/there?name=foo+bar". So, only after "?", spaces can be replaced by pluses, according to RFC 1866. In other cases, spaces should be encoded to %20. But since it's hard to determine the context, it's the best practice to never encode spaces as "+".
I would recommend percent-encoding all characters except the "unreserved" ones defined in RFC 3986, section 2.3.
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
The only situation when you may want to encode spaces as "+" (one byte) rather than "%20" (three bytes) is when you know for sure how to interpret the context, and when the size of the query string is of the essence.
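One way to follow the "encode everything except unreserved" recommendation in JavaScript is sketched below. The function name encodeRfc3986 is invented for this example; it builds on encodeURIComponent(), which on its own leaves ! ' ( ) * unescaped.

```javascript
// Percent-encode everything except the RFC 3986 "unreserved" set:
// ALPHA / DIGIT / "-" / "." / "_" / "~"
function encodeRfc3986(str) {
  // encodeURIComponent() leaves ! ' ( ) * unescaped, so escape those too.
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeRfc3986("foo bar"));        // foo%20bar
console.log(encodeRfc3986("a+b"));            // a%2Bb
console.log(encodeRfc3986("doesn't (yet)!")); // doesn%27t%20%28yet%29%21
console.log(encodeRfc3986("A-z_0.9~"));       // A-z_0.9~
```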
What's the difference? See the other answers.
When should we use + instead of %20? Use + if, for some reason, you want to make the URL query string (?.....) or hash fragment (#....) more readable. Example: You can actually read this:
https://www.google.se/#q=google+doesn%27t+encode+:+and+uses+%2B+instead+of+spaces
(%2B = +)
But the following is a lot harder to read (at least to me):
I would think + is unlikely to break anything, since Google uses + (see the 1st link above) and they've probably thought about this. I'm going to use + myself, just because it's readable and Google thinks it's OK.
google.se/?your?query?here and it would be invalid, but what do they care as long as their servers interpret it correctly and return your search results? It would be more interesting to see how they encode outgoing requests (like in fetching RSS), but even then, it's not a valid argument. Google is huge: no way that the URL experts look at every place URL encoding is used. – Unclasp