When are you supposed to use escape instead of encodeURI / encodeURIComponent?
C

16

1512

When encoding a query string to be sent to a web server, when do you use escape() and when do you use encodeURI() or encodeURIComponent()?

Use escape:

escape("% +&=");

OR

use encodeURI() / encodeURIComponent()

encodeURI("http://www.google.com?var1=value1&var2=value2");

encodeURIComponent("var1=value1&var2=value2");
Commensurate answered 16/9, 2008 at 19:24 Comment(6)
It's worth pointing out that encodeURIComponent("var1=value1&var2=value2") is not the typical use case. That example will encode the = and &, which is probably not what was intended! encodeURIComponent is typically applied separately to just the value in each key value pair (the part after each =).Placeman
do you need to do anything to the key? What if it has an = in it? (is that even possible?)Fordo
@Fordo I'm still new to web programming in general, but what I've used in my limited experience is to encode the key and the value separately, ensuring the '=' stays: var params = encodeURIComponent(key) + '=' + encodeURIComponent(value); - Maybe someone else knows a better way.Thoughtful
@nedshares I was playing with that, but as far as I can tell the key doesn't seem to be encoded... at least not in the same way. Maybe it's against spec to have an = in the key?Fordo
Also worth pointing out that recent JavaScript implementations provide the higher-level interfaces URL and URLSearchParams for manipulating URLs and their query strings.Mavismavra
encodeURI is for encoding an already built multi-parameter string (deals with whitespace that is invalid in a url). encodeURIComponent encodes to a single parameter value. These do more. But this is the first thought I always ask myself.Llewellyn
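
The approach suggested in the comments above, encoding each key and each value separately, can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the buildQuery helper and the sample parameters are hypothetical, and URLSearchParams is the higher-level interface mentioned above.

```javascript
// Hypothetical helper: encode every key and value on its own,
// then join the pairs with the literal "=" and "&" separators.
function buildQuery(params) {
  return Object.keys(params)
    .map(function (key) {
      return encodeURIComponent(key) + "=" + encodeURIComponent(params[key]);
    })
    .join("&");
}

// Sample data (an assumption for illustration); the second key contains "=".
var params = { "var1": "value 1", "a=b": "x&y" };

var manual = buildQuery(params);
var viaApi = new URLSearchParams(params).toString();

console.log(manual); // var1=value%201&a%3Db=x%26y
console.log(viaApi); // var1=value+1&a%3Db=x%26y
```

Note that URLSearchParams writes a space as "+" (form encoding), while encodeURIComponent writes it as "%20"; both keep the "=" inside the key from being read as a separator.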
T
2015

escape()

Don't use it! escape() is defined in section B.2.1.1 escape and the introduction text of Annex B says:

... All of the language features and behaviours specified in this annex have one or more undesirable characteristics and in the absence of legacy usage would be removed from this specification. ...
... Programmers should not use or assume the existence of these features and behaviours when writing new ECMAScript code....

Behaviour:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/escape

Special characters are encoded with the exception of: @*_+-./

The hexadecimal form for characters, whose code unit value is 0xFF or less, is a two-digit escape sequence: %xx.

For characters with a greater code unit, the four-digit format %uxxxx is used. This is not allowed within a query string (as defined in RFC3986):

query       = *( pchar / "/" / "?" )
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded   = "%" HEXDIG HEXDIG
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="

A percent sign is only allowed if it is directly followed by two hex digits; a percent sign followed by u is not allowed.
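
To see the rule above in action, the following sketch compares the output of escape() and encodeURIComponent() for a character above 0xFF. The regular expression is an assumption made for illustration: it only checks that every percent sign is followed by two hex digits and does not validate the rest of the query grammar.

```javascript
// True when every "%" in the string is followed by exactly two hex digits
// (the pct-encoded rule). Other characters are not checked.
var onlyValidPercents = /^(?:[^%]|%[0-9A-Fa-f]{2})*$/;

var ch = "\u0444"; // Cyrillic small letter ef, code unit 0x0444

var viaEscape = escape(ch);                // "%u0444"
var viaComponent = encodeURIComponent(ch); // "%D1%84"

console.log(viaEscape, onlyValidPercents.test(viaEscape));       // %u0444 false
console.log(viaComponent, onlyValidPercents.test(viaComponent)); // %D1%84 true
```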

encodeURI()

Use encodeURI when you want a working URL. Make this call:

encodeURI("http://www.example.org/a file with spaces.html")

to get:

http://www.example.org/a%20file%20with%20spaces.html

Don't call encodeURIComponent since it would destroy the URL and return

http%3A%2F%2Fwww.example.org%2Fa%20file%20with%20spaces.html

Note that encodeURI, like encodeURIComponent, does not escape the ' character.

encodeURIComponent()

Use encodeURIComponent when you want to encode the value of a URL parameter.

var p1 = encodeURIComponent("http://example.org/?a=12&b=55")

Then you may create the URL you need:

var url = "http://example.net/?param1=" + p1 + "&param2=99";

And you will get this complete URL:

http://example.net/?param1=http%3A%2F%2Fexample.org%2F%3Fa%3D12%26b%3D55&param2=99

Note that encodeURIComponent does not escape the ' character. A common bug is to use it to create html attributes such as href='MyUrl', which could suffer an injection bug. If you are constructing html from strings, either use " instead of ' for attribute quotes, or add an extra layer of encoding (' can be encoded as %27).
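
The extra layer of encoding mentioned above could look roughly like this. The helper name encodeParamForAttribute and the sample input are hypothetical; this is a sketch of the idea, not a complete defence against HTML injection.

```javascript
// Hypothetical helper: percent-encode a parameter value and additionally
// encode the single quote, which encodeURIComponent leaves untouched.
function encodeParamForAttribute(value) {
  return encodeURIComponent(value).replace(/'/g, "%27");
}

// Sample input (an assumption) that tries to break out of href='...'.
var userInput = "x' onclick='alert(1)";

var unsafeHref = "/search?q=" + encodeURIComponent(userInput);
var saferHref = "/search?q=" + encodeParamForAttribute(userInput);

// With single-quoted attributes, the raw ' in unsafeHref ends the value early.
console.log("<a href='" + unsafeHref + "'>link</a>");
console.log("<a href='" + saferHref + "'>link</a>");
```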

For more information on this type of encoding you can check: http://en.wikipedia.org/wiki/Percent-encoding

Touchhole answered 16/9, 2008 at 19:24 Comment(17)
@Francois, depending on the receiving server, it may not properly decode how escape encodes upper ASCII or non-ASCII characters such as: âầẩẫấậêềểễếệ For example, Python's FieldStorage class won't decode the above string properly if encoded by escape.Overbearing
@Francois escape() encodes the lower 128 ASCII chars except letters, digits, and *@-_+./ while unescape() is the inverse of escape(). As far as I can tell, they're legacy functions designed for encoding URLs and are only still implemented for backwards compatibility. Generally, they should not be used unless interacting with an app/web service/etc designed for them.Chapple
Unless of course you're trying to pass a URL as a URI component in which case call encodeURIComponent.Schmuck
escape() doesn't support unicode well and it isn't available on some versions of browsers, e.g. FFSchmuck
Sometimes it does not escape certain characters like the ' symbol, so I use encodeURIComponent(encodeURIComponent($string)). Is that the only way of doing it?Housewifely
Why doesn't it handle the single quote?Desiree
@Desiree It does not encode single-quote, because single-quote is a completely valid character to occur within a URI (RFC-3986). The problem occurs when you embed a URI within HTML, where single-quote is not a valid character. It follows then, that URIs should also be "HTML-encoded" (which would replace ' with &#39;) before being placed into an HTML document.Gothar
@Gothar - When you say "embed a URI within HTML," I presume you mean assigning a URI to, say, an href or src attribute. For example, <a href="URI"> or <img src="URI">.Eparchy
@Eparchy that is one of the cases that I'm referring to. In fact, ' is technically not a valid character anywhere in the normal content of an html document (though in practice, most browsers will handle it fine as long as it's not inside an attribute value). Anyway, in order to be technically correct, all content (including URIs) should be html-encoded before being placed anywhere within an html attribute, or text content of an html element. (Obviously, certain special elements like <script>, and CDATA sections are exempt from this requirement).Gothar
@Gothar - My reading of the W3C HTML5 Recommendation indicates that ' can appear in an HTML document. For instance, consider the example given in Single-quoted attribute value syntax: <input type='checkbox'>. And from Normal elements: "Normal elements can have text, character references, other elements, and comments, but the text must not contain the character "<" (U+003C) or an ambiguous ampersand." Also see this answer.Eparchy
@Eparchy ' is (and has always been) a valid delimiter around attribute values in html. Of course " is the other valid attribute value delimiter. Because of this fact, earlier versions of HTML (or possibly it was just earlier browsers) treated ', ", and a few other chars as reserved characters, and they were not handled well when encountered in normal content. (By "normal content", I mean within attribute values, and within the text content of elements.) It is true that HTML5 (and modern browsers) do allow any char in CDATA blocks as long as the use is contextually unambiguous.Gothar
@Gothar - Regarding the text content of elements and attributes: please see this answer to my question Uses for the '&quot;' entity in HTML. That question addresses the double quote; I believe similar reasoning would apply to the single quote. My conclusion: <p>"Double-quoted expression"</p> and <p>'Single-quoted expression'</p> are both valid. I invite you to comment or add your own answer to my referenced question. I think that would be better than continuing our dialog here...Eparchy
@Eparchy - not arguing with you at all. I acknowledge that I was incorrect to say, "' is not technically a valid character anywhere in HTML content". That said, I have personally and directly encountered rendering problems with use of unencoded quotes in text content of html documents, but that was long ago, when browsers were very bad at basically everything, and not even the mainstream browsers were strictly spec-compliant. I'm glad we've left those days behind. When you encounter this in old documents, know that it was based on actual real-world issues of the time.Gothar
@Eparchy - as requested, I've posted an answer at the other thread, elaborating on this in the context of that question.Gothar
Do you mean with injection bug, that we could enclose HTML tag attribute values within single quotes?Unfrock
I see escape in both the ES3 and the ES5 standards. Why do you say it is deprecated?Fetid
Great answer, thanks! I found other nice articles on the differences between these functions here at Javascripter.net, and I tried to explore and compile some of the differences into an Observable notebookBilection
F
489

The difference between encodeURI() and encodeURIComponent() is exactly 11 characters, which are encoded by encodeURIComponent but not by encodeURI:

Table with the eleven differences between encodeURI and encodeURIComponent

I generated this table easily with console.table in Google Chrome with this code:

var arr = [];
for(var i=0;i<256;i++) {
  var char=String.fromCharCode(i);
  if(encodeURI(char)!==encodeURIComponent(char)) {
    arr.push({
      character:char,
      encodeURI:encodeURI(char),
      encodeURIComponent:encodeURIComponent(char)
    });
  }
}
console.table(arr);
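
Since the table itself is an image, here is a small variation of the loop above that collects the differing characters into a string instead, so the result can be read without console.table. The variable names are illustrative.

```javascript
var differing = "";
for (var i = 0; i < 256; i++) {
  var ch = String.fromCharCode(i);
  // Keep the character only if the two functions disagree on it.
  if (encodeURI(ch) !== encodeURIComponent(ch)) {
    differing += ch;
  }
}

console.log(differing);        // #$&+,/:;=?@
console.log(differing.length); // 11
```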
Flock answered 24/5, 2014 at 6:54 Comment(5)
Isn't this browser dependent?Photoflood
@bladnman encodeURI and encodeURIComponent should work this way in all major browsers. You can test the above code in Chrome and Firefox as both support console.table. In other browsers (including Firefox and Chrome) you can use the following code: var arr=[]; for(var i=0;i<256;i++){var char=String.fromCharCode(i); if(encodeURI(char)!==encodeURIComponent(char)) console.log("character: "+char + " | encodeURI: " +encodeURI(char) + " |encodeURIComponent: " + encodeURIComponent(char) ) }Flock
@Photoflood should be identical in various browsers unless the original spec is too ambiguous... also see #4408099Breda
I NEED TO UPVOTE THIS SEVERAL TIMES! Unfortunately can only upvote once.Codie
hey i can't see any resultsMellar
L
49

I found this article enlightening: Javascript Madness: Query String Parsing

I found it when I was trying to understand why decodeURIComponent was not decoding '+' correctly. Here is an extract:

String:                         "A + B"
Expected Query String Encoding: "A+%2B+B"
escape("A + B") =               "A%20+%20B"     Wrong!
encodeURI("A + B") =            "A%20+%20B"     Wrong!
encodeURIComponent("A + B") =   "A%20%2B%20B"   Acceptable, but strange

Encoded String:                 "A+%2B+B"
Expected Decoding:              "A + B"
unescape("A+%2B+B") =           "A+++B"       Wrong!
decodeURI("A+%2B+B") =          "A+++B"       Wrong!
decodeURIComponent("A+%2B+B") = "A+++B"       Wrong!
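
One way to get the expected form encoding out of the built-in functions is to post-process the space and plus characters, as one of the comments below also suggests. The helper names formEncode and formDecode are hypothetical; this is a sketch for application/x-www-form-urlencoded values only.

```javascript
// Hypothetical helpers for application/x-www-form-urlencoded values,
// where a space is written as "+" and a literal plus as "%2B".
function formEncode(value) {
  return encodeURIComponent(value).replace(/%20/g, "+");
}

function formDecode(encoded) {
  // Turn "+" back into an encoded space before decoding.
  return decodeURIComponent(encoded.replace(/\+/g, "%20"));
}

console.log(formEncode("A + B"));   // A+%2B+B
console.log(formDecode("A+%2B+B")); // A + B
```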
Lory answered 9/10, 2012 at 9:26 Comment(4)
The article you link to contains a lot of nonsense. It seems to me, the author himself did not understand what the functions are properly used for...Mccraw
@Mccraw It all looks reasonable to me. In particular, I agree with him that encodeURI seems like it's only useful in a fairly obscure edge case and really need not exist. I have some differences of opinion with him, but I don't see anything outright false or idiotic in there. What exactly do you think is nonsense?Septilateral
The enctype attribute of the FORM element specifies the content type used to encode the form data set for submission to the server. application/x-www-form-urlencoded This is the default content type. Forms submitted with this content type must be encoded as follows: [...] Space characters are replaced by ``+', and [...] Non-alphanumeric characters are replaced by `%HH', [...] Ref: HTML4 SpecDroll
encodeURIComponent('A + B').replace(/\%20/g, '+') + '\n' + decodeURIComponent("A+%2B+B".replace(/\+/g, '%20'));Chic
P
44

encodeURIComponent doesn't encode -_.!~*'(), which caused a problem when posting an XML string to PHP.

For example:
<xml><text x="100" y="150" value="It's a value with single quote" /> </xml>

General escape with encodeURI
%3Cxml%3E%3Ctext%20x=%22100%22%20y=%22150%22%20value=%22It's%20a%20value%20with%20single%20quote%22%20/%3E%20%3C/xml%3E

As you can see, the single quote is not encoded. To resolve the issue in my project, I created two functions. For encoding the URL:

function encodeData(s:String):String{
    return encodeURIComponent(s).replace(/\-/g, "%2D").replace(/\_/g, "%5F").replace(/\./g, "%2E").replace(/\!/g, "%21").replace(/\~/g, "%7E").replace(/\*/g, "%2A").replace(/\'/g, "%27").replace(/\(/g, "%28").replace(/\)/g, "%29");
}

For Decoding URL:

function decodeData(s:String):String{
    try{
        return decodeURIComponent(s.replace(/\%2D/g, "-").replace(/\%5F/g, "_").replace(/\%2E/g, ".").replace(/\%21/g, "!").replace(/\%7E/g, "~").replace(/\%2A/g, "*").replace(/\%27/g, "'").replace(/\%28/g, "(").replace(/\%29/g, ")"));
    }catch (e:Error) {
    }
    return "";
}
Parshall answered 8/5, 2013 at 7:51 Comment(2)
It also doesn't do the # (pound/hash/number) sign, which is %23.Cahier
@Cahier What do you mean? encodeURIComponent does encode # to %23 (maybe it did not in 2014?)Orientalism
E
38

encodeURI() - the escape() function is for javascript escaping, not HTTP.

Ethnomusicology answered 16/9, 2008 at 19:26 Comment(3)
If i have a url like this: var url = "http://kuler-api.adobe.com/rss/get.cfm?startIndex=0&itemsPerPage=20&timeSpan=0&listType=rating"... And I want to access it via the Google Ajax API, like this: var gurl = "http://ajax.googleapis.com/ajax/services/feed/load?v=1.0&callback=?&q=" + url;... then I have to use escape(url). encodeURI(url) doesn't work with parameters like that it seems.Endsley
You should use encodeURIComponent(url)Sommerville
All the 3 functions have their issues. It's better to create your own function which does the job.Alienable
K
21

Small comparison table Java vs. JavaScript vs. PHP.

1. Java URLEncoder.encode (using UTF8 charset)
2. JavaScript encodeURIComponent
3. JavaScript escape
4. PHP urlencode
5. PHP rawurlencode

char    JAVA ---JS---  --PHP---
[ ]     +    %20  %20  +    %20
[!]     %21  !    %21  %21  %21
[*]     *    *    *    %2A  %2A
[']     %27  '    %27  %27  %27 
[(]     %28  (    %28  %28  %28
[)]     %29  )    %29  %29  %29
[;]     %3B  %3B  %3B  %3B  %3B
[:]     %3A  %3A  %3A  %3A  %3A
[@]     %40  %40  @    %40  %40
[&]     %26  %26  %26  %26  %26
[=]     %3D  %3D  %3D  %3D  %3D
[+]     %2B  %2B  +    %2B  %2B
[$]     %24  %24  %24  %24  %24
[,]     %2C  %2C  %2C  %2C  %2C
[/]     %2F  %2F  /    %2F  %2F
[?]     %3F  %3F  %3F  %3F  %3F
[#]     %23  %23  %23  %23  %23
[[]     %5B  %5B  %5B  %5B  %5B
[]]     %5D  %5D  %5D  %5D  %5D
----------------------------------------
[~]     %7E  ~    %7E  %7E  ~
[-]     -    -    -    -    -
[_]     _    _    _    _    _
[%]     %25  %25  %25  %25  %25
[\]     %5C  %5C  %5C  %5C  %5C
----------------------------------------
char  -JAVA-  --JavaScript--  -----PHP------
[ä]   %C3%A4  %C3%A4  %E4     %C3%A4  %C3%A4
[ф]   %D1%84  %D1%84  %u0444  %D1%84  %D1%84
Kreit answered 8/10, 2015 at 15:20 Comment(0)
A
14

I recommend not using any of those methods as is. Write your own function that does the right thing.

MDN has given a good example on url encoding shown below.

var fileName = 'my file(2).txt';
var header = "Content-Disposition: attachment; filename*=UTF-8''" + encodeRFC5987ValueChars(fileName);

console.log(header); 
// logs "Content-Disposition: attachment; filename*=UTF-8''my%20file%282%29.txt"


function encodeRFC5987ValueChars (str) {
    return encodeURIComponent(str).
        // Note that although RFC3986 reserves "!", RFC5987 does not,
        // so we do not need to escape it
        replace(/['()]/g, escape). // i.e., %27 %28 %29
        replace(/\*/g, '%2A').
            // The following are not required for percent-encoding per RFC5987, 
            //  so we can allow for a little better readability over the wire: |`^
            replace(/%(?:7C|60|5E)/g, unescape);
}

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent

Alienable answered 23/4, 2014 at 16:54 Comment(1)
what a great answer (if its compatible across chrome edge and firefox while not making any mistakes)Dorrie
W
13

For the purpose of encoding, JavaScript provides three built-in functions:

  1. escape() - does not encode @*_+-./ This function has been deprecated since ECMAScript 3, so it should be avoided.

  2. encodeURI() - does not encode ~!@#$&*()=:/,;?+' It assumes that its argument is a complete URI, so it does not encode reserved characters that have special meaning in a URI. Use it when the intent is to convert a complete URL rather than one segment of it. Example: encodeURI('http://stackoverflow.com'); gives http://stackoverflow.com

  3. encodeURIComponent() - does not encode - _ . ! ~ * ' ( ) This function encodes a Uniform Resource Identifier (URI) component by replacing each instance of certain characters by one, two, three, or four escape sequences representing the UTF-8 encoding of the character. Use it to convert a single component of a URL, for instance when some user input needs to be appended. Example: encodeURIComponent('http://stackoverflow.com'); gives http%3A%2F%2Fstackoverflow.com

All of this encoding is performed in UTF-8, i.e. the characters are converted to their UTF-8 representation first.

encodeURIComponent differs from encodeURI in that it also encodes the reserved characters and the number sign #, which encodeURI leaves alone.
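
The sets listed above can be checked directly. The following sketch runs each function over the ASCII punctuation characters and reports which ones come back unchanged; the helper name leftUnencoded is hypothetical.

```javascript
// Printable ASCII characters that are neither letters nor digits.
var punctuation = " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";

// Return the characters of `chars` that `encode` leaves unchanged.
function leftUnencoded(encode, chars) {
  return chars.split("").filter(function (ch) {
    return encode(ch) === ch;
  }).join("");
}

console.log(leftUnencoded(escape, punctuation));             // *+-./@_
console.log(leftUnencoded(encodeURI, punctuation));          // !#$&'()*+,-./:;=?@_~
console.log(leftUnencoded(encodeURIComponent, punctuation)); // !'()*-._~
```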

Willhite answered 21/4, 2017 at 7:55 Comment(0)
C
10

Also remember that they all encode different sets of characters, and select the one you need appropriately. encodeURI() encodes fewer characters than encodeURIComponent(), which encodes fewer (and also different, to dannyp's point) characters than escape().

Changeling answered 16/9, 2008 at 19:40 Comment(0)
T
6

Inspired by Johann's table, I've decided to extend the table. I wanted to see which ASCII characters get encoded.

screenshot of console.table

var ascii = " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~";

var encoded = [];

ascii.split("").forEach(function (char) {
    var obj = { char };
    if (char != encodeURI(char))
        obj.encodeURI = encodeURI(char);
    if (char != encodeURIComponent(char))
        obj.encodeURIComponent = encodeURIComponent(char);
    if (obj.encodeURI || obj.encodeURIComponent)
        encoded.push(obj);
});

console.table(encoded);

Table shows only the encoded characters. Empty cells mean that the original and the encoded characters are the same.


Just to be extra, I'm adding another table for urlencode() vs rawurlencode(). The only difference seems to be the encoding of space character.

screenshot of console.table

<script>
<?php
$ascii = str_split(" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~", 1);
$encoded = [];
foreach ($ascii as $char) {
    $obj = ["char" => $char];
    if ($char != urlencode($char))
        $obj["urlencode"] = urlencode($char);
    if ($char != rawurlencode($char))
        $obj["rawurlencode"] = rawurlencode($char);
    if (isset($obj["urlencode"]) || isset($obj["rawurlencode"]))
        $encoded[] = $obj;
}
echo "var encoded = " . json_encode($encoded) . ";";
?>
console.table(encoded);
</script>
Timm answered 11/2, 2019 at 11:56 Comment(0)
H
6

Just try encodeURI() and encodeURIComponent() yourself...

console.log(encodeURIComponent('@#$%^&*'));

Input: @#$%^&*. Output: %40%23%24%25%5E%26*. So, wait, what happened to *? Why wasn't this converted? It could definitely cause problems if you tried to do linux command "$string". TLDR: You actually want fixedEncodeURIComponent() and fixedEncodeURI(). Long-story...

When to use encodeURI()? Never. encodeURI() fails to adhere to RFC3986 with regard to bracket-encoding. Use fixedEncodeURI(), as defined and further explained at the MDN encodeURI() Documentation...

function fixedEncodeURI(str) {
   return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}

When to use encodeURIComponent()? Never. encodeURIComponent() fails to adhere to RFC3986 with regard to encoding: !'()*. Use fixedEncodeURIComponent(), as defined and further explained at the MDN encodeURIComponent() Documentation...

function fixedEncodeURIComponent(str) {
 return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
   return '%' + c.charCodeAt(0).toString(16);
 });
}

Then you can use fixedEncodeURI() to encode a complete URL, whose connectors must survive, whereas fixedEncodeURIComponent() encodes a single URL piece, connectors included; or, simply, fixedEncodeURI() will not encode +@?=:#;,$& (as & and + are common URL operators), but fixedEncodeURIComponent() will.
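
To illustrate the bracket issue mentioned above, here is a short sketch using a URL with an IPv6 host literal. The address is a documentation placeholder chosen for this example; fixedEncodeURI is repeated so the snippet runs on its own.

```javascript
// fixedEncodeURI as quoted above: undo the encoding of "[" and "]".
function fixedEncodeURI(str) {
  return encodeURI(str).replace(/%5B/g, "[").replace(/%5D/g, "]");
}

// Example URL with an IPv6 host (placeholder address) and a space in the path.
var url = "http://[2001:db8::1]/a file.html";

console.log(encodeURI(url));      // http://%5B2001:db8::1%5D/a%20file.html
console.log(fixedEncodeURI(url)); // http://[2001:db8::1]/a%20file.html
```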

Hutchins answered 17/6, 2020 at 18:48 Comment(0)
M
4

I've found that experimenting with the various methods is a good sanity check even after having a good handle of what their various uses and capabilities are.

Towards that end I have found this website extremely useful to confirm my suspicions that I am doing something appropriately. It has also proven useful for decoding an encodeURIComponent'ed string which can be rather challenging to interpret. A great bookmark to have:

http://www.the-art-of-web.com/javascript/escape/

Magisterial answered 8/8, 2013 at 12:34 Comment(0)
H
4

The accepted answer is good. To extend on the last part:

Note that encodeURIComponent does not escape the ' character. A common bug is to use it to create html attributes such as href='MyUrl', which could suffer an injection bug. If you are constructing html from strings, either use " instead of ' for attribute quotes, or add an extra layer of encoding (' can be encoded as %27).

If you want to be on the safe side, the characters that encodeURIComponent leaves alone but RFC 3986 reserves (!'()*) should be percent-encoded as well.

You can use this method to escape them (source Mozilla)

function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}

// fixedEncodeURIComponent("'") --> "%27"
Halcyon answered 27/9, 2017 at 6:56 Comment(0)
D
3

Modern rewrite of @johann-echavarria's answer:

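console.log(
    Array(256)
        .fill()
        .map((ignore, i) => String.fromCharCode(i))
        // keep only the characters the two functions encode differently
        .filter((char) => encodeURI(char) !== encodeURIComponent(char))
        // filter() cannot reshape items, so build the rows in a second map()
        .map((char) => ({
            character: char,
            encodeURI: encodeURI(char),
            encodeURIComponent: encodeURIComponent(char)
        }))
)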

Or if you can use a table, replace console.log with console.table (for the prettier output).

Domino answered 9/2, 2018 at 1:45 Comment(1)
I think what you meant was ``` console.table( Array(256) .fill() .map((ignore, i) => { char = String.fromCharCode(i); return { character: char, encodeURI: encodeURI(char), encodeURIComponent: encodeURIComponent(char) } }) .filter( (charObj) => encodeURI(charObj.character) !== encodeURIComponent(charObj.character) ) ) ```Plowboy
L
2

I have this function...

var escapeURIparam = function(url) {
    // typeof avoids a ReferenceError when a function does not exist at all
    if (typeof encodeURIComponent === 'function') url = encodeURIComponent(url);
    else if (typeof encodeURI === 'function') url = encodeURI(url);
    else url = escape(url);
    url = url.replace(/\+/g, '%2B'); // Force the replacement of "+"
    return url;
};
Ligon answered 21/6, 2013 at 12:39 Comment(4)
@ChristianVielma escape() is deprecated but never refer w3schools.com. see w3fools.comAlienable
@Christian Vielma - Some find the reference material at W3Schools to be less controversial and useful. Not everyone agrees that W3Schools shouldn't ever be referenced.Eparchy
W3Schools does get a bad rap. Sure they aren't always accurate, but then again i've come across many a blog post that is downright wrong as well. For me its sometimes a great starting point just to learn some of the terminology and then I dive a little deeper with other resources. Most important is that a single resource should never be biblical when it comes to this kind of stuff.Weeping
It seems @Ligon wrote this function as a fallback to versions where encodeURI does not exist but escape exists.Nous
U
0

Short Answer

For data that is only intended to be parsed by JavaScript, use escape(); for anything else, use encodeURIComponent().

encodeURI and encodeURIComponent

encodeURI and encodeURIComponent do the same thing: they URL-encode a string. There is an important difference, though: encodeURI respects the structure of a URI, while encodeURIComponent does not. In most cases, you won't notice the difference, but when the argument you pass is a valid URI, encodeURI won't encode some of its characters, while encodeURIComponent ignores the URI structure of the passed argument and encodes all characters that are invalid or have a special meaning in a URI:

console.log(encodeURIComponent("Some Example Text"),encodeURI("Some Example Text"));//==>Some%20Example%20Text Some%20Example%20Text
console.log(encodeURIComponent("https://example.com/äöü?param1=content"),encodeURI("https://example.com/äöü?param1=content"));

In the example above, you can clearly see how encodeURIComponent behaves the same way as encodeURI when no URI structure is given, but when it is given, encodeURI skips characters relevant to the URI's structure, whereas encodeURIComponent encodes them too. In most cases, encodeURIComponent is what you want. I cannot think of any use case where encodeURI is the better choice. If you have user data, it is better to do:

var url="https://example.com/upload?input="+encodeURIComponent(user_input);

instead of:

var url=encodeURI("https://example.com/upload?input="+user_input)

because a user might insert URI-corrupting data (accidentally, maliciously, or because a malicious actor told them to; preventing attacks on the client side is a bad idea anyway), like:

upload_data?second_parameter=unintended_content

which would be encoded properly in example 1, but generate erroneous or even malicious URIs in example 2.
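
The difference between the two approaches can be observed by parsing the resulting URLs. In this sketch the input string is a hypothetical one (it uses "&" to smuggle in a second parameter), and the standard URL API stands in for whatever the server does when it parses the query string.

```javascript
// Hypothetical user input that tries to add a second parameter.
var user_input = "data&admin=true";

var viaComponent = "https://example.com/upload?input=" + encodeURIComponent(user_input);
var viaEncodeURI = encodeURI("https://example.com/upload?input=" + user_input);

// Parse both results to see which parameters a receiver would find.
var good = new URL(viaComponent).searchParams;
var bad = new URL(viaEncodeURI).searchParams;

console.log(good.get("input"), good.get("admin")); // data&admin=true null
console.log(bad.get("input"), bad.get("admin"));   // data true
```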

BOTH METHODS THROW AN ERROR IF A LONE SURROGATE (0xD800-0xDFFF) IS IN THE PASSED STRING

escape

Even though escape might look like it URI-encodes a string, it actually translates it into a JavaScript-specific format. When only characters in the range 0x00-0x7F are encoded, it behaves much like encodeURIComponent (not encodeURI, because it ignores the URI structure just like encodeURIComponent does). The exceptions are 3 special characters that escape does not encode even though they might have a special meaning in the URI (@+/) and, conversely, the characters !~'() that escape encodes but encodeURIComponent leaves alone. Behaviour differs for code points above 0x7F:

escape translates it into %uXXXX when the code point is above 0xFF, for code points in range 0x80-0xFF, escape translates it into %XX

encodeURIComponent URL-encodes it regularly (as UTF-8 bytes), and throws a URIError for lone surrogates, which is the reason why escape() is the more robust method.

//0x00-0x7F
console.log(escape("Some Example Text"),encodeURIComponent("Some Example Text")); //==> Some%20Example%20Text Some%20Example%20Text
//Special Characters
console.log(escape("@+/"),encodeURIComponent("@+/"))//==>@+/ %40%2B%2F
//Above 0x7F
console.log(escape(String.fromCodePoint(0x1234)),encodeURIComponent(String.fromCodePoint(0x1234)));//==> %u1234 %E1%88%B4
//2 Valid Surrogates
console.log(escape("😂"),encodeURIComponent("😂"));//==> %uD83D%uDE02 %F0%9F%98%82
//Lone Surrogate(0xD800-0xDFFF)
console.log(escape(String.fromCodePoint(0xD800)))//==> %uD800
encodeURIComponent(String.fromCodePoint(0xD800))//URIError

It is also noteworthy that escape is deprecated, but it is supported by all major browsers (even IE, although I don't think anyone uses it anymore) , and there is no reason why support might be dropped in the future.

When to Use encodeURIComponent and when to use escape?

For data only intended to be parsed by JavaScript (for example in the hash of a URI), use escape; for anything else, use encodeURIComponent (and almost never use encodeURI).

About decoding

No matter which of the 2 real options you choose, you need to use the proper decoding method:

encodeURIComponent ==> decodeURIComponent
escape ==> unescape

If you don't know how the string was encoded, use the following function to detect it automatically (unreliable/erroneous when characters in the range 0x80-0xFF are encoded with escape and no characters >0xFF are encoded along with them; reliable in most other cases):

var decode = function (text) { return text.includes("%u") ? unescape(text) : decodeURIComponent(text); };
Underclassman answered 20/5, 2023 at 11:54 Comment(0)
