What's the difference between EscapeUriString and EscapeDataString?

6

261

If I only need to deal with URL encoding, should I use EscapeUriString?

Annelid answered 9/12, 2010 at 9:23 Comment(1)

Always escape each individual value using Uri.EscapeDataString(), as explained in @Livven's answer. With other approaches, the system simply does not have enough information to produce the intended result for every possible input. – Keneth 16/6, 2016 at 12:34

136

Use EscapeDataString always (for more info about why, see Livven's answer below)

Edit: removed dead link to how the two differ on encoding

Roots answered 9/12, 2010 at 9:29 Comment(11)

I'm not sure that link actually provides more information, as it's about unescaping rather than escaping. – Stroboscope 30/8, 2013 at 15:52

It's basically the same difference. If you actually read the article, there's a table around the middle that actually escapes (not unescapes) to show the differences (comparing with URLEncode too). – Roots 30/8, 2013 at 15:57

It's still not clear to me -- what if I'm not escaping a whole URI but only part of it -- (i.e. the data for a query string parameter)? Am I escaping data for the URI, or does EscapeDataString imply something completely different? – Technocracy 10/11, 2013 at 3:37

... I did some testing, and it looks like I want EscapeDataString for a URI parameter. I tested with the string "I heart C++": EscapeUriString did not encode the "+" characters and left them as is, while EscapeDataString correctly converted them to "%2B". – Technocracy 10/11, 2013 at 3:42

@Technocracy yes, that's in the article I linked to. If you want to be more specific, you should be using HttpUtility.UrlEncode if what you are encoding is a URL... that will also change spaces into + (which is correct for a URL, more so than %20, although both will work) and will use the more correct lowercase too. As the documentation states, EscapeUriString does not convert RFC 2396 reserved characters (that includes +, but also others: more info here) – Roots 11/11, 2013 at 16:10

I'm not encoding URLs or URIs, I'm encoding data that goes into the value of a query string parameter of a URL (again, that data is not a URL or URI). As far as personal preference goes: using "+" for " " in a URL is evil, because some functions (as you mention) will randomly leave them in -- and on the server side, it can be ambiguous -- whereas "%20" and "%2B" are explicit -- there's no chance to get the decoding wrong. – Technocracy 25/11, 2013 at 5:54

Yeah, well, it's a matter of standards (there are RFCs that define these kinds of encodings). The problem is that browsers have historically been pretty loose in their support of encodings. The functions are not "randomly" encoding or decoding... they follow some standards or not, and it's usually documented :-) – Roots 25/11, 2013 at 7:18

Here's a sample of running it and the other encoding methods that shows differences dotnetfiddle.net/12IFw1 – Unfriended 17/9, 2014 at 18:2

This is a bad answer. You should never use EscapeUriString, it doesn't make any sense. See Livven's answer below (and upvote it). – Danika 5/2, 2016 at 1:20

By StackOverflow standards, this is a terrible answer. It doesn't actually explain the difference, gives confusing (and incorrect) advice, and leaves everything up to an external link. If that link becomes dead in the future, this answer will no longer be valid or correct. – Mold 5/10, 2017 at 14:56

I have updated the answer to link to the obviously more correct answer below. Also removed the dead link – Roots 15/1, 2019 at 17:15

351

I didn't find the existing answers satisfactory so I decided to dig a little deeper to settle this issue. Surprisingly, the answer is very simple:

There is (almost) no valid reason to ever use Uri.EscapeUriString. If you need to percent-encode a string, always use Uri.EscapeDataString.*

* See the last paragraph for a valid use case.

Why is this? According to the documentation:

Use the EscapeUriString method to prepare an unescaped URI string to be a parameter to the Uri constructor.

This doesn't really make sense. According to RFC 2396:

A URI is always in an "escaped" form, since escaping or unescaping a completed URI might change its semantics.

While the quoted RFC has been obsoleted by RFC 3986, the point still stands. Let's verify it by looking at some concrete examples:

You have a simple URI, like this:
```
 http://example.org/
```

Uri.EscapeUriString won't change it.

You decide to manually edit the query string without regard for escaping:
```
 http://example.org/?key=two words
```

Uri.EscapeUriString will (correctly) escape the space for you:

    http://example.org/?key=two%20words

You decide to manually edit the query string even further:
```
 http://example.org/?parameter=father&son
```

However, this string is not changed by Uri.EscapeUriString, since it assumes the ampersand signifies the start of another key-value pair. This may or may not be what you intended.

You decide that you in fact want the value of parameter to be father&son, so you fix the previous URL manually by escaping the ampersand:
```
 http://example.org/?parameter=father%26son
```

However, Uri.EscapeUriString will escape the percent character too, leading to a double encoding:

    http://example.org/?parameter=father%2526son

As you can see, using Uri.EscapeUriString for its intended purpose makes it impossible to use & as part of a key or value in a query string instead of as a separator between multiple key-value pairs.

This is because, in an attempt at making it suitable for escaping full URIs, it ignores reserved characters and only escapes characters that are neither reserved nor unreserved, which, BTW, is contrary to the documentation. This way you don't end up with something like http%3A%2F%2Fexample.org%2F, but you do end up with the issues illustrated above.
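To make the comparison above easy to reproduce, here is a minimal sketch that runs both methods over the example strings from this answer and prints the results side by side. It assumes a recent .NET SDK with top-level statements; on those versions Uri.EscapeUriString is flagged obsolete (SYSLIB0013), so the warning is suppressed purely for the demonstration.

```csharp
using System;

// Assumption: recent .NET, where Uri.EscapeUriString raises obsolete warning SYSLIB0013.
// The warning is suppressed only so this comparison compiles cleanly.
#pragma warning disable SYSLIB0013

// The example strings used in the walkthrough above.
string[] samples =
{
    "http://example.org/?key=two words",
    "http://example.org/?parameter=father&son",
    "http://example.org/?parameter=father%26son",
};

foreach (string s in samples)
{
    Console.WriteLine($"input:            {s}");
    // Leaves reserved characters such as '&', '?', '/' and ':' alone.
    Console.WriteLine($"EscapeUriString:  {Uri.EscapeUriString(s)}");
    // Escapes reserved characters too, so it is only suitable for individual values.
    Console.WriteLine($"EscapeDataString: {Uri.EscapeDataString(s)}");
    Console.WriteLine();
}

// Escaping a single value rather than a whole URI:
Console.WriteLine(Uri.EscapeDataString("father&son")); // father%26son
```

Running EscapeDataString on a whole URI is shown only for contrast; it is meant for individual keys and values, not complete URIs.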

In the end, if your URI is valid, it does not need to be escaped to be passed as a parameter to the Uri constructor, and if it's not valid then calling Uri.EscapeUriString isn't a magic solution either. Actually, it will work in many if not most cases, but it is by no means reliable.

You should always construct your URLs and query strings by gathering the key-value pairs, percent-encoding each key and each value individually, and then concatenating them with the necessary separators. You can use Uri.EscapeDataString for this purpose, but not Uri.EscapeUriString, since it doesn't escape reserved characters, as mentioned above.
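The approach described in the previous paragraph could be sketched roughly as follows. BuildUrl is a hypothetical helper written for this illustration, not a framework API, and it assumes the base URL is already valid and has no query string of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of .NET): escapes every key and value
// individually, then joins the pairs with the '&' and '=' separators.
// Assumes baseUrl is already a valid URI without an existing query string.
static string BuildUrl(string baseUrl, IEnumerable<(string Key, string Value)> pairs)
{
    string query = string.Join("&",
        pairs.Select(p => Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
    return baseUrl + "?" + query;
}

(string Key, string Value)[] pairs =
{
    ("parameter", "father&son"),
    ("key", "two words"),
};

Console.WriteLine(BuildUrl("http://example.org/", pairs));
// http://example.org/?parameter=father%26son&key=two%20words
```

Because the separators are added after escaping, an ampersand inside a value can never be mistaken for a pair separator.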

Only if you cannot do that, e.g. when dealing with user-provided URIs, does it make sense to use Uri.EscapeUriString as a last resort. But the previously mentioned caveats apply – if the user-provided URI is ambiguous, the results may not be desirable.

Comatose answered 9/12, 2015 at 21:19 Comment(14)

Wow, thank you for finally clarifying this issue. The previous two answers were not very helpful. – Navaho 28/12, 2015 at 20:53

Exactly right. EscapeUriString (like EscapeUrl's default behavior in Win32) was created by someone who didn't understand URIs or escaping. It's a misguided attempt to create something that takes a malformed URI and sometimes turns it into the intended version. But it doesn't have the information it needs to do this reliably. It also frequently gets used in place of EscapeDataString, which is also very problematic. I wish EscapeUriString did not exist. Every use of it is wrong. – Danika 5/2, 2016 at 1:19

Nicely explained, +1. It is way better than the accepted link-only answer. – Fosse 25/4, 2016 at 17:23

This answer needs more attention. It is the correct way to do it. The other answers have scenarios where they do not produce the intended results. – Keneth 16/6, 2016 at 12:31

I will be an alternate voice of reason here. Coming from JavaScript, where there are two distinct functions encodeURI and encodeURIComponent, this answer and some of the comments like "I wish EscapeUriString did not exist" appear misguided... – Kling 14/11, 2017 at 22:59

...Sure, encodeURI/Uri.EscapeUriString is not needed as often as encodeURIComponent/Uri.EscapeDataString (since when are you dealing with blind URLs that must be used in a URI context), but that does not mean it doesn't have its place. – Kling 14/11, 2017 at 23:2

Point #3: "it assumes the ampersand signifies the start of another key-value pair" is a bit misleading. Key-value pair syntax is a web framework thing, not a URI thing. I think it's more accurate to say spaces are escaped (point #2) because they are illegal in a URI; ampersands are not, because they are not. – Pankey 4/12, 2017 at 3:22

@CrescentFresh You haven't actually explained where encodeURI/Uri.EscapeUriString are needed. Can you give a single use case where encodeURIComponent/Uri.EscapeDataString are not the best solution for the problem? – Evonevonne 14/3, 2018 at 6:32

Uri.EscapeDataString worked for me too. I was previously using WebUtility.HtmlEncode(str) to escape form input, but this was causing exceptions on the server of the form "A potentially dangerous Request.Form value was detected from the client". One example is single quotes: encoded to &#39; by HtmlEncode, but correctly (and safely) encoded to %27 by Uri.EscapeDataString. – Spermophyte 3/12, 2018 at 10:36

@CrescentFresh You're right, a valid use would be as a best-effort when dealing with user-provided URIs. I added that to the answer. Are there any others you can think of? – Comatose 11/11, 2019 at 15:2

@Comatose - even as a best-effort solution for user-provided URIs, EscapeUriString is probably not a good idea. It's not clearly documented, and whatever processing you need to do for user-provided URIs is likely going to exceed that method anyhow. E.g. let's say your user enters google.com/?q=bla bla - EscapeUriString isn't going to do anything useful, unlike most browsers, which will. The tiny niche of implementing a browser URL bar is so specialized that .NET simply shouldn't have a method for it, and even if you're doing that - don't use Uri.EscapeUriString. It's still not good enough. – Encamp 10/6, 2020 at 15:12

@Comatose Then there's the fact that even when EscapeUriString does "something" - what destination server won't do that better? If it's comprehensible enough to best-effort escape, then let the target server deal with it. Finally, consider that the "real-world" use case for EscapeUriString is simply making a bug by accident. Best be clear about its usefulness therefore - just don't use it. Ever. – Encamp 10/6, 2020 at 15:14

@CrescentFresh People don't use EscapeUriString correctly - see github.com/search?p=99&q=EscapeUriString&type=Code for some additional reasons why you should be 100% clear it's just not a good idea to ever use this. Nobody is using it correctly. Can you find even one case where it's at least clearly harmless and has any reasonable effect whatsoever? I can see a ton that are clearly wrong, and bet you could find a few exploitable security holes just on the basis of those search results. Don't use it; it's dangerous and useless - even as a best-effort fallback. – Encamp 10/6, 2020 at 15:16

After wondering why it seemed "&" was not encoded, then it seemed like it was being encoded, I think I mixed up these two, causing a bit of panic. Looking closely at documentation, it seems like Uri.EscapeString is now marked Obsolete. @BrandonPaddock, seems like this is close to what you were hoping. – Working 10/6, 2020 at 15:45