URL Encoding using C#
Asked Answered
C

14

468

I have an application which sends a POST request to the VB forum software and logs someone in (without setting cookies or anything).

Once the user is logged in I create a variable that creates a path on their local machine.

c:\tempfolder\date\username

The problem is that some usernames are throwing "Illegal chars" exception. For example if my username was mas|fenix it would throw an exception..

Path.Combine( _      
  Environment.GetFolderPath(System.Environment.SpecialFolder.CommonApplicationData), _
  DateTime.Now.ToString("ddMMyyhhmm") + "-" + form1.username)

I don't want to remove it from the string, but a folder with their username is created through FTP on a server. And this leads to my second question. If I am creating a folder on the server can I leave the "illegal chars" in? I only ask this because the server is Linux based, and I am not sure if Linux accepts it or not.

EDIT: It seems that URL encode is NOT what I want.. Here's what I want to do:

old username = mas|fenix
new username = mas%xxfenix

Where %xx is the ASCII value or any other value that would easily identify the character.

Carree answered 22/2, 2009 at 18:48 Comment(1)
Incorporate this to make file system safe folder names: https://mcmap.net/q/14111/-is-there-a-way-of-making-strings-file-path-safe-in-cEloisaeloise
O
209

Edit: Note that this answer is now out of date. See Siarhei Kuchuk's answer below for a better fix

UrlEncoding will do what you are suggesting here. With C#, you simply use HttpUtility, as mentioned.

You can also Regex the illegal characters and then replace, but this gets far more complex, as you will have to have some form of state machine (switch ... case, for example) to replace with the correct characters. Since UrlEncode does this up front, it is rather easy.

As for Linux versus windows, there are some characters that are acceptable in Linux that are not in Windows, but I would not worry about that, as the folder name can be returned by decoding the Url string, using UrlDecode, so you can round trip the changes.

Osculate answered 22/2, 2009 at 20:55 Comment(3)
this answer is out of date now. read a few answers below - as of .net45 this might be the correct solution: msdn.microsoft.com/en-us/library/…Pyrogen
For FTP each Uri part (folder or file name) may be constructed using Uri.EscapeDataString(fileOrFolderName) allowing all non Uri compatible character (spaces, unicode ...). For example to allow any character in filename, use: req =(FtpWebRequest)WebRequest.Create(new Uri(path + "/" + Uri.EscapeDataString(filename))); Using HttpUtility.UrlEncode() replace spaces by plus signs (+). A correct behavior for search engines but incorrect for file/folder names.Brockway
asp.net blocks majority of xss in url as you get warning when ever you try to add js script A potentially dangerous Request.Path value was detected from the client.Prussian
E
734

I've been experimenting with the various methods .NET provide for URL encoding. Perhaps the following table will be useful (as output from a test app I wrote):

Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded HexEscaped
A         A          A                 A              A                 A                A           A                    %41
B         B          B                 B              B                 B                B           B                    %42

a         a          a                 a              a                 a                a           a                    %61
b         b          b                 b              b                 b                b           b                    %62

0         0          0                 0              0                 0                0           0                    %30
1         1          1                 1              1                 1                1           1                    %31

[space]   +          +                 %20            %20               %20              [space]     [space]              %20
!         !          !                 !              !                 !                !           !                    %21
"         %22        %22               "              %22               %22              "      "               %22
#         %23        %23               #              %23               #                #           #                    %23
$         %24        %24               $              %24               $                $           $                    %24
%         %25        %25               %              %25               %25              %           %                    %25
&         %26        %26               &              %26               &                &       &                %26
'         %27        %27               '              '                 '                '       '                %27
(         (          (                 (              (                 (                (           (                    %28
)         )          )                 )              )                 )                )           )                    %29
*         *          *                 *              %2A               *                *           *                    %2A
+         %2b        %2b               +              %2B               +                +           +                    %2B
,         %2c        %2c               ,              %2C               ,                ,           ,                    %2C
-         -          -                 -              -                 -                -           -                    %2D
.         .          .                 .              .                 .                .           .                    %2E
/         %2f        %2f               /              %2F               /                /           /                    %2F
:         %3a        %3a               :              %3A               :                :           :                    %3A
;         %3b        %3b               ;              %3B               ;                ;           ;                    %3B
<         %3c        %3c               <              %3C               %3C              &lt;        &lt;                 %3C
=         %3d        %3d               =              %3D               =                =           =                    %3D
>         %3e        %3e               >              %3E               %3E              &gt;        >                    %3E
?         %3f        %3f               ?              %3F               ?                ?           ?                    %3F
@         %40        %40               @              %40               @                @           @                    %40
[         %5b        %5b               [              %5B               %5B              [           [                    %5B
\         %5c        %5c               \              %5C               %5C              \           \                    %5C
]         %5d        %5d               ]              %5D               %5D              ]           ]                    %5D
^         %5e        %5e               ^              %5E               %5E              ^           ^                    %5E
_         _          _                 _              _                 _                _           _                    %5F
`         %60        %60               `              %60               %60              `           `                    %60
{         %7b        %7b               {              %7B               %7B              {           {                    %7B
|         %7c        %7c               |              %7C               %7C              |           |                    %7C
}         %7d        %7d               }              %7D               %7D              }           }                    %7D
~         %7e        %7e               ~              ~                 ~                ~           ~                    %7E

Ā         %c4%80     %u0100            %c4%80         %C4%80            %C4%80           Ā           Ā                    [OoR]
ā         %c4%81     %u0101            %c4%81         %C4%81            %C4%81           ā           ā                    [OoR]
Ē         %c4%92     %u0112            %c4%92         %C4%92            %C4%92           Ē           Ē                    [OoR]
ē         %c4%93     %u0113            %c4%93         %C4%93            %C4%93           ē           ē                    [OoR]
Ī         %c4%aa     %u012a            %c4%aa         %C4%AA            %C4%AA           Ī           Ī                    [OoR]
ī         %c4%ab     %u012b            %c4%ab         %C4%AB            %C4%AB           ī           ī                    [OoR]
Ō         %c5%8c     %u014c            %c5%8c         %C5%8C            %C5%8C           Ō           Ō                    [OoR]
ō         %c5%8d     %u014d            %c5%8d         %C5%8D            %C5%8D           ō           ō                    [OoR]
Ū         %c5%aa     %u016a            %c5%aa         %C5%AA            %C5%AA           Ū           Ū                    [OoR]
ū         %c5%ab     %u016b            %c5%ab         %C5%AB            %C5%AB           ū           ū                    [OoR]

The columns represent encodings as follows:

  • UrlEncoded: HttpUtility.UrlEncode

  • UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode

  • UrlPathEncoded: HttpUtility.UrlPathEncode

  • EscapedDataString: Uri.EscapeDataString

  • EscapedUriString: Uri.EscapeUriString

  • HtmlEncoded: HttpUtility.HtmlEncode

  • HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode

  • HexEscaped: Uri.HexEscape

NOTES:

  1. HexEscape can only handle the first 255 characters. Therefore it throws an ArgumentOutOfRange exception for the Latin A-Extended characters (eg Ā).

  2. This table was generated in .NET 4.0 (see Levi Botelho's comment below that says the encoding in .NET 4.5 is slightly different).

EDIT:

I've added a second table with the encodings for .NET 4.5. See this answer: https://mcmap.net/q/13942/-url-encoding-using-c

EDIT 2:

Since people seem to appreciate these tables, I thought you might like the source code that generates the table, so you can play around yourselves. It's a simple C# console application, which can target either .NET 4.0 or 4.5:

using System;
using System.Collections.Generic;
using System.Text;
// Need to add a Reference to the System.Web assembly.
using System.Web;

namespace UriEncodingDEMO2
{
    class Program
    {
        static void Main(string[] args)
        {
            EncodeStrings();

            Console.WriteLine();
            Console.WriteLine("Press any key to continue...");
            Console.Read();
        }

        public static void EncodeStrings()
        {
            string stringToEncode = "ABCD" + "abcd"
            + "0123" + " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" + "ĀāĒēĪīŌōŪū";

            // Need to set the console encoding to display non-ASCII characters correctly (eg the 
            //  Latin A-Extended characters such as ĀāĒē...).
            Console.OutputEncoding = Encoding.UTF8;

            // Will also need to set the console font (in the console Properties dialog) to a font 
            //  that displays the extended character set correctly.
            // The following fonts all display the extended characters correctly:
            //  Consolas
            //  DejaVu Sana Mono
            //  Lucida Console

            // Also, in the console Properties, set the Screen Buffer Size and the Window Size 
            //  Width properties to at least 140 characters, to display the full width of the 
            //  table that is generated.

            Dictionary<string, Func<string, string>> columnDetails =
                new Dictionary<string, Func<string, string>>();
            columnDetails.Add("Unencoded", (unencodedString => unencodedString));
            columnDetails.Add("UrlEncoded",
                (unencodedString => HttpUtility.UrlEncode(unencodedString)));
            columnDetails.Add("UrlEncodedUnicode",
                (unencodedString => HttpUtility.UrlEncodeUnicode(unencodedString)));
            columnDetails.Add("UrlPathEncoded",
                (unencodedString => HttpUtility.UrlPathEncode(unencodedString)));
            columnDetails.Add("EscapedDataString",
                (unencodedString => Uri.EscapeDataString(unencodedString)));
            columnDetails.Add("EscapedUriString",
                (unencodedString => Uri.EscapeUriString(unencodedString)));
            columnDetails.Add("HtmlEncoded",
                (unencodedString => HttpUtility.HtmlEncode(unencodedString)));
            columnDetails.Add("HtmlAttributeEncoded",
                (unencodedString => HttpUtility.HtmlAttributeEncode(unencodedString)));
            columnDetails.Add("HexEscaped",
                (unencodedString
                    =>
                    {
                        // Uri.HexEscape can only handle the first 255 characters so for the 
                        //  Latin A-Extended characters, such as A, it will throw an 
                        //  ArgumentOutOfRange exception.                       
                        try
                        {
                            return Uri.HexEscape(unencodedString.ToCharArray()[0]);
                        }
                        catch
                        {
                            return "[OoR]";
                        }
                    }));

            char[] charactersToEncode = stringToEncode.ToCharArray();
            string[] stringCharactersToEncode = Array.ConvertAll<char, string>(charactersToEncode,
                (character => character.ToString()));
            DisplayCharacterTable<string>(stringCharactersToEncode, columnDetails);
        }

        private static void DisplayCharacterTable<TUnencoded>(TUnencoded[] unencodedArray,
            Dictionary<string, Func<TUnencoded, string>> mappings)
        {
            foreach (string key in mappings.Keys)
            {
                Console.Write(key.Replace(" ", "[space]") + " ");
            }
            Console.WriteLine();

            foreach (TUnencoded unencodedObject in unencodedArray)
            {
                string stringCharToEncode = unencodedObject.ToString();
                foreach (string columnHeader in mappings.Keys)
                {
                    int columnWidth = columnHeader.Length + 1;
                    Func<TUnencoded, string> encoder = mappings[columnHeader];
                    string encodedString = encoder(unencodedObject);

                    // ASSUMPTION: Column header will always be wider than encoded string.
                    Console.Write(encodedString.Replace(" ", "[space]").PadRight(columnWidth));
                }
                Console.WriteLine();
            }
        }
    }
}

Click here to run code on dotnetfiddle.net

Essay answered 27/6, 2012 at 23:10 Comment(6)
Note that the .NET documentation says Do not use; intended only for browser compatibility. Use UrlEncode., but that method encodes a lot of other undesired characters. The closest one is Uri.EscapeUriString, but beware it doesn't support a null argument.Dividers
I forgot to mention, my comment above is for UrlPathEncode. So basically replace UrlPathEncode with Uri.EscapeUriString.Dividers
Watch out: this answer is misleading. Some of these escape methods escape differing chars depending on their context within the string. Some are quite dangerous if you don't fully understand their limitations. If you're sticking stuff into uri's stick to Uri.EscapeDataString (not EscapeUriString!) unless you're very sure you know what you're doing.Shipyard
Does .NET Core 3+ and .NET 5 any changes of these?Erigeron
The HTML standard uses a slightly different encoding for the query parameters that are after the "?" character in a URL. Before the "?", URI Percent Encoding is used. After the "?", "application/x-www-form-urlencoded" encoding is used. In particular, that is why space is encoded as %20 before the "?", but space is encoded as + after the "?". Notice that URLEncoded encodes space as +, while EscapedUriString encodes it as %20. See "Mutate action URL" and "application/x-www-form-urlencoded serializer" in the HTML standard.Spheroidal
You don't often see an answer that is a resource - what an awesome answer.Prescriptible
S
333

You should encode only the user name or other part of the URL that could be invalid. URL encoding a URL can lead to problems since something like this:

string url = HttpUtility.UrlEncode("http://www.google.com/search?q=Example");

Will yield

http%3a%2f%2fwww.google.com%2fsearch%3fq%3dExample

This is obviously not going to work well. Instead, you should encode ONLY the value of the key/value pair in the query string, like this:

string url = "http://www.google.com/search?q=" + HttpUtility.UrlEncode("Example");

Hopefully that helps. Also, as teedyay mentioned, you'll still need to make sure illegal file-name characters are removed or else the file system won't like the path.

Sabellian answered 22/2, 2009 at 21:29 Comment(4)
Using the HttpUtility.UrlPathEncode method should prevent the problem you're describing here.Skolnik
@DJ Pirtu: It's true that UrlPathEncode won't make those undesired changes in the path, however it also won't encode anything after the ? (since it assumes the query string is already encoded). In Dan Herbert's example it looks like he's pretending Example is the text that requires encoding, so HttpUtility.UrlPathEncode("http://www.google.com/search?q=Example"); won't work. Try it with ?q=Ex&ple (where the desired result is ?q=Ex%26ple). It won't work because (1) UrlPathEncode doesn't touch anything after ?, and (2) UrlPathEncode doesn't encode & anyway.Rattray
See here: connect.microsoft.com/VisualStudio/feedback/details/551839/… I should add that of course it's good that UrlPathEncode doesn't encode &, because you need that to delimit your query string parameters. But there are times when you want encoded ampersands as well.Rattray
HttpUtility is succeeded by WebUtility in latest versions, save yourself some time :)Martres
C
232

Better way is to use

Uri.EscapeUriString

to not reference Full Profile of .net 4.

[Update]

Based on what the OP is asking for, the recommended API should be

Uri.EscapeDataString

(Thank you @ykadaru)

Clausewitz answered 15/9, 2011 at 7:57 Comment(5)
Totally agree since often the "Client Profile" is enough for apps using System.Net but not using System.Web ;-)Fagaly
OP is talking about checking it for file system compatibility, so this won't work. Windows disallowed character set is '["/", "\\", "<", ">", ":", "\"", "|", "?", "*"]' but many of these don't get encoded using EscapedUriString (see table below - thanks for that table @Simon Tewsi) ..."creates a path on their local machine" -OP UrlEncoded takes care of almost all of the problems, but doesn't solve the problem with "%" or "%3f" being in original input, as a "decode" will now be different than original.Calculable
just to make it clear: THIS answer WONT WORK for file systemsCalculable
In addition, starting with the .NET Framework 4.5, the Client Profile has been discontinued and only the full redistributable package is available.Superadd
https://mcmap.net/q/14112/-what-39-s-the-difference-between-escapeuristring-and-escapedatastring Use Uri.EscapeDataString NOT Uri.EscapeUriString Read this comment, it helped me out.Recipe
B
211

Since .NET Framework 4.5 and .NET Standard 1.0 you should use WebUtility.UrlEncode. Advantages over alternatives:

  1. It is part of .NET Framework 4.5+, .NET Core 1.0+, .NET Standard 1.0+, UWP 10.0+ and all Xamarin platforms as well. HttpUtility, while being available in .NET Framework earlier (.NET Framework 1.1+), becomes available on other platforms much later (.NET Core 2.0+, .NET Standard 2.0+) and it still unavailable in UWP (see related question).

  2. In .NET Framework, it resides in System.dll, so it does not require any additional references, unlike HttpUtility.

  3. It properly escapes characters for URLs, unlike Uri.EscapeUriString (see comments to drweb86's answer).

  4. It does not have any limits on the length of the string, unlike Uri.EscapeDataString (see related question), so it can be used for POST requests, for example.

Blondellblondelle answered 3/6, 2013 at 10:7 Comment(1)
I like the way it encodes using "+" instead of %20 for spaces.. but this one still does not remove " from the URL and gives me invalid URL... oh well.. just gonna have to do a a replace(""""","")Ailey
O
209

Edit: Note that this answer is now out of date. See Siarhei Kuchuk's answer below for a better fix

UrlEncoding will do what you are suggesting here. With C#, you simply use HttpUtility, as mentioned.

You can also Regex the illegal characters and then replace, but this gets far more complex, as you will have to have some form of state machine (switch ... case, for example) to replace with the correct characters. Since UrlEncode does this up front, it is rather easy.

As for Linux versus windows, there are some characters that are acceptable in Linux that are not in Windows, but I would not worry about that, as the folder name can be returned by decoding the Url string, using UrlDecode, so you can round trip the changes.

Osculate answered 22/2, 2009 at 20:55 Comment(3)
this answer is out of date now. read a few answers below - as of .net45 this might be the correct solution: msdn.microsoft.com/en-us/library/…Pyrogen
For FTP each Uri part (folder or file name) may be constructed using Uri.EscapeDataString(fileOrFolderName) allowing all non Uri compatible character (spaces, unicode ...). For example to allow any character in filename, use: req =(FtpWebRequest)WebRequest.Create(new Uri(path + "/" + Uri.EscapeDataString(filename))); Using HttpUtility.UrlEncode() replace spaces by plus signs (+). A correct behavior for search engines but incorrect for file/folder names.Brockway
asp.net blocks majority of xss in url as you get warning when ever you try to add js script A potentially dangerous Request.Path value was detected from the client.Prussian
E
105

Levi Botelho commented that the table of encodings that was previously generated is no longer accurate for .NET 4.5, since the encodings changed slightly between .NET 4.0 and 4.5. So I've regenerated the table for .NET 4.5:

Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded WebUtilityUrlEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded WebUtilityHtmlEncoded HexEscaped
A         A          A                 A              A                    A                 A                A           A                    A                     %41
B         B          B                 B              B                    B                 B                B           B                    B                     %42

a         a          a                 a              a                    a                 a                a           a                    a                     %61
b         b          b                 b              b                    b                 b                b           b                    b                     %62

0         0          0                 0              0                    0                 0                0           0                    0                     %30
1         1          1                 1              1                    1                 1                1           1                    1                     %31

[space]   +          +                 %20            +                    %20               %20              [space]     [space]              [space]               %20
!         !          !                 !              !                    %21               !                !           !                    !                     %21
"         %22        %22               "              %22                  %22               %22              &quot;      &quot;               &quot;                %22
#         %23        %23               #              %23                  %23               #                #           #                    #                     %23
$         %24        %24               $              %24                  %24               $                $           $                    $                     %24
%         %25        %25               %              %25                  %25               %25              %           %                    %                     %25
&         %26        %26               &              %26                  %26               &                &amp;       &amp;                &amp;                 %26
'         %27        %27               '              %27                  %27               '                &#39;       &#39;                &#39;                 %27
(         (          (                 (              (                    %28               (                (           (                    (                     %28
)         )          )                 )              )                    %29               )                )           )                    )                     %29
*         *          *                 *              *                    %2A               *                *           *                    *                     %2A
+         %2b        %2b               +              %2B                  %2B               +                +           +                    +                     %2B
,         %2c        %2c               ,              %2C                  %2C               ,                ,           ,                    ,                     %2C
-         -          -                 -              -                    -                 -                -           -                    -                     %2D
.         .          .                 .              .                    .                 .                .           .                    .                     %2E
/         %2f        %2f               /              %2F                  %2F               /                /           /                    /                     %2F
:         %3a        %3a               :              %3A                  %3A               :                :           :                    :                     %3A
;         %3b        %3b               ;              %3B                  %3B               ;                ;           ;                    ;                     %3B
<         %3c        %3c               <              %3C                  %3C               %3C              &lt;        &lt;                 &lt;                  %3C
=         %3d        %3d               =              %3D                  %3D               =                =           =                    =                     %3D
>         %3e        %3e               >              %3E                  %3E               %3E              &gt;        >                    &gt;                  %3E
?         %3f        %3f               ?              %3F                  %3F               ?                ?           ?                    ?                     %3F
@         %40        %40               @              %40                  %40               @                @           @                    @                     %40
[         %5b        %5b               [              %5B                  %5B               [                [           [                    [                     %5B
\         %5c        %5c               \              %5C                  %5C               %5C              \           \                    \                     %5C
]         %5d        %5d               ]              %5D                  %5D               ]                ]           ]                    ]                     %5D
^         %5e        %5e               ^              %5E                  %5E               %5E              ^           ^                    ^                     %5E
_         _          _                 _              _                    _                 _                _           _                    _                     %5F
`         %60        %60               `              %60                  %60               %60              `           `                    `                     %60
{         %7b        %7b               {              %7B                  %7B               %7B              {           {                    {                     %7B
|         %7c        %7c               |              %7C                  %7C               %7C              |           |                    |                     %7C
}         %7d        %7d               }              %7D                  %7D               %7D              }           }                    }                     %7D
~         %7e        %7e               ~              %7E                  ~                 ~                ~           ~                    ~                     %7E

Ā         %c4%80     %u0100            %c4%80         %C4%80               %C4%80            %C4%80           Ā           Ā                    Ā                     [OoR]
ā         %c4%81     %u0101            %c4%81         %C4%81               %C4%81            %C4%81           ā           ā                    ā                     [OoR]
Ē         %c4%92     %u0112            %c4%92         %C4%92               %C4%92            %C4%92           Ē           Ē                    Ē                     [OoR]
ē         %c4%93     %u0113            %c4%93         %C4%93               %C4%93            %C4%93           ē           ē                    ē                     [OoR]
Ī         %c4%aa     %u012a            %c4%aa         %C4%AA               %C4%AA            %C4%AA           Ī           Ī                    Ī                     [OoR]
ī         %c4%ab     %u012b            %c4%ab         %C4%AB               %C4%AB            %C4%AB           ī           ī                    ī                     [OoR]
Ō         %c5%8c     %u014c            %c5%8c         %C5%8C               %C5%8C            %C5%8C           Ō           Ō                    Ō                     [OoR]
ō         %c5%8d     %u014d            %c5%8d         %C5%8D               %C5%8D            %C5%8D           ō           ō                    ō                     [OoR]
Ū         %c5%aa     %u016a            %c5%aa         %C5%AA               %C5%AA            %C5%AA           Ū           Ū                    Ū                     [OoR]
ū         %c5%ab     %u016b            %c5%ab         %C5%AB               %C5%AB            %C5%AB           ū           ū                    ū                     [OoR]

The columns represent encodings as follows:

  • UrlEncoded: HttpUtility.UrlEncode
  • UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode
  • UrlPathEncoded: HttpUtility.UrlPathEncode
  • WebUtilityUrlEncoded: WebUtility.UrlEncode
  • EscapedDataString: Uri.EscapeDataString
  • EscapedUriString: Uri.EscapeUriString
  • HtmlEncoded: HttpUtility.HtmlEncode
  • HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode
  • WebUtilityHtmlEncoded: WebUtility.HtmlEncode
  • HexEscaped: Uri.HexEscape

NOTES:

  1. HexEscape can only handle the first 255 characters. Therefore it throws an ArgumentOutOfRange exception for the Latin A-Extended characters (eg Ā).

  2. This table was generated in .NET 4.5 (see answer https://mcmap.net/q/13942/-url-encoding-using-c for the encodings relevant to .NET 4.0 and below).

EDIT:

  1. As a result of Discord's answer I added the new WebUtility UrlEncode and HtmlEncode methods, which were introduced in .NET 4.5.
Essay answered 14/2, 2014 at 5:0 Comment(6)
No not user UrlPathEncode - even the MSDN says it is not to be used. It was build to fix an issue with netscape 2 msdn.microsoft.com/en-us/library/…Shadbush
Is Server.URLEncode yet another variation on this theme? Does it generate any different output?High
@ALEX: In ASP.NET the Server object is an instance of HttpServerUtility. Using the dotPeek decompiler I had a look at HttpServerUtility.UrlEncode. It just calls HttpUtility.UrlEncode so the output of the two methods would be identical.Essay
It seems like, even with this overabundance of encoding methods, they all still fail pretty spectacularly for anything above Latin-1, such as → or ☠. (UrlEncodedUnicode seems like it at least tries to support Unicode, but is deprecated/missing.)Auk
Simon, can you just integrate this answer in the accepted answer? it will be nice to have it in one answer. you could integrate it and make a h1 heading in the bottom of that answer, or integrate in one table, and marked different lines, like: (Net4.0) ? %3f................................ (Net4.5) ? %3f ..................................Handknit
Sad, url i need '->', = -> %3D, [space] -> %20. but i dont want move to 4.0Armstrong
E
75

Url Encoding is easy in .NET. Use:

System.Web.HttpUtility.UrlEncode(string url)

If that'll be decoded to get the folder name, you'll still need to exclude characters that can't be used in folder names (*, ?, /, etc.)

Ectogenous answered 22/2, 2009 at 18:57 Comment(5)
Does it encode every character thats not part of the alphabet?Carree
URL encoding converts characters that are not allowed in a URL into character-entity equivalents. List of unsafe characters: blooberry.com/indexdot/html/topics/urlencoding.htmDeflected
MSDN Link on HttpUtility.UrlEncode: msdn.microsoft.com/en-us/library/4fkewx0t.aspxDeflected
It is good practice to put the full System.Web... part in your answer, it saves a lot of people a little time :) thanksTrammell
This is dangerous: not all character of the url have to be encoded, only the values of parameters of querystring. The way you suggest will encode also the & that is needed to create multiple parameter in the querystring. The soution is to encode each value of parameters if neededSuspensor
N
12

If you can't see System.Web, change your project settings. The target framework should be ".NET Framework 4" instead of ".NET Framework 4 Client Profile"

Nicias answered 20/3, 2011 at 5:20 Comment(4)
In my opinion developers should know about ".NET Profiles" and they should use the correct one for their purposes! Just adding the full profile in order to get (e.g System.Web) without really knowing why they add the full profile, isn't very smart. Use "Client Profile" for your client apps and the full profile only when needed (e.g. a WinForms or WPF client should use client profile and not full profile)! e.g. I don't see a reason using the HttpServerUtility in a client app ^^ ... if this is needed then there is something wrong with the design of the app!Fagaly
Really? Do don't ever see a need for a client app to construct a URL? What do you do for a living - janitorial duties?Distraction
@hfrmobile: no. It's all wrong with the profile model (which lived just once and was abandoned in next version). And it was obvious from the beginning. Is it obvious for you now? Think first, don't accept everything 'as is' what msft tries to sell you ;PLandwehr
Sorry, but I never said that a client never has to build/use an URL. As long as .NET 4.0 is in use, user should care about it. To put it short: Developers should think twice before adding HttpServerUtility to a client. There are other/better ways, just see the answer with 139 votes or "Since .NET Framework 4.5 you can use WebUtility.UrlEncode. First, it resides in System.dll, so it does not require any additional references.".Fagaly
S
10

The .NET implementation of UrlEncode does not comply with RFC 3986.

  1. Some characters are not encoded but should be. The !()* characters are listed in the RFC's section 2.2 as a reserved characters that must be encoded yet .NET fails to encode these characters.

  2. Some characters are encoded but should not be. The .-_ characters are not listed in the RFC's section 2.2 as a reserved character that should not be encoded yet .NET erroneously encodes these characters.

  3. The RFC specifies that to be consistent, implementations should use upper-case HEXDIG, where .NET produces lower-case HEXDIG.

Sexagesimal answered 21/5, 2013 at 22:21 Comment(0)
L
6

I think people here got sidetracked by the UrlEncode message. URLEncoding is not what you want -- you want to encode stuff that won't work as a filename on the target system.

Assuming that you want some generality -- feel free to find the illegal characters on several systems (MacOS, Windows, Linux and Unix), union them to form a set of characters to escape.

As for the escape, a HexEscape should be fine (Replacing the characters with %XX). Convert each character to UTF-8 bytes and encode everything >128 if you want to support systems that don't do unicode. But there are other ways, such as using back slashes "\" or HTML encoding """. You can create your own. All any system has to do is 'encode' the uncompatible character away. The above systems allow you to recreate the original name -- but something like replacing the bad chars with spaces works also.

On the same tangent as above, the only one to use is

Uri.EscapeDataString

-- It encodes everything that is needed for OAuth, it doesn't encode the things that OAuth forbids encoding, and encodes the space as %20 and not + (Also in the OATH Spec) See: RFC 3986. AFAIK, this is the latest URI spec.

Laundryman answered 29/1, 2019 at 21:58 Comment(0)
B
5

I have written a C# method that url-encodes ALL symbols:

    /// <summary>
    /// !#$345Hf} → %21%23%24%33%34%35%48%66%7D
    /// </summary>
    public static string UrlEncodeExtended( string value )
    {
        char[] chars = value.ToCharArray();
        StringBuilder encodedValue = new StringBuilder();
        foreach (char c in chars)
        {
            encodedValue.Append( "%" + ( (int)c ).ToString( "X2" ) );
        }
        return encodedValue.ToString();
    }
Bus answered 2/11, 2017 at 12:25 Comment(0)
C
1

Ideally these would go in a class called "FileNaming" or maybe just rename Encode to "FileNameEncode". Note: these are not designed to handle Full Paths, just the folder and/or file names. Ideally you would Split("/") your full path first and then check the pieces. And obviously instead of a union, you could just add the "%" character to the list of chars not allowed in Windows, but I think it's more helpful/readable/factual this way. Decode() is exactly the same but switches the Replace(Uri.HexEscape(s[0]), s) "escaped" with the character.

public static List<string> urlEncodedCharacters = new List<string>
{
  "/", "\\", "<", ">", ":", "\"", "|", "?", "%" //and others, but not *
};
//Since this is a superset of urlEncodedCharacters, we won't be able to only use UrlEncode() - instead we'll use HexEncode
public static List<string> specialCharactersNotAllowedInWindows = new List<string>
{
  "/", "\\", "<", ">", ":", "\"", "|", "?", "*" //windows dissallowed character set
};

    public static string Encode(string fileName)
    {
        //CheckForFullPath(fileName); // optional: make sure it's not a path?
        List<string> charactersToChange = new List<string>(specialCharactersNotAllowedInWindows);
        charactersToChange.AddRange(urlEncodedCharacters.
            Where(x => !urlEncodedCharacters.Union(specialCharactersNotAllowedInWindows).Contains(x)));   // add any non duplicates (%)

        charactersToChange.ForEach(s => fileName = fileName.Replace(s, Uri.HexEscape(s[0])));   // "?" => "%3f"

        return fileName;
    }

Thanks @simon-tewsi for the very usefull table above!

Calculable answered 8/2, 2013 at 22:5 Comment(2)
also usefull: Path.GetInvalidFileNameChars()Calculable
yes. Here's one way of doing it: foreach (char c in System.IO.Path.GetInvalidFileNameChars()) { filename = filename.Replace(c, '_'); }Disintegration
C
0

In addition to @Dan Herbert's answer , You we should encode just the values generally.

Split has params parameter Split('&','='); expression firstly split by & then '=' so odd elements are all values to be encoded shown below.

public static void EncodeQueryString(ref string queryString)
{
    var array=queryString.Split('&','=');
    for (int i = 0; i < array.Length; i++) {
        string part=array[i];
        if(i%2==1)
        {               
            part=System.Web.HttpUtility.UrlEncode(array[i]);
            queryString=queryString.Replace(array[i],part);
        }
    }
}
Calorie answered 1/3, 2013 at 8:7 Comment(0)
T
0

For .net core users, use this

Microsoft.AspNetCore.Http.Extensions.UriHelper.Encode(Uri uri)
Tungusic answered 12/12, 2022 at 7:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.