The W3C specification states the following for forms with enctype=application/x-www-form-urlencoded:
This is the default content type. Forms submitted with this content type must be encoded as follows:
1) Control names and values are escaped. Space characters are replaced by '+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by '%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., '%0D%0A').
2) The control names/values are listed in the order they appear in the document. The name is separated from the value by '=' and name/value pairs are separated from each other by '&'.
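To make the two quoted rules concrete, here is a minimal Python sketch. The helper name `encode_form` is hypothetical, and the line-break step assumes that only CR, LF, and CR LF count as "line breaks", which is exactly the point in question below. `urllib.parse.quote_plus` is a standard-library function that replaces spaces with '+' and percent-escapes the remaining reserved characters.

```python
import re
from urllib.parse import quote_plus


def encode_form(pairs):
    """Encode (name, value) pairs roughly as the quoted rules describe.

    Hypothetical helper for illustration only, not a browser's algorithm.
    """
    def escape(text):
        # Assumption: only CR, LF, and CR LF are treated as line breaks
        # and rewritten to CR LF. Other Unicode terminators are untouched.
        text = re.sub(r"\r\n|\r|\n", "\r\n", text)
        # Spaces become '+'; reserved characters become %HH escapes.
        return quote_plus(text)

    # Rule 2: keep document order, join name and value with '=',
    # and join the pairs with '&'.
    return "&".join(f"{escape(name)}={escape(value)}" for name, value in pairs)


print(encode_form([("name", "John Doe"), ("note", "line one\nline two")]))
# prints: name=John+Doe&note=line+one%0D%0Aline+two
```

The bare LF in the second value comes out as %0D%0A only because the sketch normalizes it first; `quote_plus` on its own would emit %0A.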
Unicode defines several kinds of line terminators, namely:
LF: Line Feed, U+000A
VT: Vertical Tab, U+000B
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029
Are all of these converted to CR LF (\r\n)?
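For reference when inspecting a real submission, the following sketch shows what each terminator looks like when it is percent-escaped without any normalization. It assumes UTF-8 as the form's character encoding (the quoted text only speaks of ASCII codes), and the names `TERMINATORS` and `raw_escapes` are made up for this example. It shows only how Python's `quote_plus` escapes these characters, not what any browser does.

```python
from urllib.parse import quote_plus

# The line terminators listed above, keyed by their short names.
TERMINATORS = {
    "LF": "\n",
    "VT": "\x0b",
    "FF": "\x0c",
    "CR": "\r",
    "CR+LF": "\r\n",
    "NEL": "\x85",
    "LS": "\u2028",
    "PS": "\u2029",
}


def raw_escapes():
    """Percent-escape each terminator as UTF-8, with no CR LF conversion."""
    return {name: quote_plus(ch) for name, ch in TERMINATORS.items()}


for name, escaped in raw_escapes().items():
    print(f"{name:6} {escaped}")
```

If a terminator reaches the server as %0D%0A instead of the escape printed here, it was converted somewhere along the way; if the escape matches, it was passed through unchanged.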