URLDecoder is converting '+' into space

I have a hash key in one of my query parameters, and it can contain the + character along with other special characters. The problem is that when the URL is decoded, URLDecoder converts each + into a space. Is there a way to make URLDecoder leave '+' as it is instead of converting it into a space?

Curse answered 12/4, 2017 at 17:22 Comment(2)
Better than forcing this non-standard behaviour on the receiving side would be to fix the sending side to encode + characters in parameters correctly as %2B. - Semiconscious
To be safe, you should encode + into %2B instead... - Parvis
P
8

According to HTML URL Encoding Reference:

URLs cannot contain spaces. URL encoding normally replaces a space with a plus (+) sign or with %20.

and the + sign itself must be encoded as %2B. So if you want to pass your hash as a GET parameter in a URL, replace the plus signs in your hash with %2B. Do not replace every + in the entire URL, because you might corrupt other string parameters in which + is supposed to stand for a space.
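A minimal sketch of that sending-side fix, assuming the hash is available as a plain string before the URL is built. The class name, the `hash` parameter name, the example host path and the sample value are made up for illustration. `URLEncoder.encode` produces form encoding, so it writes `+` as `%2B`, and `URLDecoder.decode` then restores the original value:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class HashParamExample {
    // Encode only the parameter value, never the whole URL.
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Decode a single form-encoded parameter value.
    static String decodeValue(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String hash = "ab+cd/ef==";                 // hypothetical hash value
        String encoded = encodeValue(hash);         // ab%2Bcd%2Fef%3D%3D
        System.out.println("http://www.example.com/verify?hash=" + encoded);
        System.out.println(decodeValue(encoded));   // prints ab+cd/ef==
    }
}
```

Because only the value is encoded, other parameters that use + for a space are left untouched.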

Pupil answered 12/4, 2017 at 17:30 Comment(0)
H
13

Do this on your string before decoding:

String plusEncoded = yourString.replace("+", "%2B");

The decoder will then show + where it should've been
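The workaround can be wrapped in a small helper so the replacement is not repeated at every call site. This is only a sketch: the class and method names are hypothetical, and it assumes that no + in the input was meant to represent a space, since every + survives decoding as a literal plus sign.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PlusPreservingDecoder {
    // Protect each literal '+' as %2B so URLDecoder does not turn it into a space.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // %2F and %20 are still decoded normally; '+' is kept.
        System.out.println(decode("ab+cd%2Fef%20gh")); // prints ab+cd/ef gh
    }
}
```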

Halakah answered 12/4, 2017 at 17:24 Comment(1)
Is there a utility method to handle this, instead of manually replacing '+' with its percent-encoded form before decoding? It works, but the code does not look good. - Lindon
K
0

There is a bug report similar to this issue, and it was closed as "not an issue". Here I quote what the assignee wrote:

The Java API documentation at https://docs.oracle.com/javase/8/docs/api/java/net/URL.html clearly states that "The URLEncoder and URLDecoder classes can also be used, but only for HTML form encoding, which is not the same as the encoding scheme defined in RFC2396." This means that it is not meant for URL encoding and will cause issues with spaces and plus signs in the path. Using URL or URI classes to construct the url will give expected results.

URL url = new URL(input);
System.out.println(url.toString()); //outputs http://www.example.com/some+thing

Reference: https://bugs.openjdk.java.net/browse/JDK-8179507
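As a rough illustration of the same idea applied to a query string, `java.net.URI` decodes only percent-escapes, so a + in the query is returned unchanged. The class name, method name and sample URL below are assumptions for the example, not part of the bug report:

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriQueryExample {
    // Returns the query with %xx escapes decoded; '+' is left as it is.
    static String decodedQuery(String url) {
        try {
            return new URI(url).getQuery();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    public static void main(String[] args) {
        String url = "http://www.example.com/verify?hash=ab+cd%2Fef";
        System.out.println(decodedQuery(url)); // prints hash=ab+cd/ef
    }
}
```

Note that `getQuery()` returns the whole query string, so splitting it into individual parameters is still up to the caller.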

Kristinakristine answered 19/2, 2021 at 15:29 Comment(0)
O
0

Java's URLDecoder is meant to be used only for "application/x-www-form-urlencoded".

Reference: https://bugs.openjdk.org/browse/JDK-8179507

The Java API documentation at https://docs.oracle.com/javase/8/docs/api/java/net/URL.html clearly states that "The URLEncoder and URLDecoder classes can also be used, but only for HTML form encoding, which is not the same as the encoding scheme defined in RFC2396." This means that it is not meant for URL encoding and will cause issues with spaces and plus signs in the path.

Solution

Instead of using Java's URLDecoder, use a percent-encoding library. One option is org.apache.commons.codec.net.PercentCodec from Apache Commons Codec.

Sample code

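PercentCodec percentCodec = new PercentCodec();
// decode(byte[]) returns the decoded bytes and may throw DecoderException
byte[] decodedBytes = percentCodec.decode("<encoded text>".getBytes(StandardCharsets.US_ASCII));
String decodedCert = new String(decodedBytes, StandardCharsets.UTF_8);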
Olivarez answered 1/3 at 13:11 Comment(0)
