How do I encode URI parameter values?

7

57

I want to send a URI as the value of a query/matrix parameter. Before I can append it to an existing URI, I need to encode it according to RFC 2396. For example, given the input:

http://google.com/resource?key=value1 & value2

I expect the output:

http%3a%2f%2fgoogle.com%2fresource%3fkey%3dvalue1%2520%26%2520value2

Neither java.net.URLEncoder nor java.net.URI will generate the right output. URLEncoder is meant for HTML form encoding, which is not the same as RFC 2396. URI has no mechanism for encoding a single value at a time, so it has no way of knowing that value1 and value2 are part of the same key.
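
To illustrate the mismatch, here is a minimal sketch of what URLEncoder does with the value part of the example. The class and method names (FormEncodingDemo, formEncode) are made up for illustration; only URLEncoder.encode comes from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormEncodingDemo {
    // Hypothetical helper: form-encodes a value (application/x-www-form-urlencoded).
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this should not happen in practice.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Spaces come back as '+' rather than "%20".
        System.out.println(formEncode("value1 & value2")); // prints value1+%26+value2
    }
}
```

The '&' is escaped as expected, but the spaces become '+', which is the form-encoding convention rather than URI percent-encoding.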

Seafarer answered 14/1, 2009 at 18:21 Comment(10)
I am not sure I understand what result you expect. I would use URLEncoder.Noon
According to the Javadoc for URL: "The URLEncoder and URLDecoder classes can also be used, but only for HTML form encoding, which is not the same as the encoding scheme defined in RFC2396."Seafarer
@Peter: Agreed, but the latter is dead. There is at least one up-voted answer so it won't show up in the list of unanswered questions even though the answer is technically wrong. If you want to help please go vote it down to zero.Seafarer
I wonder what the result would be for the example you give in your question.Noon
Sorry, I removed my comment about this being a duplicate of #305306Noon
@Peter: I added a sample input and output per your request.Seafarer
Bugger, Stackoverflow marks a question as answered even if all answers have a score of zero! Please consider voting for stackoverflow.uservoice.com/pages/general/suggestions/… to fix this.Seafarer
related #724543Shantell
Is this like this question: How to encode URL parameters? , except for Java (that one is for JavaScript) ? If yes, java.net.URLEncoder is the (or "a") correct answer.Benares
@DavidBalažic Wrong, I explicitly mention why URLEncoder won't work in the above question.Seafarer
31

Jersey's UriBuilder encodes URI components using application/x-www-form-urlencoded and RFC 3986 as needed. According to the Javadoc:

Builder methods perform contextual encoding of characters not permitted in the corresponding URI component following the rules of the application/x-www-form-urlencoded media type for query parameters and RFC 3986 for all other components. Note that only characters not permitted in a particular component are subject to encoding so, e.g., a path supplied to one of the path methods may contain matrix parameters or multiple path segments since the separators are legal characters and will not be encoded. Percent encoded values are also recognized where allowed and will not be double encoded.

Seafarer answered 14/1, 2009 at 18:21 Comment(8)
URL is unreachable. But download.oracle.com/javaee/6/api/javax/ws/rs/core/… is available alreadyBarony
@sergdev, fixed the link. Thanks for the heads-up!Seafarer
How exactly did you produce the expected output mentioned above using the UriBuilder? I have no clue how to tell it to encode the part before the "?". Thanks!Ballance
If you are not using JAX-RS and are using Spring, you could use Spring's UriUtilsType
@tbh, The point I was trying to make is that if you use UriBuilder it'll encode what needs to be encoded for you. If you want to manually encode sections of text it turns out that #305306 will work. Alternatively, you can use UriBuilder.fromPath("host/").matrixParam("key", "google.com/resource?key=value1 & value2").build() and you'll get back "host/…".Seafarer
Beware of using UriBuilder for "free" URLs that do not use JAX-RS's parameter scheme with curly braces. Even if it might be an exotic value, try this: UriBuilder.fromPath("http://www.query.example/").queryParam("key", "{val}").build(); will fail. "http://www.query.example?key=" + URLEncoder.encode("{val}", "UTF-8") will work.Kala
@ujay68: UriBuilder treats curly brace as template delimiters. Have you tried UriBuilder.fromPath("http://www.query.example/").queryParam("key", "{val}").build("{val}")?Pesach
FYI. If you have a full URL string already, and just care about special characters being replaced appropriately, you can simply build it like so: UriBuilder.fromPath(urlString).build(); That will encode all the special characters to the right of the first "/" at the end of the hostname of the URL.Caseose
20

You could also use Spring's UriUtils

Type answered 14/1, 2009 at 18:21 Comment(0)
9

I don't have enough reputation to comment on answers, but I just wanted to note that downloading the JSR-311 API by itself will not work. You need to download the reference implementation (Jersey).

Downloading only the API from the JSR page will give you a ClassNotFoundException when the API tries to look for an implementation at runtime.

Inmesh answered 14/1, 2009 at 18:21 Comment(1)
Specifically, you need the jsr311-api, jersey-server, and jersey-core jars.Inmesh
3

I wrote my own, it's short, super simple, and you can copy it if you like: http://www.dmurph.com/2011/01/java-uri-encoder/

Entitle answered 14/1, 2009 at 18:21 Comment(6)
I guess people are hesitant to like this solution because maybe they fear it will have bugs etc. It looks pretty comprehensive, and has had a couple of bugs addressed already, so I think I will try it out. Java and Objective C both don't have built in routines to do this kind of encoding, which is just ... bafflingExcaudate
Hmm, this is for the URI itself and not for the parameters?Weathercock
Spaces should be escaped :) It will only include numbers, letters, and the 'mark' characters as unescapedEntitle
@HerrGrumps What? Obj-C has def had built-in URL encoding for a long time.Virtuous
@Virtuous oh? ok, maybe I was wrong on Obj-C and/or Java not having it - sorry :) For reference maybe you can post a reply with some class/method names in the event that it might help someone out who might not know where to look (like me ;) )Excaudate
@HerrGrumps Not sure about pure Java, I ended up using Android specific classes. On iOS you can use the URLComponents class to break a URL down into its constituent parts and the queryItems property to set parameters. But that's outside the scope of this answer and question. ;)Virtuous
2

It seems that CharEscapers from Google GData-java-client has what you want. It has a uriPathEscaper method, a uriQueryStringEscaper method, and a generic uriEscaper method. (All return an Escaper object, which does the actual escaping.) It is under the Apache License.

Vassaux answered 14/1, 2009 at 18:21 Comment(9)
Unfortunately it uses some other classes and interfaces, but I think you'll be able to modify it to suit your needs.Noon
"There has to be an easier way than this". It amazes me that this common use-case (building URIs) isn't easier to do. java.net.URI should do a better job.Seafarer
I was surprised at how messy this area is (honestly, I didn't even know there is such a thing as special URI encoding ... I learned something).Noon
Peter, I just realized this might work: jsr311.dev.java.net/nonav/releases/1.0/javax/ws/rs/core/… I am using JAX-RS anyway for my application. I'll try it and report back.Seafarer
Peter, please add the following and I will mark it as the accepted answer: " javax.ws.rs.core.UriBuilder will do what you want: jsr311.dev.java.net/nonav/releases/1.0/javax/ws/rs/core/… "Seafarer
Feel free to answer your question yourself :-) You found the solution that works best for you. People who are searching for the same problem can then choose whatever works for them.Noon
The reason I want you to post the answer is that I can't "accept" my own answer. Please post a new answer and I'll "accept it".Seafarer
OK. I didn't know about accepting your own answer; I thought it was possible.Noon
+1 for CharEscapers. I ran into this issue today and this fixed it: // false will force spaces to encode as %20 CharEscapers.uriEscaper(false).escape(value);Amoeboid
-2

I think that the URI class is the one that you are looking for.

Immethodical answered 14/1, 2009 at 18:21 Comment(6)
It doesn't help because it expects me to pass in a full query string. It has no way of knowing which part of the string needs to be encoded and which part does not. I need a method that takes in a raw parameter value and passes out the URL encoded form.Seafarer
Yes. Stackoverflow marks questions as answered if they have been up-voted once. This answer has been up-voted by one person and I down-voted it. It still marks my questions as answered last time I checked.Seafarer
My bad. You're right. I assumed "upvoted answer" meant ... you know, an answer with a positive number to its left ...Linotype
Same here, please vote for stackoverflow.uservoice.com/pages/general/suggestions/…Seafarer
Why don't you update your question to indicate that the URI class will not work for you? Otherwise, someone else will enter in the same advice even if I delete my answer. BTW, why can't the full URI be encoded?Immethodical
I will update the question. The reason the full URI can't be encoded is if I pass in "key=value with spaces&still part of value" the constructor has no way of knowing where the value begins and ends.Seafarer
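
A minimal sketch of the limitation described in these comments, assuming the five-argument java.net.URI constructor (scheme, authority, path, query, fragment). The class and method names (UriQueryDemo, build) and the host example.com are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriQueryDemo {
    // Hypothetical helper: builds a URI from a raw, unescaped query string.
    static String build(String rawQuery) {
        try {
            return new URI("http", "example.com", "/resource", rawQuery, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Spaces are quoted as %20, but '&' is a legal query character and is
        // left alone, so the value is silently split into two parameters.
        System.out.println(build("key=value with spaces&still part of value"));
        // prints http://example.com/resource?key=value%20with%20spaces&still%20part%20of%20value
    }
}
```

The constructor only quotes characters that are illegal in a query, so it cannot tell that the '&' was meant to be part of the value.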
-2

Mmhh, I know you've already discarded URLEncoder, but despite what the docs say, I decided to give it a try.

You said:

For example, given an input:

http://google.com/resource?key=value

I expect the output:

http%3a%2f%2fgoogle.com%2fresource%3fkey%3dvalue

So:

C:\oreyes\samples\java\URL>type URLEncodeSample.java
import java.net.*;

public class URLEncodeSample {
    public static void main( String [] args ) throws Throwable {
        System.out.println( URLEncoder.encode( args[0], "UTF-8" ));
    }
}

C:\oreyes\samples\java\URL>javac URLEncodeSample.java

C:\oreyes\samples\java\URL>java URLEncodeSample "http://google.com/resource?key=value"
http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue

As expected.

What would be the problem with this?

Oily answered 14/1, 2009 at 19:44 Comment(5)
It is similar to RFC2396 but is not the same. For example, try encoding spaces. URLEncoder will encode it as '+', URIs expect %20 instead. There are other differences.Seafarer
OK, but you wouldn't be encoding "value with space" but "value+with+space", like this:Oily
like:java URLEncodeSample "google.com/resource?key=value+with+spaces" http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue%2Bwith%2BspacesOily
There are a whole slew of rules that one must follow for RFC 2396. Instead of playing games with the input string I'd rather find a class that encodes things properly.Seafarer
What do you intend to do? That would be very helpful to know in order to give you the right answer.Oily
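
For reference, a minimal sketch of encoding a single parameter value by hand, so that spaces become %20. It assumes the RFC 3986 "unreserved" set (letters, digits, '-', '.', '_', '~'), which is stricter than the RFC 2396 set, and escapes every other UTF-8 byte. The class and method names (PercentEncoder, encode) are made up for illustration and are not taken from any of the libraries mentioned in this thread.

```java
import java.nio.charset.StandardCharsets;

public class PercentEncoder {
    // Assumption: only RFC 3986 unreserved characters are passed through unescaped.
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Hypothetical helper: percent-encodes one raw parameter value.
    static String encode(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF; // treat the byte as unsigned
            if (c < 0x80 && UNRESERVED.indexOf(c) >= 0) {
                out.append((char) c);
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("http://google.com/resource?key=value1 & value2"));
        // prints http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2
    }
}
```

The hex digits come out in upper case; percent-escapes are case-insensitive, so %3a and %3A are equivalent.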
