I have a URL. How do I retrieve its path part?
For example: Given "http://www.costo.com/test1/test2", how do I get "test1/test2"?
You want something like this:
String path = new URL("http://www.costo.com/test1/test2").getPath();
Actually that'll give you /test1/test2. You'll just have to remove the first / to get what you want:
path = path.replaceFirst("/", "");
Now you'll have test1/test2 in path.
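A minimal sketch of the same idea wrapped in a reusable helper. The class and method names (UrlPathExample, pathWithoutLeadingSlash) are made up for illustration. It uses substring instead of replaceFirst so that no regular expression is involved, and it returns null for a malformed URL, which is only one possible way to handle that error:

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlPathExample {

    // Hypothetical helper: returns the path of an absolute URL without its
    // leading slash, or null if the string cannot be parsed as a URL.
    static String pathWithoutLeadingSlash(String spec) {
        try {
            String path = new URL(spec).getPath(); // e.g. "/test1/test2"
            return path.startsWith("/") ? path.substring(1) : path;
        } catch (MalformedURLException e) {
            return null; // assumption: callers treat null as "not a valid URL"
        }
    }

    public static void main(String[] args) {
        // prints test1/test2
        System.out.println(pathWithoutLeadingSlash("http://www.costo.com/test1/test2"));
    }
}
```

Note that getPath() already leaves out any query string or fragment, so only the leading slash needs handling here.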
I had performance doubts about using the Java URL class just to extract the path from a URL and thought it might be overkill.
Therefore I wrote three methods, each of which uses a different way to extract the path from a given URL.
Each method is invoked 1,000,000 times for the same URL.
The result is:
#1 (getPathviaURL) took: 860ms
#2 (getPathViaRegex) took: 3763ms
#3 (getPathViaSplit) took: 1365ms
Code - feel free to optimize it:
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang3.StringUtils; // assumed: Apache Commons Lang 3

public class PathBenchmark {

    public static void main(String[] args) {
        String host = "https://mcmap.net/q/459574/-how-to-get-the-path-of-a-url";

        // #1: java.net.URL
        long start1 = System.currentTimeMillis();
        int i = 0;
        while (i < 1000000) {
            getPathviaURL(host);
            i++;
        }
        long end1 = System.currentTimeMillis();
        System.out.println("#1 (getPathviaURL) took: " + (end1 - start1) + "ms");

        // #2: regular expression, compiled once outside the loop
        Pattern p = Pattern.compile("(?:([^:\\/?#]+):)?(?:\\/\\/([^\\/?#]*))?([^?#]*)(?:\\?([^#]*))?(?:#(.*))?");
        long start2 = System.currentTimeMillis();
        int i2 = 0;
        while (i2 < 1000000) {
            getPathViaRegex(host, p);
            i2++;
        }
        long end2 = System.currentTimeMillis();
        System.out.println("#2 (getPathViaRegex) took: " + (end2 - start2) + "ms");

        // #3: String.split
        long start3 = System.currentTimeMillis();
        int i3 = 0;
        while (i3 < 1000000) {
            getPathViaSplit(host);
            i3++;
        }
        long end3 = System.currentTimeMillis();
        System.out.println("#3 (getPathViaSplit) took: " + (end3 - start3) + "ms");
    }

    public static String getPathviaURL(String url) {
        String path = null;
        try {
            path = new URL(url).getPath();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        }
        return path;
    }

    public static String getPathViaRegex(String url, Pattern p) {
        Matcher m = p.matcher(url);
        if (m.find()) {
            return m.group(3); // group 3 is the path component
        }
        return null;
    }

    public static String getPathViaSplit(String url) {
        // parts[0] = scheme, parts[1] = "", parts[2] = host; the rest is the path
        String[] parts = url.split("/");
        parts = Arrays.copyOfRange(parts, 3, parts.length);
        String joined = "/" + StringUtils.join(parts, "/");
        return joined;
    }
}
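The loops above measure with System.currentTimeMillis() and no warm-up pass. A rough sketch of the same kind of measurement using System.nanoTime() and a discarded warm-up run might look like the following. The names NanoTimingSketch and timeNanos are hypothetical, and this is still only a crude timing loop rather than a proper benchmark harness such as JMH:

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.function.Function;

class NanoTimingSketch {

    // Applies fn to input the given number of times and returns the elapsed nanoseconds.
    static long timeNanos(Function<String, String> fn, String input, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            fn.apply(input);
        }
        return System.nanoTime() - start;
    }

    // Same logic as the URL-based method above, returning null on a malformed URL.
    static String getPathViaURL(String url) {
        try {
            return new URL(url).getPath();
        } catch (MalformedURLException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String url = "https://mcmap.net/q/459574/-how-to-get-the-path-of-a-url";
        // Warm-up run so the JIT has a chance to compile the hot path; result is discarded.
        timeNanos(NanoTimingSketch::getPathViaURL, url, 100_000);
        long nanos = timeNanos(NanoTimingSketch::getPathViaURL, url, 1_000_000);
        System.out.println("getPathViaURL took: " + (nanos / 1_000_000) + "ms");
    }
}
```

The other two methods could be passed to timeNanos in the same way, for example as a lambda that closes over the precompiled Pattern.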
Don't use System.currentTimeMillis() for micro benchmarking. Use System.nanoTime(), which is more accurate. And keep in mind this is not a real benchmark; I strongly recommend using a benchmarking tool for that: openjdk.java.net/projects/code-tools/jmh – Natividadnativism
Regarding the .split method: keep in mind there might be a query or anchor or both attached as well (e.g. .../path?q=somevalue or .../path#someAnchor) – Divorce
URL url = new URL("http://www.google.com/in/on");
System.out.println(url.getPath());
You can do this:
URL url = new URL("http://www.costo.com/test1/test2");
System.out.println(url.getPath());
If you want to get it from a URL of your own application, something like http://localhost:8080/test1/test2/main.jsp, you can use
request.getRequestURI() // result will be like /test1/test2/main.jsp
I recommend using the URI class because it can also handle relative paths. Here is sample code that achieves the same with URI and URL:
String urlStr = "http://localhost:8080/collections-in-java?error=true";
try {
    URI uri = URI.create(urlStr);
    System.out.println(uri.getPath());
    URL url1 = new URL(urlStr);
    System.out.println(url1.getPath());
} catch (MalformedURLException e) {
    e.printStackTrace();
}
The above code prints the same result twice. URI is useful if there is a chance that the path may be relative, e.g. /some/path/collections-in-java?error=true
For this case, URI.getPath() will return /some/path/collections-in-java, but the URL constructor will throw a MalformedURLException because the string has no protocol.
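The relative case can be checked with a short sketch like the one below. The class and method names (RelativePathExample, pathViaUri, urlRejects) are invented for this example:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class RelativePathExample {

    // Parses the string as a URI (absolute or relative) and returns its path component.
    static String pathViaUri(String s) {
        return URI.create(s).getPath();
    }

    // Returns true if the URL constructor rejects the string.
    static boolean urlRejects(String s) {
        try {
            new URL(s);
            return false;
        } catch (MalformedURLException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String relative = "/some/path/collections-in-java?error=true";
        System.out.println(pathViaUri(relative)); // prints /some/path/collections-in-java
        System.out.println(urlRejects(relative)); // prints true
    }
}
```

Keep in mind that URI.create throws an unchecked IllegalArgumentException for strings that are not valid URIs, for example ones containing spaces.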
Maybe late, but if you don't like the URL or URI methods, here is a simple one:
String url = "https://www.google.com:443//hellowrodl://here/?ff=333#2222";
url=url.split("://",2)[1];
System.out.println(url.replace(url.split("/")[0],""));
The output will be:
//hellowrodl://here/?ff=333#2222
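As the output shows, this keeps the query string and fragment attached to the path. If only the path is wanted, a plain string-based sketch could cut at the first ? or # as well. The class and method names (ManualPathExample, pathOf) are hypothetical, and the code assumes the input is either an absolute URL of the form scheme://authority/path or a string that starts directly with the path:

```java
class ManualPathExample {

    // Returns the path component using only String operations:
    // everything from the first '/' after the authority up to the first '?' or '#'.
    static String pathOf(String url) {
        int schemeEnd = url.indexOf("://");
        int authorityStart = schemeEnd < 0 ? 0 : schemeEnd + 3;

        // The path cannot extend past the start of the query or fragment.
        int end = url.length();
        for (char c : new char[] {'?', '#'}) {
            int idx = url.indexOf(c, authorityStart);
            if (idx >= 0 && idx < end) {
                end = idx;
            }
        }

        int pathStart = url.indexOf('/', authorityStart);
        if (pathStart < 0 || pathStart >= end) {
            return ""; // no path present
        }
        return url.substring(pathStart, end);
    }

    public static void main(String[] args) {
        // prints //hellowrodl://here/
        System.out.println(pathOf("https://www.google.com:443//hellowrodl://here/?ff=333#2222"));
    }
}
```

Unlike URL or URI, this does no validation at all, so it is best kept for inputs whose shape is already known.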
© 2022 - 2025 — McMap. All rights reserved.