How to determine the file extension of a file from a uri
L

9

39

Assuming I am given a URI and I want to find the file extension of the file that is returned, what do I have to do in Java?

For example the file at http://www.daml.org/2001/08/baseball/baseball-ont is http://www.daml.org/2001/08/baseball/baseball-ont.owl

When I do

    URI uri = new URI(address); 
    URL url = uri.toURL();
    String file = url.getFile();
    System.out.println(file);

I do not see the full file name with the .owl extension, just /2001/08/baseball/baseball-ont. How do I get the file extension as well?

Lammond answered 13/7, 2010 at 3:57 Comment(0)
P
80

First, be aware that it is impossible to tell from a URI alone what kind of file it links to: a link ending in .jpg might give you access to an .exe file (this is especially true for URLs, because of symbolic links and .htaccess files). Fetching the extension from the URI is therefore not a rock-solid way to restrict allowed file types, if that is what you are going for. So I assume you just want to know which extension a file has based on its URI, even though this isn't completely trustworthy.

You can get the extension from any URI, URL or file path using the method below. You don't have to use any libraries or extensions, since this is basic Java functionality. This solution gets the position of the last . (period) in the URI string and creates a substring that starts at that position and runs to the end of the URI string.

String uri = "http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/integrating_apps/images/google_logo.png";
String extension = uri.substring(uri.lastIndexOf("."));

The code sample above stores the .png extension from the URI in the extension variable. Note that the . (period) is included in the extension. If you want the file extension without the leading period, increase the substring index by one, like this:

String extension = uri.substring(uri.lastIndexOf(".") + 1);

One advantage of this method over regular expressions (an approach other people use a lot) is that it is much cheaper to execute while giving the same result.

Additionally, you might want to make sure the URI contains a period character. Use the following code to achieve this:

String uri = "http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/integrating_apps/images/google_logo.png";
if(uri.contains(".")) {
    String extension = uri.substring(uri.lastIndexOf("."));
}
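
Because the host name of a URL almost always contains a period, and a query string or fragment may follow the file name, it can help to look only at the last path segment. The following is a minimal sketch of that idea using java.net.URI; the class name UriExtensions and the method name extensionOf are made up for illustration, and the code assumes the address is a syntactically valid URI.

```java
import java.net.URI;

final class UriExtensions {

    /**
     * Returns the extension of the last path segment including the leading
     * period, or "" if that segment has no extension.
     * URI.create throws IllegalArgumentException for malformed input.
     */
    static String extensionOf(String address) {
        // getPath() leaves out the scheme, host, query and fragment
        String path = URI.create(address).getPath();
        if (path == null) {
            return ""; // opaque URIs such as mailto: have no path
        }
        // Keep only what follows the last slash
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        // Ignore names without a period, with only a leading period
        // (".htaccess") or with only a trailing period ("stuff.")
        if (dot <= 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot);
    }
}
```

With this sketch, a URL such as http://www.example.com/a/b/c/stuff.zip?u=1#top yields ".zip", while the question's http://www.daml.org/2001/08/baseball/baseball-ont yields an empty string even though its host name contains periods.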

You might want to improve the functionality even further to create a more robust system. Two examples might be:

  • Validate the URI by checking it exists, or by making sure the syntax of the URI is valid, possibly using a regular expression.
  • Trim the extension to remove unwanted white spaces.

I won't cover the solutions for these two features in here, because that isn't what was being asked in the first place.

Hope this helps!

Pleader answered 30/1, 2013 at 13:0 Comment(5)
This won't work if the url has a question mark after the filename or a hash. - Anti
That period character check you added doesn't do anything very useful, since there's a period before the top level domain anyway. - Dado
If uri doesn't contain period character can I assume that uri is NOT a uri of a file? - Indemnity
@AndroidDeveloper, no. Take the following url as an example. fluttercommunity.dev/_github/header/flutter_webview_plugin - Archlute
Uri is different than Url - Balderdash
C
22

This link might help for those who are still having problems: How I can get the mime type of a file having its Uri?

public static String getMimeType(Context context, Uri uri) {
    String extension;

    // Compare against the constant first so that a null scheme cannot throw
    if (ContentResolver.SCHEME_CONTENT.equals(uri.getScheme())) {
        // content:// URI: ask the ContentResolver for the MIME type and map it to an extension
        final MimeTypeMap mime = MimeTypeMap.getSingleton();
        extension = mime.getExtensionFromMimeType(context.getContentResolver().getType(uri));
    } else {
        // File URI: Uri.fromFile() replaces white space with %20 and escapes other special characters.
        // This avoids null results for file names with spaces and special characters.
        extension = MimeTypeMap.getFileExtensionFromUrl(Uri.fromFile(new File(uri.getPath())).toString());
    }

    return extension;
}
Containment answered 9/4, 2016 at 9:12 Comment(0)
T
19

There are two answers to this.

If a URI does not have a "file extension", then there is no way that you can infer one by looking at it textually, or by converting it to a File. In general, neither the URI nor the File needs to have an extension at all. Extensions are just a file naming convention.

What you are really after is the media type / MIME type / content type of the file. You may be able to determine the media type by doing something like this:

URLConnection conn = url.openConnection();
String type = conn.getContentType();

However the getContentType() method will return null if the server did not set a content type in the response. (Or it could give you the wrong content type, or a non-specific content type.) At that point, you would need to resort to content type "guessing", and I don't know if that would give you a specific enough type in this case.
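
As a rough sketch of the approach above, the content type can be requested with a HEAD request so that the body is not downloaded, and any parameters such as a charset can be stripped from the header value. The class name ContentTypes, both method names and the five-second timeouts are made up for illustration, and the cast assumes an http or https URL.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

final class ContentTypes {

    /** Strips parameters such as "; charset=UTF-8" and lower-cases the media type. */
    static String mediaType(String contentType) {
        if (contentType == null) {
            return null; // the server did not send a content type
        }
        int semicolon = contentType.indexOf(';');
        String type = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    /** Asks the server for the content type with a HEAD request (assumes http/https). */
    static String fetchMediaType(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("HEAD");
        conn.setConnectTimeout(5000); // arbitrary timeouts chosen for this sketch
        conn.setReadTimeout(5000);
        try {
            return mediaType(conn.getContentType());
        } finally {
            conn.disconnect();
        }
    }
}
```

A null result from fetchMediaType still means the server sent no content type, so the caller has to decide whether to fall back to guessing.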

But if you "know" that the file should be OWL, why don't you just give it a ".owl" extension anyway?

Topsoil answered 13/7, 2010 at 4:24 Comment(4)
This fails in case of no internet. - Dorman
Well, yeah. But if there is no internet, you can't fetch the file. So it hardly matters what its type is. (And you can wait until the internet comes back, or "guess based on the extension" if you really need to "know" now.) - Topsoil
Maybe you just need to know the extension without fetching the file? - Dorman
If the extension is not in the URI, AND you can't fetch the file or its metadata to find out what the file type really is, then there is no solution ... that doesn't involve time travel, clairvoyance, or some other mumbo jumbo. - Topsoil
B
7

The accepted answer does not work for URLs that contain '?' or '/' after the extension. To drop that extra string, you can use the getLastPathSegment() method. It gives you only the name from the URI, and then you can get the extension as follows:

String name = uri.getLastPathSegment();
//Here uri is your uri from which you want to get extension
String extension = name.substring(name.lastIndexOf("."));

The code above returns the extension including the . (dot). If you want to remove the dot, you can write it as follows:

String extension = name.substring(name.lastIndexOf(".") + 1);
Biologist answered 27/1, 2019 at 6:16 Comment(1)
Hey Tedinoz, didn't you notice the comment on tim visee's answer? That answer will not work if the url has '?' or '/' after the filename. If you have the url of an image from firebase, it gives the whole string after the "." (dot). E.g., if the url is "firebasestorage.googleapis.com/v0/b/mememaker-13a8c.appspot.com/…" then it gives the extension as ".jpg?alt=media&token=e89f415c-8338-4e56-9d4c-9a87b6e0edb5", but I want only ".jpg" as the extension. - Biologist
P
6

URLConnection.guessContentTypeFromName(url) would deliver the mime type as in the first answer. Maybe you simply wanted:

String extension = url.getPath().replaceFirst("^.*/[^/]*?(\\.[^\\./]*|)$", "$1");

The regular expression consumes everything up to the last slash, then everything up to the last period, and yields either an extension like ".owl" or "". (If I am not mistaken.)
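
For the MIME type route, here is a small sketch of calling URLConnection.guessContentTypeFromName on the path only, so that a query string or fragment cannot interfere with the lookup. The class name NameGuess and the method name guessFromAddress are made up for illustration. The result depends on the content type table of the running JVM, and null is returned when the name is not recognised.

```java
import java.net.URI;
import java.net.URLConnection;

final class NameGuess {

    /**
     * Guesses a MIME type from the path of the given address.
     * Returns null when the JVM's table has no entry for the name.
     */
    static String guessFromAddress(String address) {
        // Use only the path: no scheme, host, query or fragment
        String path = URI.create(address).getPath();
        if (path == null) {
            return null; // opaque URI without a path
        }
        return URLConnection.guessContentTypeFromName(path);
    }
}
```

For the question's URL, which has no extension in its path, this returns null, so it does not replace asking the server for the content type.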

Palpitant answered 11/5, 2011 at 16:35 Comment(1)
URLConnection.guessContentTypeFromName(address) // (String address) is the best answer imho. - Anti
S
4

Another useful way, which is not mentioned in the accepted answer: if you have a remote URL, you can get the MIME type from a URLConnection, like this:

  URLConnection urlConnection = new URL("http://www.google.com").openConnection();
  String mimeType = urlConnection.getContentType(); 

Now to get file extension from MimeType, I'll refer to this post
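
As a rough illustration of that last step, a MIME type can be mapped to an extension with a lookup table. The table below is hand-written for this sketch and far from exhaustive; the class name MimeToExtension and the method name extensionFor are made up, and Map.of requires Java 9 or later.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class MimeToExtension {

    // Illustrative table only; a real application needs a fuller mapping
    private static final Map<String, String> EXTENSIONS = Map.of(
            "image/png", "png",
            "image/jpeg", "jpg",
            "application/pdf", "pdf",
            "application/zip", "zip",
            "text/html", "html");

    /** Returns the extension for a Content-Type value, ignoring parameters such as charset. */
    static Optional<String> extensionFor(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // "text/html; charset=UTF-8" -> "text/html"
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(EXTENSIONS.get(mediaType));
    }
}
```

An empty Optional signals an unknown or missing type, which keeps the caller from treating a generic type such as application/octet-stream as a real extension.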

Showily answered 15/4, 2019 at 7:35 Comment(0)
B
3

As other answers have explained, you don't really know the content type without inspecting the file. However, you can predict the file type from a URL.

Java almost provides this functionality as part of the URL class. The method URL::getFile will intelligently grab the file portion of a URL:

final URL url = new URL("http://www.example.com/a/b/c/stuff.zip?u=1");
final String file = url.getFile(); // file = "/a/b/c/stuff.zip?u=1"

We can use this to write our implementation:

public static Optional<String> getFileExtension(final URL url) {

    Objects.requireNonNull(url, "url is null");

    final String file = url.getFile();

    if (file.contains(".")) {

        final String sub = file.substring(file.lastIndexOf('.') + 1);

        if (sub.length() == 0) {
            return Optional.empty();
        }

        if (sub.contains("?")) {
            return Optional.of(sub.substring(0, sub.indexOf('?')));
        }

        return Optional.of(sub);
    }

    return Optional.empty();
}

This implementation should handle edge-cases properly:

assertEquals(
    Optional.of("zip"), 
    getFileExtension(new URL("http://www.example.com/stuff.zip")));

assertEquals(
    Optional.of("zip"), 
    getFileExtension(new URL("http://www.example.com/a/b/c/stuff.zip")));

assertEquals(
    Optional.empty(), 
    getFileExtension(new URL("http://www.example.com")));

assertEquals(
    Optional.empty(), 
    getFileExtension(new URL("http://www.example.com/")));

assertEquals(
    Optional.empty(), 
    getFileExtension(new URL("http://www.example.com/.")));
Bamboo answered 6/6, 2017 at 16:9 Comment(0)
Q
0

You can simply do it this way:

import org.apache.commons.io.FilenameUtils;
FilenameUtils.getExtension(url.getPath());
Quote answered 5/12, 2023 at 11:30 Comment(0)
S
-1

I am doing it in this way.

You can check any file extension with more validation:

String stringUri = uri.toString();
String fileFormat = "png";

if (stringUri.contains(".") && fileFormat.equalsIgnoreCase(stringUri.substring(stringUri.lastIndexOf(".") + 1))) {
    // do anything
} else {
    // invalid file
}
Sokil answered 25/10, 2016 at 5:58 Comment(0)
