How to determine appropriate file extension from MIME Type in Java
I am uploading files to an Amazon S3 bucket and have access to the InputStream and a String containing the MIME type of the file, but not the original file name. It's up to me to create the file name and extension before pushing the file up to S3. Is there a library or convenient way to determine the appropriate extension to use from the MIME type?

I've seen some references to the Apache Tika library, but that seems like overkill and I haven't been able to get it to successfully detect file extensions yet. From what I've been able to gather, this code should work, but I'm just getting an empty string when my type variable is "image/jpeg":

    MimeType mimeType = null;
    try {
        mimeType = new MimeTypes().forName(type);
    } catch (MimeTypeException e) {
        Logger.error("Couldn't Detect Mime Type for type: " + type, e);
    }

    if (mimeType != null) {
        String extension = mimeType.getExtension();
        //do something with the extension
    }
Tussah answered 30/11, 2012 at 17:44 Comment(3)
Are you sure you need to set a file extension? If you know the MIME type, you can upload it to S3 with the proper Content-Type, and the extension (usually) becomes irrelevant. - Hah
You're correct, I just checked, and setting the Content-Type in ObjectMetadata for S3 does the trick in my case. I'd still like to know the answer to the question, seems like it could come in handy in the future. - Tussah
Fair enough. There's no One True Mapping™ of MIME types to file extensions -- some types have multiple extensions, some extensions have multiple types -- so when I'm using a data store that can persist both independently, I try to store only what I know and refrain from guessing. - Hah
A
79

As some of the commenters have pointed out, there is no universal 1:1 mapping between mimetypes and file extensions. Some mimetypes have more than one possible extension, many extensions are shared by multiple mimetypes, and some mimetypes have no extension.

Wherever possible, you're much better off storing the mimetype and using that going forward, and forgetting about the extension.

That said, if you do want to get the most common file extension for a given mimetype, then Tika is a good way to go. Apache Tika has a very large set of mimetypes it knows about, and for many of these it also knows mime magic for detection, common extensions, descriptions etc.

If you want to get the most common extension for a JPEG file, then as shown in this Apache Tika unit test you just need to do something like:

  MimeTypes allTypes = MimeTypes.getDefaultMimeTypes();
  MimeType jpeg = allTypes.forName("image/jpeg");
  String jpegExt = jpeg.getExtension(); // .jpg
  assertEquals(".jpg", jpeg.getExtension());

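The key thing is that you need to load the XML file that's bundled in the Tika jar to get the definitions of all the mimetypes. Calling new MimeTypes() directly, as in the question, gives you an empty registry, so forName() hands back a type with no known extensions and getExtension() returns an empty string. If you might be dealing with custom mimetypes too, Tika supports those as well; change the first line to: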

  TikaConfig config = TikaConfig.getDefaultConfig();
  MimeTypes allTypes = config.getMimeRepository();

When you get the MimeTypes through TikaConfig, Tika will also check your classpath for custom mimetype definitions and include those too.

Along answered 30/11, 2012 at 23:45 Comment(0)
A
0

My solution:

Create an interface MimeService and a MimeServiceImpl implementation, then put a mime.properties file on the classpath. StreamUtils.toProperties(fileName) is a home-made utility method that loads a properties file from the classpath.

public class MimeServiceImpl implements MimeService
{
    private static final Logger LOG = 
       LoggerFactory.getLogger(MimeServiceImpl.class);

    protected Properties mapping;

    public MimeServiceImpl()
    {
        this.mapping = StreamUtils.toProperties("mime.properties");
        for(String key : this.mapping.stringPropertyNames())
        {
            String value = this.mapping.getProperty(key);
            LOG.info("{}={}", key, value);
        }
    }

    @Override
    public String getMime(String ext) {
        return StringUtils.isNotBlank(ext) ? 
            this.mapping.getProperty(ext.trim().toLowerCase()) : null;
    }
}

mime.properties (key = file extension, value = MIME type)

json=application/json
zip=application/octet-stream
pdf=application/pdf
xls=application/vnd.ms-excel
xlsx=application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
htm=text/html
html=text/html
xml=text/xml
txt=text/plain
js=application/javascript
jpeg=image/jpeg
jpg=image/jpeg
png=image/png
gif=image/gif
tiff=image/tiff
doc=application/msword
rtf=application/rtf

If you need more mappings, just add another entry to the properties file.

If you're using this code as a library and you want an extended mapping, create a new mime.properties on your project's classpath and add all the new entries you want.
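Note that the mapping above goes from extension to MIME type, while the question asks for the opposite direction. A minimal sketch of how the same properties data could be inverted is shown below. The class name MimeToExtension and the tie-breaking rule (prefer the shortest extension, then alphabetical order) are assumptions made for illustration; they are not part of the answer's code, and since several extensions can share one MIME type, any such rule is a guess.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: inverts an extension -> MIME mapping into MIME -> extension.
public class MimeToExtension {
    private final Map<String, String> extensionByMime = new HashMap<>();

    public MimeToExtension(Properties extensionToMime) {
        for (String ext : extensionToMime.stringPropertyNames()) {
            // Values loaded from a properties file may carry trailing whitespace, so trim them.
            String mime = extensionToMime.getProperty(ext).trim().toLowerCase(Locale.ROOT);
            String cleanExt = ext.trim().toLowerCase(Locale.ROOT);
            // Several extensions may map to one MIME type; keep the preferred one.
            extensionByMime.merge(mime, cleanExt, MimeToExtension::prefer);
        }
    }

    // Assumed tie-break: shortest extension wins, then alphabetical order.
    private static String prefer(String a, String b) {
        if (a.length() != b.length()) {
            return a.length() < b.length() ? a : b;
        }
        return a.compareTo(b) <= 0 ? a : b;
    }

    // Returns the extension with a leading dot, or an empty string if the type is unknown.
    public String getExtension(String mime) {
        if (mime == null) {
            return "";
        }
        // Drop parameters such as "; charset=utf-8" before the lookup.
        String key = mime.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        String ext = extensionByMime.get(key);
        return ext == null ? "" : "." + ext;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("jpeg", "image/jpeg");
        props.setProperty("jpg", "image/jpeg");
        props.setProperty("png", "image/png");

        MimeToExtension lookup = new MimeToExtension(props);
        System.out.println(lookup.getExtension("image/jpeg")); // .jpg
        System.out.println(lookup.getExtension("image/png"));  // .png
    }
}
```

Returning an empty string for unknown types mirrors what the question observed from Tika's getExtension(), so the caller can treat both approaches the same way.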

Auctioneer answered 9/1 at 22:26 Comment(0)
