How to add a custom MIME type and override a default extension pattern?

I am trying to add a custom MIME type to Apache Tika.

I have the following custom-mimetypes.xml file in the org.apache.tika.mime package:

<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
    <mime-type type="text/stringtemplategroup">
        <glob pattern="*.stg"/>
    </mime-type>
    <mime-type type="text/stringtemplate">
        <glob pattern="*.st"/>
    </mime-type>
</mime-info>

Loading it fails with a conflicting extension pattern error for .st:

Caused by: org.apache.tika.mime.MimeTypeException: Conflicting extension pattern: .st
    at org.apache.tika.mime.MimeTypesReader.startElement(MimeTypesReader.java:166)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)

How do I override the default entry for the *.st extension so that Tika uses my own type?

Decongestant answered 22/2, 2013 at 3:49 Comment(4)
Did you fix it? I am facing the same problem. Let me know if you figured it out. - Crenate
I gave up. Tika was a PITA because of some pretty bad design decisions: everything is tightly coupled to a File object instead of an InputStream, so using it on Google App Engine was extremely hard, and I had to fork and modify too much to make it less painful. I ended up writing my own magic number classifier for the handful of types I support in my application. Tika is a good idea with a terrible implementation. - Decongestant
OK, no luck for me then. - Crenate
It's a real shame to read this, as the guys in my development team forked Tika and rewrote a lot of it to work from a stream model rather than remain coupled to File. Sadly, they weren't permitted to push the changes back to the project because of fears at the company they work for, and that was three or more years ago now! - Eviaevict

It seems you need to add a magic element with a priority to your mime-type entry:

<mime-type type="text/stringtemplate">
    <magic priority="50">
        <!-- some match pattern -->
        <!-- <match value="[some characters]" type="string" offset="0" /> -->
    </magic>
    <glob pattern="*.st"/>
</mime-type>
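
For context, here is an untested sketch of how the question's whole custom-mimetypes.xml might look with this suggestion folded in. The match value MY-TEMPLATE-HEADER is a made-up placeholder, not something StringTemplate files actually contain, so it would need to be replaced with bytes that really appear at the start of your files. Whether a magic priority alone clears the "Conflicting extension pattern" exception may depend on the Tika version, so treat this as a starting point rather than a confirmed fix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
    <mime-type type="text/stringtemplategroup">
        <glob pattern="*.stg"/>
    </mime-type>
    <mime-type type="text/stringtemplate">
        <!-- Magic rule with an explicit priority, as suggested above. -->
        <magic priority="50">
            <!-- HYPOTHETICAL marker: replace with bytes that really start your files. -->
            <match value="MY-TEMPLATE-HEADER" type="string" offset="0"/>
        </magic>
        <glob pattern="*.st"/>
    </mime-type>
</mime-info>
```

The .stg entry is left unchanged because the reported conflict only concerns the .st pattern.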
Mesocarp answered 13/6, 2018 at 22:52 Comment(4)
Thanks for the info, but even if this works, it does not fix the tight coupling to File. - Decongestant
Thanks for the bounty, much appreciated. - Mesocarp
Thanks for taking the time and trying to help those who might find this. - Decongestant
:) Funnily enough, I got to this question by mistake when SO changed its UI. I didn't realize it was an old one until I had posted my answer. It caught my attention because, several years ago, I added custom magic numbers for QuarkXPress files to a Linux box acting as an Apple file server :p - Mesocarp
