Detecting the MIME type of a file in ColdFusion that's already uploaded to the server
I am attempting to detect the file type of a library of files on our webserver as we are implementing code that is designed to stream files to the browser securely. Previously, the files were being stored and presented to users via a direct href.

I have attempted to do this 3 different ways, all on my local machine (which is NOT a simulated production environment):

  1. Setting a variable to be the value of what is returned from the function getPageContext().getServletContext().getMimeType(). This detects some but not all mime types for files.

  2. Creating an object from coldfusion.util.MimeTypeUtils and calling function guessMimeType(). This also detects some but not all mime types for files.

  3. A cffile action="read" on files in the library. This is the solution my boss recommended, as he has used this on files with cffile action="upload" from a form (and says it works), but when I use it, the cffile structure is always blank.

Ideally, I want to retrieve the mime type of every file located on the server with 100% accuracy. The code I have written has detected approximately 99% of the files on my copy of the repo, leaving about 30 that it can't identify. Included in these are MS Office files with the newer x-suffixed extensions (such as .docx), and .tgz compressed files.

I am wondering whether there is a sure-fire way to detect the mime type of any given file that exists on a server by using CF code to look at it, and whether that code will work on a production server where very few applications are installed. It is my understanding that the first function I referenced uses the mime-type library of the OS, and the second uses a predetermined list of mime types in the Java object. Searching on Google and SO has not produced anything that tells me that CF can accurately detect file mime types on its own, nor have I seen anything that says this can't be done.

Edit: This is on a CF8 environment.

Better answered 19/9, 2010 at 23:17 Comment(2)
IIRC, the mime type on cffile upload is the mime type reported by the client browser, which may or may not be accurate. It always exists, because the browser is required to send something even if it is essentially a placeholder.Solita
@Ben, That's what I thought too. Searching on Google revealed results that are consistent with that explanation. If I can figure out an answer here, I can put together a function to check on the backend the mimetypes of uploaded files so that we don't have to go back and scrub data later for inconsistent or incorrect mime-types from files uploaded from users.Better
P
2

There will not be a 100% guaranteed sure-fire way because mime types are arbitrary mappings.

You will need to use somebody's mappings, whether it's the OS or the JVM.

It will be your responsibility to fill in any blanks that either the OS or the JVM has in mappings, and keep that up to date.

But, I will always be able to create some file, give it an extension of .xyzzy, and you'll not be able to find out the 'mime-type' of it.
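Filling in the blanks could look roughly like the sketch below. It is written in Java because CF runs on the JVM, and it is only an illustration: the class name MimeFallback, the choice of entries, and the "application/octet-stream" default are all assumptions, and "application/gzip" for .tgz is just one reasonable choice. The hand-maintained table is consulted first, then the JVM's own table via URLConnection.guessContentTypeFromName.

```java
import java.net.URLConnection;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: a hand-maintained table layered over the JVM's table.
public class MimeFallback {
    // Entries you maintain yourself for extensions the built-in table may miss.
    private static final Map<String, String> EXTRA = new HashMap<String, String>();
    static {
        EXTRA.put("docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
        EXTRA.put("xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
        EXTRA.put("pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation");
        EXTRA.put("tgz", "application/gzip"); // assumption: pick whatever your clients expect
    }

    // Returns a MIME type for the file name, never null.
    public static String lookup(String fileName) {
        String own = EXTRA.get(extensionOf(fileName));
        if (own != null) {
            return own;
        }
        // Fall back to the JVM's extension table; null means "unknown".
        String builtIn = URLConnection.guessContentTypeFromName(fileName);
        return builtIn != null ? builtIn : "application/octet-stream";
    }

    // Lower-cased text after the last dot, or "" when there is no dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

An extension that appears in neither table, such as .xyzzy, still comes back as the generic default, which is the limitation described above.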

Phagocytosis answered 20/9, 2010 at 13:57 Comment(3)
This only seems half right. I'm suspecting that mime-types aren't stored in files, but arbitrary assignment doesn't sound correct either. Sure, you can create your own random file extension, but if you wanted to programmatically use it, there's got to be a way to associate your file type with your application beyond setting up the extension and mime-type in your OS's mime-type db/list.Better
Perhaps 'arbitrary' was the wrong word. I meant it in the sense of not being programmatic. You can't look at a file you've never seen before and determine the proper mime type. The canonical registration of mime types is here: iana.org/assignments/media-types. The method of determining mime type is generally by looking up the file extension or the magic number (en.wikipedia.org/wiki/…) against a database of registered types.Phagocytosis
Ok, I understand now. Seeing how it's external metadata, I see how it's impossible to get them all the time. I can deal with manually entering a small handful of mime-types and I can implement validation to either only accept currently detectable mime-types or in the case of the unknown MS Office files, just programmatically assign the correct mimetype if the browser can't. Thanks for the help.Better
E
2

I know this is a very old question, but the answers offered were from the era of CF9 and earlier. To clarify for those using CF10 and above, there is now a fileGetMimeType function.

Note that it does NOT just look at the file extension: by default it also inspects some of the file's content to check that the content matches the mime type the extension implies. See the "strict" argument to the function.
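For anyone stuck on an older version, a content check in the same spirit can be approximated by hand. The Java sketch below is only an illustration of comparing a file's leading bytes against known signatures; it is not how fileGetMimeType is implemented, and the class name MagicSniffer and its short signature list are assumptions made for the example.

```java
// Hypothetical helper: guess a MIME type from a file's leading bytes.
public class MagicSniffer {
    // Returns a MIME type for a recognised signature, or null when unknown.
    public static String sniff(byte[] head) {
        if (startsWith(head, 0x25, 0x50, 0x44, 0x46)) { // "%PDF"
            return "application/pdf";
        }
        if (startsWith(head, 0x50, 0x4B, 0x03, 0x04)) { // "PK\3\4"
            // .docx, .xlsx and .pptx are ZIP containers too, so this alone
            // cannot tell them apart from a plain .zip.
            return "application/zip";
        }
        if (startsWith(head, 0x1F, 0x8B)) { // gzip, which includes .tgz
            return "application/gzip";
        }
        return null;
    }

    // True when buf begins with the given unsigned byte values.
    private static boolean startsWith(byte[] buf, int... signature) {
        if (buf == null || buf.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((buf[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The caller would read the first few bytes of the stored file and pass them in. Because several formats share a container signature, a check like this is better at confirming or rejecting what the extension claims than at replacing an extension lookup.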

Elledge answered 4/3, 2021 at 18:58 Comment(2)
Never thought I'd see Charlie Arehart address a question I posed on SO.Better
I'm just here to help, whoever and wherever. :-) Thanks for the upvote, if that was yours, and either way I hope the answer may help others in the future. Clearly, even really old posts get traffic years later, as this one already proves. :-)Elledge
