How to use python-magic 5.19-1

I need to determine the MIME types of files that have no suffix in Python 3, and python-magic seemed like a fitting solution. Unfortunately, it does not work as described here: https://github.com/ahupp/python-magic/blob/master/README.md

What happens is this:

>>> import magic
>>> magic.from_file("testdata/test.pdf")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'from_file'

So I had a look at the module, which provides the class Magic, for which I found documentation here: http://filemagic.readthedocs.org/en/latest/guide.html

I was surprised that this did not work either:

>>> with magic.Magic() as m:
...     pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'ms'
>>> m = magic.Magic()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'ms'
>>> 

I could not find any information about how to use the class Magic anywhere, so I resorted to trial and error until I figured out that it accepts only instances of LP_magic_set for ms. Some of them are returned by the module's methods magic.magic_set() and magic_t(), so I tried to instantiate Magic with either of them. When I then call the file() method on the instance, it always returns an empty result, and the errlvl() method reports error no. 22. So how do I use magic at all?

Economics answered 13/8, 2014 at 12:28 Comment(7)
Do you have a magic.py file in the same directory as the one you launched the Python shell from? The errors you got make it sound like you do (I just got all your examples working). One way to find out is to import inspect, call inspect.getfile(magic), and see whether the result is the expected file for the magic module. - Halftone
>>> import inspect
>>> inspect.getfile(magic)
'/usr/lib/python3.4/site-packages/magic.py'
- Economics
Oh wait, you are referring to Ubuntu's python-magic. Yeah, that's a completely different package from the one you looked at. Really though, you could take a cursory glance at the file returned by inspect.getfile and see that it differs completely from the one on GitHub. - Halftone
Another thing, for future reference: from the Python shell you can call help(obj) to get the built-in documentation for obj. In this case help(magic) will bring up any docstrings and available methods, which clearly show that what you have on your system is not the same thing you found the documentation for. - Halftone
I already read the module, which was not helpful at all. By the way, it's the Arch package. - Economics
As I said, the distro version of the python-magic package is NOT the same as the one you linked on GitHub. They have the exact same name but are completely different. Heck, even the version number on PyPI as linked (pypi.python.org/pypi/python-magic) is only at 0.4.6. Also, the distro version is the bindings to the magic library, which are NOT the pure native version. The only help you can get is from realizing your mistake here. - Halftone
Got it. My only mistake was not realizing that these are completely different programs. This does not, however, change the problem I describe in my question: how to use that specific library, not some completely different stuff from GitHub. Thankfully @mhawke found out how. - Economics

I think that you are confusing different implementations of "python-magic".

You appear to have installed python-magic-5.19.1; however, you reference first the documentation for python-magic-0.4.6 and then the documentation for filemagic-1.6. I think that you are better off using python-magic-0.4.6, as it is readily available on PyPI and easily installed via pip into virtualenv environments.
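
To check which of these implementations the name `magic` actually resolves to on a given system, one rough approach is to look at which entry points the imported module exposes. The sketch below assumes, based only on the sessions in this thread and the linked README, that the PyPI package offers `from_file`, while the bindings shipped with file/libmagic offer `open` and, in later releases, `detect_from_filename`. The function name `identify_magic_flavor` and the labels it returns are made up for illustration.

```python
import inspect


def identify_magic_flavor(module):
    """Guess which 'magic' implementation a module object is.

    The attribute checks are heuristics, not an official detection API.
    """
    # The PyPI package (ahupp/python-magic) documents magic.from_file().
    if hasattr(module, "from_file"):
        return "pypi-python-magic"
    # Later releases of the libmagic-bundled bindings add this helper.
    if hasattr(module, "detect_from_filename"):
        return "libmagic-bindings-new"
    # Older libmagic-bundled bindings only offer the open()/load()/file() workflow.
    if hasattr(module, "open"):
        return "libmagic-bindings-old"
    return "unknown"


if __name__ == "__main__":
    try:
        import magic
    except ImportError:
        print("no module named 'magic' is importable")
    else:
        # Show where the module lives and which flavor it looks like.
        print(inspect.getfile(magic), identify_magic_flavor(magic))
```

`from_file` is checked first on purpose, so that a module offering both styles of entry point is still reported as the PyPI package.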

Documentation for python-magic-5.19.1 is hard to come by, but I managed to get it to work like this:

>>> import magic
>>> m=magic.open(magic.MAGIC_NONE)
>>> m.load()
0
>>> m.file('/etc/passwd')
'ASCII text'
>>> m.file('/usr/share/cups/data/default.pdf')
'PDF document, version 1.5'
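
The open/load/file sequence above can be wrapped in a small helper that checks each step. This is only a sketch: `describe_file` is a hypothetical name, the module is passed in as a parameter so it is explicit which `magic` is being used, and the `error()` and `close()` methods are assumed to exist on the returned object, as they appear to in the libmagic-bundled module source.

```python
def describe_file(magic_module, path, flags=None):
    """Return libmagic's description of `path`, raising RuntimeError on failure.

    `magic_module` is expected to be the libmagic-bundled bindings
    (the ones providing magic.open), not the PyPI python-magic package.
    """
    if flags is None:
        flags = magic_module.MAGIC_NONE
    cookie = magic_module.open(flags)
    try:
        # load() returned 0 in the session above; treat anything else as failure.
        if cookie.load() != 0:
            raise RuntimeError("could not load magic database: %s" % cookie.error())
        result = cookie.file(path)
        if result is None:
            raise RuntimeError("could not identify %r: %s" % (path, cookie.error()))
        return result
    finally:
        # Release the underlying libmagic handle (assumed method, see lead-in).
        cookie.close()
```

With the distro bindings imported as `magic`, a call would look like `describe_file(magic, '/etc/passwd')`.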

You can also get different magic descriptions, e.g. MIME type:

>>> m=magic.open(magic.MAGIC_MIME)
>>> m.load()
0
>>> m.file('/etc/passwd')
'text/plain; charset=us-ascii'
>>> m.file('/usr/share/cups/data/default.pdf')
'application/pdf; charset=binary'
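
Because MAGIC_MIME returns the type and the charset in a single string, a caller that only wants the bare MIME type has to split it. A minimal sketch using plain string handling follows; `split_mime` is a hypothetical helper, not part of any magic module.

```python
def split_mime(description):
    """Split 'type/subtype; charset=name' into (mime_type, charset).

    charset is None when the string carries no charset parameter.
    """
    mime_type, _, params = description.partition(";")
    charset = None
    for param in params.split(";"):
        key, _, value = param.strip().partition("=")
        if key == "charset" and value:
            charset = value
    return mime_type.strip(), charset


print(split_mime("text/plain; charset=us-ascii"))     # ('text/plain', 'us-ascii')
print(split_mime("application/pdf; charset=binary"))  # ('application/pdf', 'binary')
```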

Or, with more recent versions such as python-magic-5.30:

>>> import magic
>>> magic.detect_from_filename('/etc/passwd')
FileMagic(mime_type='text/plain', encoding='us-ascii', name='ASCII text')
>>> magic.detect_from_filename('/etc/passwd').mime_type
'text/plain'
Nightmare answered 13/8, 2014 at 13:25 Comment(6)
Thanks. I couldn't find documentation for the Python bindings specifically to help this person, who can't seem to understand the difference. - Halftone
I couldn't find any documentation either; I just had to read the module source to guess how to use it. - Nightmare
@MHawke - Thanks for providing a working example with both output formats. Here is a single example from the libmagic repo (or a fork of it?): github.com/threatstack/libmagic/blob/master/python/example.py This is the only documentation I could find that presented a workflow for using the module. - Pinball
There's a lot of confusion around the two Python bindings (I found a bug report in the Ubuntu packages from people trying to use the ahupp version of the lib with the standard one). Anyway, you can get the same result without open and load: magic.detect_from_filename('your_file').mime_type directly provides the expected answer. - Crystlecs
@MarwanBurelle: Thanks. This answer refers to file-5.19; however, detect_from_filename() was added in version file-5.26. To be strict, the return values are different, one being a string and the other a namedtuple, but your suggestion is certainly easier to use with file-5.26 or later. - Nightmare
The libmagic home page seems to be darwinsys.com/file, and there are links to the official repos there. It appears the Python module is just a wrapper for libmagic(3), so the man page for libmagic may be helpful. For example, in Python magic.open(magic.MAGIC_SYMLINK) is the same as the C API call magic_open(MAGIC_SYMLINK) and the same as the shell command file -L. - Exigent
