Compiling libmagic statically (c/c++ file type detection)

Thanks to the guys that helped me with my previous question (linked just for reference).

I can place the files fileTypeTest.cpp, libmagic.a, and magic in a directory and compile with g++ fileTypeTest.cpp -o fileTypeTest -lmagic. Later, I'll be testing to see if it runs on Windows when compiled with MinGW.

I'm planning on using libmagic in a small GUI application, and I'd like to compile it statically for distribution. My problem is that libmagic seems to require the external file, magic. (I'm actually using my own shortened and compiled version, magic_short.mgc, but I digress.)

A hacky solution would be to code the file into the application, creating (and deleting) the external file as needed. How can I avoid this?

added for clarity:

magic is a text file that describes properties of different filetypes. When asked to identify a file, libmagic searches through magic. There is also a compiled version, magic.mgc, that works faster. My application only needs to identify a handful of filetypes before deciding what to do with them, so I'll be using my own magic_short file to create magic_short.mgc.
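
For reference, here is a minimal sketch of the "hacky" approach described above, assuming the compiled magic data has been converted into a byte array at build time. The names kEmbeddedMagic and TempMagicFile, and the idea of letting the caller pick the file path, are hypothetical and only for illustration; the array holds placeholder bytes, not real magic data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>
#include <utility>

// Placeholder bytes standing in for the contents of magic_short.mgc.
// Assumption: a real build would generate this array from the actual file.
static const unsigned char kEmbeddedMagic[] = {'d', 'e', 'm', 'o', '\n'};

// Writes embedded bytes to `path` on construction and deletes the file on
// destruction, so the external file only exists while it is needed.
class TempMagicFile {
public:
    TempMagicFile(std::string path, const unsigned char* data, std::size_t size)
        : path_(std::move(path)) {
        assert(data != nullptr || size == 0);
        std::ofstream out(path_, std::ios::binary | std::ios::trunc);
        out.write(reinterpret_cast<const char*>(data),
                  static_cast<std::streamsize>(size));
        out.close();
        if (!out) {
            // The destructor does not run if the constructor throws,
            // so clean up any partial file here.
            std::remove(path_.c_str());
            throw std::runtime_error("could not write " + path_);
        }
    }

    ~TempMagicFile() { std::remove(path_.c_str()); }

    // Copying would make two objects delete the same file.
    TempMagicFile(const TempMagicFile&) = delete;
    TempMagicFile& operator=(const TempMagicFile&) = delete;

    const std::string& path() const { return path_; }

private:
    std::string path_;
};
```

With something like this, the path returned by path() could be handed to libmagic's magic_load() while the object is alive. It still creates a file on disk, which is exactly what I'd like to avoid.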

Transposal answered 23/1, 2010 at 21:11 Comment(0)
B
5

This is tricky, but I suppose you could do it this way. By the way, I have downloaded the libmagic source and am looking at it.

There's a function called magic_read_entries, used in minifile.c (this is the pure vanilla source that I downloaded from SourceForge), which reads from the external file.

You could append the magic file (which is found in the /etc directory) to the end of the library code, like this: cat magic >> libmagic.a. On my system, magic is 474443 bytes and libmagic.a is 38588 bytes.

In magic.c, you would need to change the function magichandle_t* magic_init(unsigned flags): at the end of the function, add a call to magic_read_entries, and modify magic_read_entries itself to read from the library file at the offset where the appended data starts, treat the data as a pointer to pointer to char (char **), and use that instead of reading from the magic file. Since you know the offset of the appended data within the library, that should not be difficult.

Callers then no longer need to invoke magic_read_entries themselves, since the entries are no longer read from a separate file. The function magichandle_t* magic_init(unsigned flags) will take care of loading the entries, and you should be OK there.

If you need further help, let me know.

Edit: I have used the old 'libmagic' from sourceforge.net and here is what I did:

  1. Extracted the downloaded archive into my home directory; ungzipping/untarring the archive creates a folder called libmagic.
  2. Created a folder within libmagic called Test.
  3. Copied the original magic.c and minifile.c into Test.
  4. Applied the following diff to the magic.c source:
48a49,51
> #define MAGIC_DATA_OFFSET     0x971C
> #define MAGIC_STAT_LIB_NAME "libmagic.a"
>
125a129,130
>       /* magic_read_entries is obsolete... */
>       magic_read_entries(mh, MAGIC_STAT_LIB_NAME);
251c256,262
<
---
>
>       if (!fseek(fp, MAGIC_DATA_OFFSET, SEEK_SET)){
>               if (ftell(fp) != MAGIC_DATA_OFFSET) return 0;
>       }else{
>               return 0;
>       }
>
  • Then issue make
  • Concatenate the magic file (which I copied from /etc, under Slackware Linux 12.2) onto the libmagic.a file, i.e. cat magic >> libmagic.a. The SHA-1 checksum of my magic file is 4abf536f2ada050ce945fbba796564342d6c9a61, and this is its listing on my system: -rw-r--r-- 1 root root 474443 2007-06-03 00:52 /etc/file/magic.
  • Here's the diff for the minifile.c source, apply it and rebuild minifile executable by running make again.
40c40
<       magic_read_entries(mh,"magic");
---
>       /*magic_read_entries(mh,"magic");*/

It should work then. If not, you will need to adjust the offset into the library by modifying MAGIC_DATA_OFFSET. If you wish, I can put the magic data file up on pastebin. Let me know.
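
The offset-based read that the patch relies on can be sketched in isolation. This is only an illustration in C++ and not part of the libmagic source; read_appended_data is a hypothetical helper, and the offset is assumed to be the size of the library before the magic file was appended. Unlike a bare seek, it also checks that the file is at least as long as the offset.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Returns every byte from `offset` to the end of the file at `path`.
// Returns an empty vector if the file cannot be opened, is shorter than
// `offset`, or cannot be read completely.
std::vector<char> read_appended_data(const std::string& path,
                                     std::streamoff offset) {
    assert(offset >= 0 && "offset is a byte count from the start of the file");

    std::ifstream in(path, std::ios::binary);
    if (!in) return {};

    // Find the total size so a too-large offset is detected up front.
    in.seekg(0, std::ios::end);
    const std::streamoff total = in.tellg();
    if (total < offset) return {};

    std::vector<char> payload(static_cast<std::size_t>(total - offset));
    in.seekg(offset, std::ios::beg);
    in.read(payload.data(), static_cast<std::streamsize>(payload.size()));
    if (in.gcount() != static_cast<std::streamsize>(payload.size())) return {};
    return payload;
}
```

For the setup above, the call would look something like read_appended_data("libmagic.a", 0x971C), with the offset adjusted to match your own build.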

Hope this helps, Best regards, Tom.

Bindle answered 23/1, 2010 at 23:19 Comment(11)
I thought about using/modifying libmagic's source too, so I found that SourceForge version as well, but I suspect it's not the same as libmagic-dev from the Ubuntu repo. If you do man libmagic or check linux.die.net/include/magic.h, the timestamp is 2003. The one from SF is 2000 (sourceforge.net/projects/libmagic/files). Otherwise, I imagine this would be a good solution, and I could start figuring out how to pull it off. - Transposal
It doesn't look like the magic entries from the newer libmagic 5.03 are compatible with the old libmagic alpha. It does look like libmagic 5.03 is included within file-5.03 (from man file, ftp.astron.com/pub/file/file-5.04.tar.gz), so I'll try messing around with that. I think I might get stuck trying to do that appending and pointer offset thing, if I get to that point. - Transposal
Either way, if you need my assistance, please do not hesitate to ask by replying back! ;) - Bindle
Try as I might, I couldn't decouple the libmagic part of file, nor could I find a standalone source of libmagic. I think I'm just going to have my app identify a file by its extension and throw if it ends up with the wrong filetype. - Transposal
@Kache4: Where is the source for the most recent version of libmagic/file? Will look into it and give it a shot... - Bindle
I can't find the source for a recent, standalone libmagic, but file can be found at darwinsys.com/file. Another option I have is to write my own magic number identifier. I actually only need to differentiate a handful of image, compressed archive, and text file types. All I'd need to do once I have an ifstream of a file is match a few bytes in the header, right? - Transposal
I think a few things might be different. I used #define MAGIC_DATA_OFFSET <size of clean-compiled libmagic.a in bytes (mine wasn't 38684)>. Also, what version of file(1) does man magic say you have? I suspect this magic.c can't parse my magic version 5.03 from Ubuntu 9.10. if (!parse_line(...)) { printf("kache4 added this printf line - parse failed\n"); continue;} around line 287 fails every time. Btw, if this is inconveniencing you at all, I think I'm going to find another way to do this. Getting this to work is just going to be more of an exercise for me than anything at this point. - Transposal
@Kache4: Have a look here, I uploaded libmagic_tommieb75.tar.gz for your convenience. box.net/shared/ao1jzn9ssl - Bindle
There's good news and there's bad news. The good news is that I got it to work. The libmagic-alpha from SF can parse your magic file, but not mine. The bad news is that minifile now reads libmagic.a for the magic concatenated at its end. If libmagic.a isn't in the executable's directory, it won't work. I think I'm just going to give up on libmagic and try a different method altogether. - Transposal
OK! Oh well, at least you tried. I didn't think of that slight flaw. It was fun hacking it though. Take care. :) - Bindle
@Kache4: Check this out... linuxjournal.com/content/… - Bindle
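
The header-matching idea raised in the comments above (read a few bytes through an ifstream and compare them against known signatures) might look roughly like the following. This is a sketch and not libmagic code: Signature, kSignatures, identify_header, and identify_file are hypothetical names, and the table only lists a few well-known signatures. Plain text files have no signature, so they fall through to "unknown" here and would need a separate heuristic.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// One known file signature: a type name and the bytes a file must start with.
struct Signature {
    const char* name;
    std::string bytes;
};

// A small table of well-known signatures. Explicit lengths are given so that
// the binary bytes are copied exactly.
static const std::vector<Signature> kSignatures = {
    {"png",  std::string("\x89PNG\r\n\x1a\n", 8)},
    {"jpeg", std::string("\xFF\xD8\xFF", 3)},
    {"gif",  std::string("GIF8", 4)},
    {"zip",  std::string("PK\x03\x04", 4)},
    {"gzip", std::string("\x1F\x8B", 2)},
};

// Returns the name of the first signature that `header` starts with,
// or "unknown" if none matches.
std::string identify_header(const std::string& header) {
    for (const Signature& sig : kSignatures) {
        assert(!sig.bytes.empty());  // an empty signature would match anything
        if (header.compare(0, sig.bytes.size(), sig.bytes) == 0) {
            return sig.name;
        }
    }
    return "unknown";
}

// Reads up to 16 bytes from the start of the file and identifies them.
std::string identify_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return "unreadable";
    char buf[16];
    in.read(buf, sizeof buf);
    return identify_header(
        std::string(buf, static_cast<std::size_t>(in.gcount())));
}
```

Keeping the signatures in a table means supporting another type is one more entry rather than another branch.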
A
1

I can tell you how to link a library statically: simply pass the path to the .a file at the end of your g++ command, e.g. g++ fileTypeTest.cpp -o fileTypeTest ./libmagic.a. .a files are just archives of compiled objects (.o). Running ldd fileTypeTest will show you the dynamically linked libraries; ${libdir}/libmagic.so shouldn't be among them.

As for linking in an external data file, I don't know. Can you not package the application (.deb|.rpm|.tar.bz2)? On Windows, I'd write an installer using NSIS.

Arroba answered 23/1, 2010 at 21:35 Comment(0)
C
0

In the past I've built self-extracting archives. Basically, it is a .exe file consisting of a .zip archive and the code to unzip it. Download the .exe, run it, and poof! You can have as many files as you want.

http://en.wikipedia.org/wiki/Self-extracting_archive

Cibis answered 23/1, 2010 at 21:55 Comment(0)
