I am looking for a C/C++ alternative to the Apache Tika framework, which is Java-based. Specifically, I need file metadata extraction and structured text extraction under a single framework. After some searching online, the closest things I have found are GNU libextractor and a collection of individual file filters that parse documents to extract text (pdftotext, xls2csv, etc.).
Can anyone please recommend a good library comparable to Apache Tika?
Thanks