I'm looking for a PDF library which will allow me to extract the text from a PDF document. I've looked at PyPDF, and this can extract the text from a PDF document very nicely. The problem with this is that if there are tables in the document, the text in the tables is extracted in-line with the rest of the document text. This can be problematic because it produces sections of text that aren't useful and look garbled (for instance, lots of numbers mashed together).
I'd like to extract the text from a PDF document, excluding any tables and special formatting. Is there a library out there that does this?