Read contents of the pdf using vim [closed]

3

5

How can I read the contents of a PDF file using vim in the terminal? I have tried uncompressing it with pdftk, but it still does not work for me. Is there any other way to decrypt or decode the PDF so that it can be read in the terminal with vim on any Linux distribution?

Midiron answered 22/1, 2014 at 10:33 Comment(5)
On Vim's official website, the definition of Vim is clear: "Vim the editor". It is not a PDF reader, and it is not an MS Word reader. You could of course write a PDF reader in another language, e.g. Java with the iText library, and call that tool from Vim, but this is not the right way to use Vim. My 2 cents.Barnstorm
@Barnstorm We can open anything with vim, but the main problem is that we cannot read it because it is in an encrypted format. So my task is to decode it, so that we can understand what the PDF contains while using vim. The main thing is that nothing should be lost in the process. To understand my question better, please visit pdflabs.com/docs/pdftk-cli-examples: there is a method to uncompress a PDF there, and according to it we can then use vim or emacs to read the PDF.Midiron
pdftk (or qpdf, or cpdf, which can also uncompress content streams, among other things) does not asciify (i.e. ASCII85Encode) binary streams such as images, fonts, etc. Unfortunately. Therefore, most PDF files still contain binary data after uncompression and are not suitable for a text editor. Maybe you want to have a look at COS-structure editors/explorers (PoDoFo browser, Enfocus PDF browser, iText RUPS, etc.; they are all free).Piny
And string literals in content streams can be binary, too.Piny
Might be off topic but less can read PDFs. less file.pdf.Attrition
6

If you want to read the PDF as text, you can try the pdftotext command, though the output won't always be pretty. If you'd rather have vim open PDF files in a PDF reader, you can put something like this in your .vimrc:

au BufRead *.pdf sil exe "!xdg-open " . shellescape(expand("%:p")) | bd | let &ft=&ft | redraw!
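
For the text route, here is a minimal sketch. It assumes pdftotext (from poppler-utils) is installed; the function name pdf_to_vim is made up for illustration. The -layout flag asks pdftotext to preserve the physical layout as far as it can, and a lone - as the output file sends the text to stdout, where vim picks it up from stdin.

```shell
# Hypothetical helper: view the text layer of a PDF in vim, read-only.
# Assumes pdftotext (poppler-utils) and vim are on PATH.
pdf_to_vim() {
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "usage: pdf_to_vim FILE.pdf" >&2
        return 2
    fi
    # "-" as the output file makes pdftotext write to stdout;
    # "vim -R -" fills the buffer from stdin and opens it read-only.
    pdftotext -layout "$1" - | vim -R -
}
```

Calling pdf_to_vim report.pdf (a placeholder file name) should open the extracted text in a read-only buffer. Scanned PDFs without a text layer will come out empty, since no OCR is involved.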
Guardian answered 22/1, 2014 at 10:41 Comment(2)
Thanks, but I think you didn't get my question. I actually want to decrypt or decode the PDF I have. When I tried to read that PDF directly in vim, it showed something in an encrypted format that cannot be read directly. So I want to decrypt or decode it, so that we can understand whatever the PDF contains.Midiron
Then you must first use pdftk to decrypt the document. I'm not sure what you're asking beyond that since vim doesn't have pdftk or a pdf reader built into it.Guardian
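
The decrypt-and-uncompress step discussed in these comments might look like the sketch below. It uses qpdf rather than pdftk, assumes qpdf is installed and that you know the password (if there is one), and the function name pdf_expand is hypothetical. As pointed out in the comments on the question, the result can still contain binary data such as images and fonts, so this helps with inspecting the file structure more than with reading the text.

```shell
# Hypothetical helper: write a decrypted, uncompressed copy of a PDF
# that is easier to inspect in a text editor. Assumes qpdf is on PATH.
pdf_expand() {
    # $1 = input PDF, $2 = output PDF, $3 = optional password
    if [ "$#" -lt 2 ]; then
        echo "usage: pdf_expand IN.pdf OUT.pdf [PASSWORD]" >&2
        return 2
    fi
    # --decrypt asks qpdf to remove encryption from the output;
    # --qdf writes qpdf's editable form with uncompressed content streams;
    # --object-streams=disable asks qpdf not to pack objects into
    # compressed object streams.
    qpdf --password="${3:-}" --decrypt --qdf --object-streams=disable "$1" "$2"
}
```

For example, pdf_expand secured.pdf expanded.pdf mypassword followed by vim -b expanded.pdf opens the expanded copy in vim's binary mode (all three names are placeholders).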
3

This question is more or less a duplicate of this one.
However, following the answers of @Conner and @Eric, the pdftotext utility is a recommended approach if you are only interested in the text content.

A possible approach to pdftotext can be found e.g. here.
There also exists a rather new Vim plugin to simplify these steps. This plugin can be found here or here

You might also write an ftplugin that acts as a preprocessor and extracts the text of a given PDF. Nevertheless, you won't be able to change anything in the PDF itself with the proposed tools.
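
The same preprocessor idea can also be sketched outside of Vim, as a small shell wrapper instead of an ftplugin. This is only an illustration: the name vimpdf is invented, pdftotext is assumed to be installed, and the variables tmp and status are plain globals because POSIX sh has no local.

```shell
# Hypothetical wrapper: open PDFs as extracted text, everything else normally.
vimpdf() {
    case "$1" in
        *.pdf|*.PDF)
            # Extract into a temporary file so the original PDF is never touched.
            tmp=$(mktemp "${TMPDIR:-/tmp}/vimpdf.XXXXXX") || return 1
            pdftotext -layout "$1" "$tmp" && vim -R "$tmp"
            status=$?
            rm -f "$tmp"
            return "$status"
            ;;
        *)
            # Not a PDF: hand all arguments to vim unchanged.
            vim "$@"
            ;;
    esac
}
```

Because the buffer holds extracted text only, edits made there never flow back into the PDF, which matches the limitation mentioned above.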

Analisaanalise answered 22/1, 2014 at 14:13 Comment(0)
2

Some pdf text contents can be extracted with pdf2txt, then fed to vim. Of course, you will lose most of the formatting, graphics, etc.

Starks answered 22/1, 2014 at 10:40 Comment(0)
