Converting Source ASCII Files to JPEGs

I publish technical books, in print, PDF, and Kindle/MOBI, with EPUB on the way.

The Kindle does not support monospace fonts, which are kinda useful for source code listings. The only way to do monospace fonts is to convert the text (Java source, HTML, XML, etc.) into JPEG images. More specifically, due to pagination issues, a given input ASCII file needs to be split into slices of ~6 lines each, with each slice turned into a JPEG, so listings can span a screen. This is a royal pain.

My current mechanism to do that involves:

  1. Running expand to set a consistent 2-space tab size, which pipes to...
  2. a2ps, which pipes to...
  3. A small Perl snippet to add a "%%LanguageLevel: 3\n" line, which pipes to...
  4. ImageMagick's convert, to take the (E)PS and make a JPEG out of it, with an appropriate background, cropped to 575x148+5+28, etc.
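
The tab-expansion and slicing parts of that pipeline can be sketched in Python. This is only an illustration, not the asker's actual script: the function name slice_listing and its parameters are made up here, and the 6-line slice size and 2-space tabs are taken from the description above.

```python
def slice_listing(text, lines_per_slice=6, tabsize=2):
    """Expand tabs and split a source listing into fixed-size slices.

    Hypothetical helper: each returned string is meant to be rendered
    as one image by whatever tool comes next in the pipeline.
    """
    # str.expandtabs plays the role of the `expand` step: tab stops
    # every `tabsize` columns, applied line by line.
    lines = [line.expandtabs(tabsize) for line in text.splitlines()]
    # Group the lines into consecutive chunks; the last chunk may be shorter.
    return [
        "\n".join(lines[i:i + lines_per_slice])
        for i in range(0, len(lines), lines_per_slice)
    ]
```

Each slice would then be written to a temporary file and handed to the PostScript and image steps.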

That used to work 100% of the time. It now works 95% of the time. The rest of the time, I get "convert: geometry does not contain image" errors, which I cannot seem to get rid of, in part because I don't understand what the problem is.

Before this process, I used to use a pretty-print engine (source-highlight) to get HTML out of the source code...but then the only thing I could find to convert the HTML into JPEGs was to automate screen-grabs from an embedded Gecko engine. Reliability stank, which is why I switched to my current mechanism.

So, if you were you, and you needed to turn source listings into JPEG images, in an automated fashion, how would you do it? Bonus points if it offers some sort of pretty-print process (e.g., bolded keywords)!

Or, if you know what typically causes "convert: geometry does not contain image", that might help. My current process is ugly, but if I could get it back to 100% reliability, that'd be just fine for now.
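
One plausible reading of that error, offered as an assumption rather than a diagnosis, is that the crop rectangle falls entirely outside the rendered image, for example when a slice renders much shorter than expected. A rough pre-check along those lines could look like the sketch below. The function crop_overlaps is hypothetical, it only understands the simple WxH+X+Y form with non-negative offsets, and the image dimensions would have to come from some other tool.

```python
import re


def crop_overlaps(image_width, image_height, geometry):
    """Return True if a WxH+X+Y crop rectangle overlaps the image at all."""
    match = re.fullmatch(r"(\d+)x(\d+)\+(\d+)\+(\d+)", geometry)
    if match is None:
        raise ValueError("expected WxH+X+Y, got %r" % geometry)
    width, height, x, y = map(int, match.groups())
    # With non-negative offsets, the rectangle misses the image only if
    # it starts at or beyond the right or bottom edge, or has no area.
    return width > 0 and height > 0 and x < image_width and y < image_height
```

With the crop from step 4, an image only 20 pixels tall would fail the check, because the +28 vertical offset starts below its bottom edge.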

Thanks in advance!

Bid answered 25/7, 2009 at 19:26 Comment(1)
Related: unix.stackexchange.com/questions/138804/… - Paragrapher

You might consider html2ps and then ImageMagick's convert.

A thought: if your target (Kindle?) supports PNG, use that in preference to JPEG for this text rendering.

Treharne answered 25/7, 2009 at 20:5 Comment(3)
That holds some promise. I'm pretty sure I went down that path before and abandoned it, but I forget why, and my preliminary tests suggest it may work out OK. I'll try to get this going tomorrow or Monday to confirm this solution works. Thanks! - Bid
No dice. Getting the same ImageMagick error at about the same frequency. Must be a PostScript input thing. - Bid
Actually, further experiments showed that the error only occurs, with the html2ps solution, when the source file had trailing whitespace that caused an effectively empty image to be created. So, this works! Many thanks! - Bid
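
Given that finding, trimming trailing whitespace and dropping empty slices before rendering should avoid producing effectively empty images. A minimal sketch, assuming the listing is split into 6-line slices as in the question; the name nonblank_slices is made up for illustration.

```python
def nonblank_slices(text, lines_per_slice=6):
    """Split text into slices, skipping anything that would render as blank."""
    # Strip trailing spaces and tabs from every line.
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop blank lines at the end of the file so they cannot form a slice.
    while lines and not lines[-1]:
        lines.pop()
    chunks = [
        lines[i:i + lines_per_slice]
        for i in range(0, len(lines), lines_per_slice)
    ]
    # A chunk made only of empty lines would still render as an empty
    # image, so it is skipped as well (this also drops all-blank chunks
    # in the middle of a file).
    return ["\n".join(chunk) for chunk in chunks if any(chunk)]
```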

html2ps is an excellent program (I used it to produce a 1300-page book once), but it's overkill if you just want plain text -> PostScript. Consider enscript instead.

Botti answered 19/5, 2010 at 16:46 Comment(0)

Because the question of converting HTML to JPG has been answered, I will offer a suggestion on the pretty printer. I've found Pygments to be pretty awesome. It supports different themes and has lexers for pretty much any language out there (they advertise the fact that it even highlights brainfuck). There's a command line tool and it's available on most Linux distros.
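
For reference, a minimal sketch of calling Pygments from Python rather than from the command line. It assumes Java input and HTML output; the function name highlight_java and the plain-text fallback are choices made for this example, not something Pygments requires.

```python
import html


def highlight_java(source):
    """Return HTML for a Java listing, highlighted when Pygments is available."""
    try:
        from pygments import highlight
        from pygments.formatters import HtmlFormatter
        from pygments.lexers import JavaLexer
    except ImportError:
        # Fallback for machines without Pygments: escaped, unhighlighted text.
        return "<pre>" + html.escape(source) + "</pre>"
    # HtmlFormatter wraps each token in a <span> with a CSS class; the
    # matching stylesheet comes from HtmlFormatter().get_style_defs().
    return highlight(source, JavaLexer(), HtmlFormatter())
```

Pygments also ships image formatters (for example JpgImageFormatter, which needs Pillow and a suitable installed font), so it may be possible to skip the HTML step entirely. That route is untested here.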

Nore answered 19/6, 2010 at 22:31 Comment(0)

Your Linux distribution may include pango-view and an assortment of fonts. This works on my FC6 system:

pango-view --font=DejaVuLGCSansMono --dpi=200 --output=/tmp/text.jpg -q /tmp/text

You'll need to identify a monospaced font that is installed on your system. Look around /usr/share/fonts/.

Pango supports Unicode.

Leave off the -q while you're experimenting; it'll display to a window instead of to a file.
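
If you want to drive this from a script, here is a small sketch that builds the same command line for each slice. It uses only the flags shown above; the function names and the display switch are made up for this example, and the font is whatever monospaced font you found on your system.

```python
import subprocess


def pango_view_command(text_path, output_path,
                       font="DejaVuLGCSansMono", dpi=200, display=False):
    """Build the pango-view argument list used in the command above."""
    command = [
        "pango-view",
        "--font=%s" % font,
        "--dpi=%d" % dpi,
        "--output=%s" % output_path,
    ]
    if not display:
        # -q, as in the command above; leave it off while experimenting.
        command.append("-q")
    command.append(text_path)
    return command


def render(text_path, output_path, **options):
    """Run pango-view and raise CalledProcessError if it fails."""
    subprocess.run(pango_view_command(text_path, output_path, **options),
                   check=True)
```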

Rock answered 27/8, 2009 at 21:28 Comment(0)

Don't use JPEG. It's optimized for photographs and does a terrible job with text and line art. Use GIF or PNG instead. My understanding is that GIF is now patent-free, so I would just use that.

Botti answered 11/5, 2010 at 16:16 Comment(1)
No option on Kindle -- JPEG or bust. - Bid
