How to programmatically convert HTML to epub? [closed]
T

7

45

Can I do this conversion with any programming language or library?

Travis answered 11/8, 2010 at 2:40 Comment(3)
Did you wind up settling on a process? Looking for a PHP solution myself. - Bistoury
After trying different programs, I think this one is much better: juliansmart.com/ecub, though it is not open source. - Jeopardize
I think you may be interested in this: github.com/Grandt/PHPePub - Gabe
C
62

The short answer is yes, it can be done in any programming language.

Basic steps:

  1. Convert your HTML to XHTML (+ CSS). This can be done in your program or through an XSLT file.
  2. Copy your files (XHTML, CSS, any images and fonts) into a directory structure that follows the EPUB format.
  3. Zip the directory structure up, with the mimetype file as the first entry and stored uncompressed, and name the archive with a ".epub" extension.
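
The steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a complete converter: it assumes the chapter content is already well-formed XHTML, it writes a single chapter, and the names build_epub, OEBPS/content.opf, nav.xhtml and chapter1.xhtml are choices made for this example. The result has not been run through epubcheck here, so validate the output before relying on it.

```python
import zipfile
from datetime import datetime, timezone
from xml.sax.saxutils import escape

# Points the reading system at the package file (path is this example's choice).
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package document: metadata, manifest (all files) and spine (reading order).
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">{identifier}</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol>
    </nav>
  </body>
</html>
"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>{title}</title></head>
  <body>{body}</body>
</html>
"""


def build_epub(path, title, identifier, body_xhtml):
    """Write a one-chapter EPUB to `path`.

    `body_xhtml` is inserted as-is, so it must already be well-formed XHTML.
    """
    safe_title = escape(title)
    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry must come first and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", CONTENT_OPF.format(
            identifier=escape(identifier), title=safe_title, modified=modified))
        zf.writestr("OEBPS/nav.xhtml", NAV_XHTML)
        zf.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML.format(
            title=safe_title, body=body_xhtml))
```

Calling build_epub("book.epub", "My Book", "urn:uuid:...", "<p>Hello</p>") with an identifier of your own would produce a single archive. Images, CSS and further chapters would each need their own manifest entry.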

Some web sites to help you get started:

June 2015 Note: The epubcheck validator has moved from Google Code to GitHub; note the new URL.

Chew answered 9/12, 2010 at 23:21 Comment(0)
M
16

Calibre supports a wide variety of input formats, including HTML, and a wide variety of output formats, including EPUB, but it's not "a programming language or library". Are there specific reasons you desire a programming-based approach rather than a free-standing tool? If so, maybe Python and ebookmaker.py, for example, could help you.

Mizzenmast answered 11/8, 2010 at 2:45 Comment(5)
I want to automatise a process. - Travis
calibre can be run from the command line. - Accroach
Agreed. I've used it from the command line myself, integrated with some bash scripting. Good for small-ish books. My larger pubs created by Calibre never pass validation. - Measurable
Unfortunately Calibre does not support ePub 3.0. - Marcelmarcela
@Marcelmarcela It does now. - Leyba
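
For automating the command-line route mentioned in the comments, a rough Python sketch might look like the following. It assumes calibre is installed and that its ebook-convert tool is on the PATH; the helper names calibre_command and convert_with_calibre are made up for this example, and no extra conversion options are passed.

```python
import shutil
import subprocess


def calibre_command(source_html, target_epub):
    """Build the argument list: ebook-convert <input file> <output file>."""
    return ["ebook-convert", source_html, target_epub]


def convert_with_calibre(source_html, target_epub):
    """Run calibre's converter and raise if it is missing or fails."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is calibre installed?")
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(calibre_command(source_html, target_epub), check=True)
```

The output format is inferred by calibre from the target file's extension, so passing a path ending in ".epub" is what selects EPUB output.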
C
4

A late reply, but I found the Python 3-based ebookmaker to be of value, at least after I contributed a pull request to remove a UTF-8 BOM. One problem with it appears to be that it uses brittle regular expressions to parse HTML, but I guess I'll have to report it there.
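
To illustrate the point about regular expressions (this is not ebookmaker's code, only a sketch of the alternative), Python's standard html.parser can pull chapter headings out of HTML without pattern matching on raw markup. The class and function names below are invented for this example, and restricting it to h1 and h2 is an arbitrary choice.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of h1 and h2 elements as the parser encounters them."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._in_heading = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, whatever the case in the source.
        if tag in ("h1", "h2"):
            self._in_heading = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_heading:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("h1", "h2") and self._in_heading:
            self.headings.append("".join(self._buffer).strip())
            self._in_heading = False


def extract_headings(html):
    """Return the heading texts found in `html`, in document order."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings
```

Unlike a typical regular expression, this copes with uppercase tags, quoted attribute values containing ">", and inline markup nested inside the heading.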

Civilly answered 22/3, 2013 at 21:4 Comment(0)
B
1

Here's PDF to EPUB. I know that's not what you're after, but it's a start.

The calibre package may have what you want.

Brookbrooke answered 11/8, 2010 at 2:54 Comment(0)
A
1

I am using the following library from Aspose - http://www.aspose.com/categories/.net-components/aspose.words-for-.net/default.aspx

In just two lines of code I am able to do HTML-to-EPUB conversions. I am currently using this in a production system.

using Aspose.Words;

// Load the source HTML file, then save it in EPUB format.
Document doc = new Document(_sourceFilePath);
doc.Save(_destinationFilePath, SaveFormat.Epub);

Annabel answered 30/9, 2010 at 19:33 Comment(0)
C
1

I just started to implement such a tool in Java (OpenJDK compatible): html2epub. To avoid editing the config file manually, I'll probably start a separate tool that generates the config file from any given directory. It would still be necessary to determine the order of the XHTML files in the EPUB, though: for non-programmatic use, a GUI helper tool could be considered; for a fully flexible programmatic solution, I haven't come up with an idea yet. Before that, I implemented shell-script-based converters for custom XML input (hag2epub tools). If you're interested, I would probably port them to XHTML input, with the EPUB metadata coming from a config file or from the topmost index.html of a directory, if one exists.

Cryology answered 7/1, 2014 at 17:1 Comment(4)
If you licensed it under Apache 2.0, it would be the way to go for many people. As it's under the AGPL, I cannot use it. It's a pity :( - Cleocleobulus
Could you please tell me how the AGPL could possibly be blocking you from using it? - Cryology
I cannot use AGPL code in a commercial product. - Cleocleobulus
That's simply not true. On the contrary: being usable in a commercial context or as a commercial product is an important right that free software grants its users, and the AGPL does so too. - Cryology
S
0

I had the same issue previously, because I wanted to read some webpage content offline on my iPad. I had no idea how, and I am not computer savvy. There are calibre, Stanza and so on.

But to me they are just format converters, and I needed an ePub book creator that would allow me to combine many documents together to read. Then I found a bookish HTML-to-ePub converter: I save the HTML page from the web, then convert it with the tool. It's quite a good tool for me now.

Scrap answered 18/5, 2011 at 2:15 Comment(0)
