Importing HTML into Adobe InDesign
V

7

18

We are currently working on a PDF version of a newspaper at work. We have a .NET website that captures the articles to publish and stores the entered content as HTML, so we can maintain styles like bold, underline, and strikeout.

Once this is stored in the database, we plan to use InDesign to create the PDF. We currently have a template built, but when we generate an XML document and import it into InDesign, the HTML tags are just written out as literal text. Is there a way around this, so that InDesign applies the tags as they would render in HTML? We just need some simple ones, like bold, strikeout, underline, and center align.

Thanks.

Vinegarroon answered 1/3, 2012 at 14:38 Comment(1)
See also this similar question. - Thanhthank
P
5

You'll need to translate the HTML tags into CharacterStyles, and apply those to the XML on import.

The tricky thing is that CharacterStyles can't be nested the way HTML tags can, so you need to make a CharacterStyle for each combination that might be present (for example, one for bold, one for underline, and one for bold plus underline). Or you can apply styles to the specific run of text with a script.
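To illustrate the "one style per combination" idea, here is a minimal Python sketch that flattens nested inline tags into text runs, each carrying a single combined style name. The tag-to-style mapping, the "+"-joined naming scheme, and the helper names are assumptions made up for this example; they are not InDesign API. The styles themselves would still have to exist in the InDesign template under whatever names you choose.

```python
from html.parser import HTMLParser

# Hypothetical mapping from HTML inline tags to style-name fragments.
TAG_TO_STYLE = {
    "b": "Bold",
    "strong": "Bold",
    "u": "Underline",
    "s": "Strikeout",
    "strike": "Strikeout",
}

# Assumed name for "no character style"; adjust to match your template.
NO_STYLE = "[None]"


class RunCollector(HTMLParser):
    """Split HTML into text runs, each tagged with one combined style name."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open inline tags
        self.runs = []       # list of (style_name, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in TAG_TO_STYLE:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the most recently opened matching tag, if any.
        if tag in self.open_tags:
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

    def handle_data(self, data):
        # Sorting makes <b><u> and <u><b> resolve to the same style name.
        parts = sorted({TAG_TO_STYLE[t] for t in self.open_tags})
        style = "+".join(parts) if parts else NO_STYLE
        self.runs.append((style, data))


def html_to_runs(html):
    """Return a list of (combined_style_name, text) pairs for the given HTML."""
    parser = RunCollector()
    parser.feed(html)
    parser.close()
    return parser.runs
```

Each resulting run can then be written out as one tagged XML element (or one styled text range), so nesting never reaches InDesign.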

Pomology answered 18/3, 2012 at 16:38 Comment(1)
The group working on the project ended up doing something similar to this. They opened up the IDML file (a package that contains XML files), converted the HTML into character styles, recreated the XML files needed, and packaged everything back into an IDML file, which InDesign could then open. - Vinegarroon
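For readers who want to try the approach described in this comment: an IDML file is a ZIP package, so it can be inspected with the Python standard library. The sketch below is an illustration only, not the group's actual code. It assumes the story files live under a "Stories/" folder inside the package and that the "mimetype" entry should stay first and uncompressed when repackaging; the function names are hypothetical.

```python
import zipfile


def list_story_files(idml_path):
    """Return the names of the story XML files inside an IDML package."""
    with zipfile.ZipFile(idml_path) as package:
        return sorted(
            name for name in package.namelist()
            if name.startswith("Stories/") and name.endswith(".xml")
        )


def read_story(idml_path, story_name):
    """Return one story file from the package as text."""
    with zipfile.ZipFile(idml_path) as package:
        return package.read(story_name).decode("utf-8")


def replace_story(src_path, dst_path, story_name, new_xml):
    """Copy the package to dst_path, swapping in new XML for one story."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():  # keeps the original entry order
            if info.filename == story_name:
                data = new_xml.encode("utf-8")
            else:
                data = src.read(info.filename)
            # Assumption: "mimetype" is stored uncompressed, as in similar formats.
            if info.filename == "mimetype":
                compression = zipfile.ZIP_STORED
            else:
                compression = zipfile.ZIP_DEFLATED
            dst.writestr(info.filename, data, compress_type=compression)
```

The new story XML itself (with the character styles applied) still has to be generated separately; this only covers opening and repackaging.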
C
6

Pandoc now supports export to ICML (Adobe InCopy's XML format, which can be "placed" in InDesign documents). To convert HTML to ICML:

pandoc --standalone -o output.icml input.html

See Importing Markdown in InDesign in the pandoc wiki for details around the workflow.
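If there are many articles, the same command can be run in a loop. The following is a rough sketch, assuming pandoc is installed and on the PATH and that the articles are *.html files in a single folder; the function names and folder layout are made up for illustration.

```python
import subprocess
from pathlib import Path


def pandoc_icml_command(html_path, out_dir):
    """Build the pandoc command line for converting one HTML file to ICML."""
    html_path = Path(html_path)
    out_path = Path(out_dir) / (html_path.stem + ".icml")
    return [
        "pandoc", "--standalone",
        "--from", "html", "--to", "icml",
        "-o", str(out_path), str(html_path),
    ]


def convert_all(in_dir, out_dir):
    """Convert every *.html file in in_dir to an .icml file in out_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for html_path in sorted(Path(in_dir).glob("*.html")):
        # check=True raises CalledProcessError if pandoc reports a failure.
        subprocess.run(pandoc_icml_command(html_path, out_dir), check=True)
```

Calling convert_all("articles", "icml") would then produce one .icml file per article, ready to be placed in the InDesign template.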

Cristencristi answered 10/7, 2014 at 20:25 Comment(1)
This is the best answer! Pandoc is also the best way to get a first look at the ugly .icml format (the InDesign XML is closed off from the W3C's open XHTML+CSS ecosystem). We do not need InDesign for automated generation of high-quality PDF content... See also print-css.rocks - Thanhthank
T
1

Adobe products are "closed" to importing universal standards (!) such as XHTML.

How to PROTEST against Adobe?!

The biggest problem arises when we have many files...


A batch-processing solution (for many articles)

... The only way I can use today (2013) is this (semi-automatic) procedure:

  1. [manual, prepare] Check my InDesign "template" file, which will be used as the "importer": styles with legible names must be defined. PS: they are all visible (listed) in an HTML+CSS export.
  2. [manual, prepare] Adapt my (X)HTML files to express all relevant styles with CSS class names (not with style attributes, and not with strange class names).
  3. [automatic, batch processing] Convert all my (X)HTML files to DOC automatically, using Python OpenDocument Converter.
  4. [InDesign assisted, final processing] Import each DOC into a clone of the "template" file (item 1) in InDesign. The classes (item 2) will be automatically transformed into InDesign styles.
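As a helper for item 2, a small script can flag files that still rely on style attributes and list the class names in use, so they can be compared against the style names in the template. This is only a sketch using the Python standard library; the class and function names are invented for the example, and it does not perform the DOC conversion of item 3.

```python
from html.parser import HTMLParser


class StyleAttributeFinder(HTMLParser):
    """Collect elements with inline style attributes and all class names used."""

    def __init__(self):
        super().__init__()
        self.inline_styled = []   # (tag, line_number) for each style="..." found
        self.class_names = set()  # every CSS class name seen in the document

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "style" in attrs:
            self.inline_styled.append((tag, self.getpos()[0]))
        # A class attribute may hold several space-separated names.
        for name in (attrs.get("class") or "").split():
            self.class_names.add(name)


def audit_html(html):
    """Return (elements_with_inline_style, sorted_class_names) for one document."""
    finder = StyleAttributeFinder()
    finder.feed(html)
    finder.close()
    return finder.inline_styled, sorted(finder.class_names)
```

A file is ready for the batch step when the first list is empty and every class name in the second list has a matching style in the template.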

This procedure is better than IDML because it uses the XHTML directly as the content source for InDesign. It is not perfect for all applications, but it avoids a non-standard conversion through IDML, avoids having to learn IDML, avoids IDML's limitations, and avoids the risk of IDML bugs... So I think it is faster than repeatedly trying IDML procedures.


Another procedure, better because it can express things like footnotes, is to prepare a direct conversion from XML to MS Word, using an XSLT that transforms XML into DOCX or RTF... Do you have a link or clue for this kind of procedure?

Thanhthank answered 1/3, 2012 at 14:38 Comment(1)
InDesign has one of the best extensibility layers of any software on the planet, and the IDML format is XML. This doesn't seem like a good target to complain about "closed" standards. There is an open format (IDML), it's just that the binary (INDD) executes faster. You can still create and edit an IDML (like you can with a DOCX) and open it. - Klein
B
1

We have had some bad experiences importing XML into InDesign directly.

If you are still having trouble with this issue, check out the open source Ickmull code library. It converts an XHTML file to an IDML file, which can then be opened in InDesign. This might be a better web-to-print workflow for you.

http://code.google.com/p/ickmull/

Breana answered 12/5, 2012 at 4:2 Comment(0)
A
1

Maybe you can use a Markdown to InDesign translator as a starting point: http://www.jongware.com/markdownid.html

Apocalypse answered 6/11, 2012 at 23:7 Comment(0)
K
1

This tool is a decent HTML to InDesign importer: https://www.id-extras.com/html-import-script

It may take some rework, but it brings in styles that you can edit and has saved me a bunch of time.

Klein answered 29/5, 2018 at 15:27 Comment(0)
P
0

This is an old question, but the problem is probably perennial.

Here is an easy real-world technique. It may not be perfectly suited to an automatic workflow, but is perfect for occasional use.

  1. Copy the html code, for example from the source view of the browser. Omit the head part, css, menus, etc., and copy only the relevant content which may be enclosed in a series of div, section or other container tags.

  2. Paste into a plain text document (Notepad on Windows, TextEdit on Mac) and save as a plain text file with a .html extension.

  3. Open the html file with LibreOffice. I tried with versions 4 and 6, and they both parse html just fine. You get a document with paragraph styles (like headings) and character styles (like bold and italic). Optionally select all and change the font to Times New Roman. Save as a .docx file, or some other file type.

  4. Import this to InDesign with options for preserving styles and formatting and importing styles automatically. You get a document with paragraph styles and character styles which you may edit as you wish.

Photodrama answered 18/5, 2020 at 17:7 Comment(0)
