Special characters not showing in pandoc html output

4

29

I am trying to get special characters (for foreign surnames) working in pandoc. I followed the instructions here and made sure all special characters are represented using UTF encoding (as per this page). I chose the HTML Entity (decimal) option. The resulting files work well when converting to docx or pdf, but not to html. Is there an encoding that will work for all three output types, or do I need to include some other option?

Here is a line of markdown code for conversion using the special character encoding

some example text with special characters &#197;, &#228;, &#246;

which should print as

some example text with special characters Å, ä, ö

pandoc commands

pandoc example.md -o example.docx  # Works

pandoc example.md -o example.pdf   # Works

pandoc example.md -o example.html  # Doesn't work

Running the file through iconv does not change the output behaviour:

iconv -t utf-8 example.md | pandoc -o example.html  # Doesn't work
Collocutor answered 20/1, 2014 at 1:4 Comment(0)
D
47

Try

pandoc example.md -s -o example.html

instead. The additional -s (for "stand-alone") makes pandoc insert the necessary metadata to create a full HTML file instead of just the HTML snippet that directly corresponds to the text in example.md. As part of the metadata, pandoc also declares that the file is UTF-8 encoded. Your browser needs this piece of information to display the special characters correctly.

If you cannot use the -s flag for some reason, make sure to tell the browser that the file is UTF-8 encoded some other way.
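
To illustrate why the encoding declaration matters, here is a minimal Python sketch. It is not part of pandoc: the sample fragment and the helper name wrap_fragment are made up for illustration, and Windows-1252 is only an assumed example of what a browser might guess. It first shows how UTF-8 bytes look when decoded with the wrong encoding, then wraps a fragment in a small page that declares UTF-8, which is roughly the part of -s that matters here.

```python
# Sample fragment standing in for pandoc's output without -s (an assumption).
fragment = "<p>some example text with special characters Å, ä, ö</p>"

# Pandoc writes UTF-8. If the browser guesses a legacy encoding such as
# Windows-1252, the same bytes decode to different characters.
utf8_bytes = fragment.encode("utf-8")
misread = utf8_bytes.decode("cp1252")


def wrap_fragment(body: str, title: str = "example") -> str:
    """Wrap an HTML fragment in a minimal page that declares UTF-8."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{title}</title>\n"
        "</head>\n<body>\n"
        f"{body}\n"
        "</body>\n</html>\n"
    )


page = wrap_fragment(fragment)
print(misread)  # garbled characters in place of the accented letters
print(page)     # full page with an explicit charset declaration
```

If the page is written to disk, it should be saved as UTF-8 as well (for example with open(..., "w", encoding="utf-8")), so that the declaration matches the actual bytes.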

Dey answered 12/2, 2014 at 1:8 Comment(2)
This does not work if there are UTF-8 Chinese characters. I had to resort to the browser's text encoding setting. - Dermatogen
Not useful if you're using summary.md and not standalone output. - Norvil
M
3

You could also use the option --ascii to produce pure ASCII output, with special characters encoded as entities.
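
The idea behind this option can be illustrated outside pandoc. The Python sketch below is not how pandoc implements --ascii, and the exact entity form pandoc emits may differ. It only shows that replacing non-ASCII characters with numeric character references gives output that no longer depends on the encoding the browser assumes.

```python
import html

text = "some example text with special characters Å, ä, ö"

# Replace every non-ASCII character with a decimal character reference.
ascii_only = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_only)

# Decoding the entities gives back the original characters.
round_trip = html.unescape(ascii_only)
print(round_trip == text)
```

Because the result contains only ASCII bytes, it reads the same under UTF-8, ISO-8859-1 and most other encodings a browser might guess.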

Mingo answered 20/4, 2021 at 12:12 Comment(1)
This worked for me when using PowerShell to copy markdown into a format I could paste into Microsoft Teams: Get-Clipboard | pandoc --ascii | Set-Clipboard -AsHtml - Dolf
N
1

If you are using summary.md and cannot use -s for standalone output, add the following inside the <head> tag of _layouts/default.html:

 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Norvil answered 9/1, 2019 at 18:6 Comment(0)
V
0

In index.html, change data-charset="iso-8859-15" to data-charset="utf-8". For example:

    <section
        data-markdown="slides/demo.md"
        data-separator="\n---\n"
        data-separator-vertical="^\n\n"
        data-separator-notes="\n> >"
        data-charset="utf-8">
    </section>
Valorievalorization answered 16/5, 2021 at 12:52 Comment(1)
Welcome to StackOverflow! Can you share how this code was generated? It doesn't look as if it was produced by pandoc. - Latialatices
