Character escaping in Pandoc filter pandoc.Para Lua function

I'm using Pandoc for a large HTML-to-Markdown conversion project and am trying to write Lua filters to handle some of the special cases.

The most common case I am trying to handle is converting specially formatted information boxes into the pymarkdown summary/detail formatting.

Source HTML

<div class="special-info-block">
  <p class="title">INFO</p>
</div>

Goal Markdown

???+ info "INFO"

I can use this function to replace the "INFO":

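function Div(el)
   -- el.classes is a pandoc List; includes() matches the class at any position
   if el.classes:includes("special-info-block") and pandoc.utils.stringify(el.content[1]) == "INFO" then
      -- Replace the title paragraph with the admonition header from the goal above
      el.content[1] = pandoc.Para('???+ info "INFO"')
      return el
   end
end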

but the resulting markdown escapes the quotation marks around INFO:

???+ info \"INFO\"

How do I insert the literal string instead? Is this a feature of the pandoc.Para constructor or should I be looking elsewhere?

Hocus asked 15/12, 2020 at 16:30

The escaping happens in the Markdown writer during output generation, not in the pandoc.Para constructor, so there are two options here:

  1. Call pandoc with -t markdown-smart, which will instruct the Markdown writer to treat quotes as normal chars;

  2. Create a raw Markdown block instead of a Para to get maximal control over the output: el.content[1] = pandoc.RawBlock('markdown', '???+ info "INFO"').

Both methods should give the desired result, but the second is probably preferable.
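As a rough sketch of the second option, the complete filter might look something like the following. The helper name admonition_header is hypothetical (it is not part of pandoc), and matching the class with the List method includes instead of a fixed index is an assumption about how the div should be detected.

```lua
-- Hypothetical helper (not a pandoc function): builds the admonition
-- header line, e.g. ???+ info "INFO".
function admonition_header(kind, title)
  return string.format('???+ %s "%s"', kind, title)
end

-- Filter function that pandoc calls for every Div element.
function Div(el)
  -- classes is a pandoc List, so includes() finds the class at any position.
  if not el.classes:includes("special-info-block") then
    return nil -- returning nil leaves other divs unchanged
  end
  local first = el.content[1]
  if first and pandoc.utils.stringify(first) == "INFO" then
    -- A raw block is meant to be passed through by the Markdown writer
    -- without escaping, unlike the text inside a Para.
    el.content[1] = pandoc.RawBlock("markdown", admonition_header("info", "INFO"))
    return el
  end
end
```

Assuming the filter is saved as admonition.lua (a hypothetical file name), it could be run with something like: pandoc input.html --lua-filter=admonition.lua -t markdown -o output.md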

Jankey answered 17/12, 2020 at 15:5

Comment (Hocus): Using -t markdown-smart worked perfectly! I had actually tried using RawBlock initially, but when I do, I end up with this in my output:

```{=markdown}
??? info "INFO"+
```
