Set html title from the first header with pandoc

I have a typical README.md, the kind people write for GitHub and similar sites. Pandoc generates an empty title:

 <title></title>

I want the first header in the file to appear there. So if my .md file contains

# My README
text
## Second header

Pandoc should generate

<title>My README</title>

It would be even better if the title were taken from the first level-1 (single-#) header. So for

### Preface
# My README
text
## Second header

the result should still be

<title>My README</title>

In any case, I want to avoid adding metadata to my .md file that is not part of standard Markdown.

Arceliaarceneaux answered 9/3, 2017 at 22:2 Comment(6)
You could write it as a filter, in python/perl/js/php/etc. The filter would just grab the first level-1 header and use it to set the title metadata.Hintz
@sergioCorreia, how? Please explain.Arceliaarceneaux
Why not use title: "my title" in the YAML front matter? Pandoc passes that to <title>my title</title>.Sodomy
@scoa, because other renderers will not hide the front matter. Not good.Arceliaarceneaux
For info on Pandoc filters, see Scripting with pandoc. I would suggest trying it on your own. If you run into problems, then come back and ask a more specific question about the problems you are having with your code.Hatten
@kyb, I have shown a python filter example in an answer.Unhelm
11

Pandoc has an option for this: --shift-heading-level-by. The option can be used to promote or demote all headings in the document, and the first heading that's shifted "above top level" is used as the title.

In the case of a default GitHub README.md, the following would work best:

pandoc --shift-heading-level-by=-1 --standalone --from=gfm ...

The format specifier gfm stands for "GitHub Flavored Markdown".

The downside of this approach is that second-level headings are output as <h1> elements, which may be undesired. In that case, the most general solution is to use a Lua filter:

function Pandoc (doc)
  -- walk returns a modified copy, so assign the result back
  doc.blocks = doc.blocks:walk {
    Header = function (h)
      -- use top-level heading as title, unless the doc
      -- already has a title
      if h.level == 1 and not doc.meta.title then
        doc.meta.title = h.content
        return {}  -- remove this heading from the body
      end
    end
  }
  return doc
end

To use the filter, save it to a file such as readme-title.lua and pass it to pandoc via --lua-filter=readme-title.lua.

Dulse answered 16/9, 2022 at 8:12 Comment(0)
7

pandoc -s test.md -o test.html --metadata title=titleName
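The command above needs the title to be supplied by hand. As a rough sketch of how the title could be pulled from the first level-1 heading instead, the Python below scans the Markdown text and builds the same pandoc argument list. The function names first_h1 and pandoc_command are made up for this illustration, only ATX-style ("# ...") headings are recognized, and fenced code blocks are handled in a simplified way.

```python
import re

# Opening or closing fence of a code block (``` or ~~~).
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")
# ATX level-1 heading: a single '#', whitespace, text, optional closing '#'s.
H1 = re.compile(r"^ {0,3}#[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$")


def first_h1(markdown_text):
    """Return the text of the first level-1 ATX heading, or None."""
    fence = None  # fence character while inside a fenced code block
    for line in markdown_text.splitlines():
        m = FENCE.match(line)
        if m:
            char = m.group(1)[0]
            if fence is None:
                fence = char      # entering a code block
            elif char == fence:
                fence = None      # leaving it (simplified rule)
            continue
        if fence is None:
            h = H1.match(line)
            if h:
                return h.group(1).strip()
    return None


def pandoc_command(source, target, title):
    """Build the argument list for the pandoc call shown above."""
    return ["pandoc", "-s", source, "-o", target,
            "--metadata", f"title={title}"]
```

The resulting list could be passed to subprocess.run. Building a list rather than a shell string avoids quoting problems when the heading contains spaces or quotes.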

Swearword answered 13/9, 2018 at 15:55 Comment(5)
Please explain how this solution helps in answering the question.Bradshaw
I'd add titleName=grep -m1 '^#\s+.*' to extract the first line that begins with '#'. Thanks.Arceliaarceneaux
This worked great for me. --metadata title="" removed the title from my output, which is what I was looking for.Stanley
This adds the given title to the HTML of the page. I just want it to add <title>My awesome title</title> to the head section of the document. Is this possible? I'm already using a <h1> as my title.Jingo
--metadata title="TITLE HERE" didn't work for me, but --metadata=title:TITLE HERE didPressroom
3

Sometimes you just have some plain Markdown files, with no meaningful filename, no YAML front matter magic or the like, and you still want the output to have a sensible title without too much manual intervention. It's also pretty common for web pages that the title and the first headline are the same (look at the source code of this exact page: you'll find 'Set html title from the first header with pandoc' in the title and within an h1 tag).

For my purposes I have written a small Lua filter ...

local headline = ""

to_string = {

    Str = function(element)
        headline = headline .. element.text
    end,

    Space = function()
        headline = headline .. " "
    end
}

function Pandoc(doc)

    for i,element in pairs(doc.blocks) do
        if element.t == "Header" and element.level == 1 then
            element:walk(to_string)
            break
        end
    end

    print(headline)
    os.exit(0)
end

... and saved it as first_headline.lua

I can then use this filter for example in my shell scripts ...

#!/bin/bash
INPUT="$1"
TITLE=$(pandoc --lua-filter=first_headline.lua "$INPUT")
pandoc -s --to=html --metadata title="$TITLE" "$INPUT"

... and in theory this should work for any input format that comes with headlines and any output format that supports a document title.

However, the key point here is the underlying idea: with some tinkering, this approach should be helpful for many similar use cases.

From answered 16/12, 2022 at 23:23 Comment(2)
Error running filter first_headline.lua: first_headline.lua:18: attempt to call a nil value (method 'walk')Fad
sry @IvanSveshnikov ... it did a year ago iirc - however: if the solution by johndoe works, then it definitely deserves some upvotes ... haven't tried it, but it looks nice and by far more conciseFrom
2

This is easily achievable with a Lua filter.

Create HeaderToTitle.lua:

local title
function Header(el)
  if title then return end
  title = pandoc.utils.stringify(el)
end

function Meta(el)
  if not el.pagetitle then
    el.pagetitle = title
    return el
  end
end

Now add the following option to the pandoc command line: --lua-filter=HeaderToTitle.lua

Semiliquid answered 18/4, 2023 at 20:18 Comment(2)
Similar solution is here https://mcmap.net/q/975154/-pandoc-set-document-title-to-first-titleSemiliquid
yours actually worksFad
1

If you are generating a PDF, then an alternative workaround could be:

pandoc inputname.md -o outputname.pdf -V "title:My Desired Title"
Operative answered 28/7, 2022 at 9:48 Comment(0)
0

Suppose you have a README.md file and want to convert it into README.html, using the first level-1 Markdown heading as the HTML title.

You can run pandoc with a custom filter (using Python as an example).

  1. Save the following Python script as filter.py
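from pandocfilters import toJSONFilter


def behead(key, value, format, meta):
    # For a Header element, value is [level, attributes, inlines].
    if key == "Header" and value[0] == 1 and "title" not in meta:
        meta["title"] = {"t": "MetaInlines", "c": value[2]}
        # An empty list deletes the header from the body
        # (the Null block is no longer accepted by pandoc 3.x).
        return []


if __name__ == "__main__":
    toJSONFilter(behead)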
  2. Add execute permission to the file
chmod +x filter.py
  3. Run pandoc with the filter
pandoc -s README.md -o README.html --filter filter.py
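To make it clearer what such a filter does to the document, here is a sketch of the same transformation applied directly to pandoc's JSON AST, using only the Python standard library. The function names promote_first_h1 and main are made up for this illustration, and unlike a walking filter it only looks at top-level blocks, which is an assumption that holds for ordinary README headings.

```python
import json
import sys


def promote_first_h1(doc):
    """Move the first top-level level-1 Header into meta.title.

    `doc` is a parsed pandoc JSON document with "meta" and "blocks".
    An existing title is left untouched.
    """
    meta = doc.setdefault("meta", {})
    if "title" in meta:
        return doc
    blocks = doc.get("blocks", [])
    for i, block in enumerate(blocks):
        # A Header's content is [level, attributes, inlines].
        if block.get("t") == "Header" and block["c"][0] == 1:
            meta["title"] = {"t": "MetaInlines", "c": block["c"][2]}
            del blocks[i]  # remove the heading from the body
            break
    return doc


def main():
    # A JSON filter reads the AST on stdin and writes it to stdout.
    json.dump(promote_first_h1(json.load(sys.stdin)), sys.stdout)
```

Calling main() at the bottom of the script would let it be used with --filter in the same way as filter.py.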
Unhelm answered 15/4, 2020 at 16:43 Comment(0)
-1

Try this:

pandoc -s README.md -o README.html -V "pagetitle:My README"
Communitarian answered 15/9, 2022 at 22:41 Comment(0)