I had a similar problem when trying to export a Gitlab wiki to PDF. There, links between pages look like filename-of-page#anchor-name and links within a page look like #anchor-name. I wrote a (finicky and fragile) pandoc filter that solved the problem for me; I'm sharing it in case it is useful to others.
Example files
To explain my solution I'll use two test files, 101-first-page.md:
# First page // Gitlab automatically creates an anchor here named #first-page
Some text.
## Another section // Gitlab automatically creates an anchor here named #another-section
A link to the [first section](#first-page)
and 102-second-page.md:
# Second page // Gitlab automatically creates an anchor here named #second-page
Some text and [a link to the first page](101-first-page#first-page).
When they are concatenated to render as one document in pandoc, links between pages break because their page-qualified targets no longer resolve. Below is the concatenated file, with the anchors in comments.
# First page // anchor=#first-page
Some text.
## Another section // anchor=#another-section
A link to the [first section](#first-page)
# Second page // anchor=#second-page
Some text and [a link to the first page](101-first-page#first-page). // <-- this anchor no longer exists.
The link from the second to the first page breaks as the link target is incorrect.
Solution
By first pre-processing each markdown file individually with a pandoc filter, and then concatenating the resulting JSON files, I was able to get all links working.
Requirements
- pandoc
- latex
- python
- pandocfilters
- Every file should start with a level 1 header that matches the filename (except for the number at the beginning). E.g. the file 101-A file on the wiki.md should have a first level 1 header named A file on the wiki.
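The last requirement can be checked before running the conversion. Below is a minimal sketch, not part of the original filter: the names `slugify` and `starts_with_matching_header` are made up for illustration, and `slugify` only approximates how Gitlab and pandoc derive anchors from header text.

```python
import re


def slugify(text: str) -> str:
    """Rough approximation of an auto-generated anchor (an assumption, not the exact rules)."""
    text = text.strip().lower()
    text = re.sub(r"[^\w\s.-]", "", text)  # drop punctuation other than _ . -
    return re.sub(r"\s+", "-", text)       # whitespace runs become single hyphens


def starts_with_matching_header(filename: str, markdown_text: str) -> bool:
    """True if the first non-blank line is a level 1 header matching the filename."""
    name = filename[:-3] if filename.endswith(".md") else filename
    name = re.sub(r"^\d+-", "", name)  # strip the leading page number, e.g. "101-"
    for line in markdown_text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        match = re.match(r"^#\s+(.+)$", line)
        # Only the first non-blank line is inspected, then we stop.
        return bool(match) and slugify(match.group(1)) == slugify(name)
    return False  # empty file: no header at all
```

Both sides are compared as slugs, so 101-first-page.md with the header "First page" passes, as does 101-A file on the wiki.md with the header "A file on the wiki".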
Filter
The filter itself (together with the pandoc script) is available in this gist.
What it does is:
- It gets the label of the first level 1 header, e.g. first-page.
- It prepends that label to all other labels in the same file, e.g. first-page-another-section.
- It renames all links to the same file such that the prefix is taken into account, e.g. #first-page-first-page.
- It renames all links to other files such that the (assumed) prefix of the other files is taken into account, e.g. 101-first-page#first-page becomes #first-page-first-page.
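The renaming rules in the last two steps can be sketched as a plain function on link targets. This is an illustration under assumptions, not the code from the gist: `page_prefix` and `rewrite_target` are hypothetical names, the prefix of another page is guessed from its filename as described under Requirements, and targets with a URL scheme or without an anchor are left untouched.

```python
import re


def page_prefix(page: str) -> str:
    """Guess the label of a page's first level 1 header from its filename."""
    name = page[:-3] if page.endswith(".md") else page
    name = re.sub(r"^\d+-", "", name)  # drop the leading number, e.g. "101-"
    return re.sub(r"\s+", "-", name.strip().lower())


def rewrite_target(target: str, own_prefix: str) -> str:
    """Rewrite a link target so it points at a prefixed anchor in the merged document."""
    if re.match(r"^[A-Za-z][A-Za-z0-9+.-]*:", target):
        return target  # has a URL scheme (https:, mailto:, ...): external link
    page, sep, anchor = target.partition("#")
    if not sep or not anchor:
        return target  # no anchor to rewrite
    # An empty page part means the link stays within the current file.
    prefix = page_prefix(page) if page else own_prefix
    return f"#{prefix}-{anchor}"
```

With the example files, `rewrite_target("#first-page", "first-page")` and `rewrite_target("101-first-page#first-page", "second-page")` both give `#first-page-first-page`. In an actual filter, a function like this would be applied to the URL of every link element.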
After the script has run every markdown file through this filter individually and converted each one to a JSON file, it concatenates the JSON files and converts the result to a PDF.
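The concatenation step might look roughly like the sketch below. It assumes each JSON file was produced by pandoc's JSON writer, so that every document is an object with `pandoc-api-version`, `meta`, and `blocks` keys. `merge_pandoc_docs` is a hypothetical name, and keeping only the first document's metadata is a simplification chosen here, not necessarily what the gist does.

```python
def merge_pandoc_docs(docs: list) -> dict:
    """Combine several pandoc JSON documents into one by appending their blocks in order."""
    if not docs:
        raise ValueError("need at least one document to merge")
    merged = {
        # Assumption: all inputs come from the same pandoc version,
        # so the first document's API version and metadata are reused.
        "pandoc-api-version": docs[0]["pandoc-api-version"],
        "meta": docs[0]["meta"],
        "blocks": [],
    }
    for doc in docs:
        merged["blocks"].extend(doc["blocks"])
    return merged
```

Each input would be loaded with `json.load`, and the merged object written back with `json.dump` so that pandoc can read it with `-f json` and produce the PDF.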