I am writing a Word document with R Markdown in RStudio. I can get many things to work, but at the moment I cannot figure out how to insert a page break. I have found solutions, but only for documents rendered to LaTeX/PDF, which is not my case.
Added: To insert a page break, please use \newpage for formats including LaTeX, HTML, Word, and ODT.
https://bookdown.org/yihui/rmarkdown-cookbook/pagebreaks.html
Paragraph before page break.

\newpage

First paragraph on a new page.
Previously: There is a way to do this using a fifth-level header block (#####) and a docx template defined in YAML.
After creating headingfive.docx in Microsoft Word, select Modify Style for Heading 5, then select Page break before in the Line and Page Breaks tab, and save the headingfive.docx file.
---
title: 'Making page break using fifth-level header block'
output:
  word_document:
    reference_docx: headingfive.docx
---
In your Rmd document, you define reference_docx in the YAML header, and now you can use the page-breaking #####.
Please see below.
https://www.r-bloggers.com/r-markdown-how-to-insert-page-breaks-in-a-ms-word-document/
With the help of John MacFarlane and others on the pandoc google group, I put together a filter that does this. Please see:
https://groups.google.com/forum/#!topic/pandoc-discuss/FzLrhk0vVbU
In short, the filter needs to look for something to replace with the OpenXML for a page break. In this case \newpage is being replaced with
<w:p><w:r><w:br w:type=\"page\"/></w:r></w:p>
This allows a single LaTeX markup to be interpreted for both PDF and Word output.
Joel
What you are trying to do is force a "page break" or "new page" in a Word document generated with Pandoc. I have found a way to do this in my environment, but I'm not sure it will work in every environment.
My environment: RStudio / Pandoc / MS Word, starting with an "*.Rmd" file and generating a DOCX file.
In my RMD file, the key idea is that I've created what acts like a TEMPLATE document (MyFormattingDocument.docx), and in that Word document I tweak the STYLES for things like "Heading 1" and/or "Heading 2" and/or "footnote" or whatever other predefined styles I want to tweak.
(See http://rmarkdown.rstudio.com/word_document_format.html#style-reference for an explanation of the style reference and how to set the header information in your RMD file to specify a reference document.)
So in my case, I tweak the "Heading 1" style in Word to include a forced "Page Break Before" in the paragraph formatting for "Heading 1". Exactly how you force every "Heading 1" to always "Page Break" differs between versions of Microsoft Word, but if you follow the Word documentation and modify the "Heading 1" style, then every "Heading 1" will always have a page break before it.
Then you save this template file in the same directory you're working from with the RMD file, and it is USED AS a template. The CONTENTS of the file are ignored, so don't worry: you can put sample text in this file and test that the formatting all works. The contents are ignored, but the STYLES are used in the new Word document built from the RMD file, so every "Heading 1" will have a break before it.
NOTE: You could obviously do the same with ANY style that has a one-to-one mapping from Pandoc markup, so you could instead use "Heading 3" or whatever. Just look and see in your RMD-created DOCX what style is being applied and then tweak that style, even if you need to insert some "fake" lines with essentially blank content just for the purpose of forcing a style to appear in the DOCX.
Here is an R script that can be used as a pandoc filter to replace LaTeX page breaks (\newpage) with Word page breaks, per @JAllen's answer above. With this you don't need to compile a pandoc script. Since you are working in R Markdown, I assume you have R available on the system.
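#!/usr/bin/env Rscript
# Pandoc JSON filter: swap LaTeX \newpage raw blocks for Word (OpenXML) page breaks.
json_in <- file('stdin', 'r')
# JSON AST fragments to look for and to substitute (matched as fixed strings)
lat_newp <- '{"t":"RawBlock","c":["latex","\\\\newpage"]}'
doc_newp <- '{"t":"RawBlock","c":["openxml","<w:p><w:r><w:br w:type=\\"page\\"/></w:r></w:p>"]}'
# Read the whole AST that pandoc sends on stdin
ast <- paste(readLines(json_in, warn=FALSE), collapse="\n")
close(json_in)
ast <- gsub(lat_newp, doc_newp, ast, fixed=TRUE)
# Write the modified AST to stdout for pandoc to pick up
write(ast, "")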
Save this as page-break-filter.R or something like that and make it executable by running chmod +x page-break-filter.R in the terminal.
Then include this filter in the R Markdown YAML like so:
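To see what the substitution does without involving pandoc, here is a minimal sketch that applies the same fixed-string replacement to a hand-written AST string. The sample_ast value and the replace_page_breaks() helper are hypothetical and only for illustration; the JSON that pandoc actually emits may be serialized differently depending on the pandoc version.

```r
# Same search/replace fragments as in the filter script above
lat_newp <- '{"t":"RawBlock","c":["latex","\\\\newpage"]}'
doc_newp <- '{"t":"RawBlock","c":["openxml","<w:p><w:r><w:br w:type=\\"page\\"/></w:r></w:p>"]}'

# Hypothetical helper wrapping the gsub() call used by the filter
replace_page_breaks <- function(ast) {
  gsub(lat_newp, doc_newp, ast, fixed = TRUE)
}

# Hand-written stand-in for the JSON AST pandoc would send on stdin (assumption)
sample_ast <- paste0('{"blocks":[', lat_newp, ']}')

out <- replace_page_breaks(sample_ast)
cat(out, "\n")
```

Because the match is an exact string comparison, the replacement only fires when the raw block in the AST is serialized exactly like lat_newp; otherwise the input passes through unchanged.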
---
title: "Title"
author: "Author"
output:
  word_document:
    pandoc_args: [
      "--filter", "/path/to/page-break-filter.R"
    ]
---
Error running filter page-break-filter.R: Error in $: Failed reading: not a valid json value. Also, incredibly bizarrely, every time I try to render the Rmd, it deletes page-break-filter.R and a bunch of other source files. That doesn't happen when I don't include the pandoc_args in my YAML. – Bosky
You can use the R package worded. This avoids the need for a template Word file. See https://github.com/davidgohel/worded.
The output parameter needs to be set to worded::rdocx_document and you need to call library(worded).
---
date: "2018-03-27"
author: "David Gohel"
title: "Document title"
output:
  worded::rdocx_document
---
```{r setup, include=FALSE}
library(worded)
```
You can then add <!---CHUNK_PAGEBREAK---> to your document whenever you want a page break.
The package allows various Word formatting options using a similar mechanism.
devtools::install_github("davidgohel/officedown") – Serbocroatian
....se the information contained in this document. Please ensure that you read the last available version of this document.<!---CHUNK_PAGEBREAK---> # 1.Introduction The main purpose of the expert forum is to form a qualitative and quant... – Montero
devtools::install_github("davidgohel/officedown") Installation failed: handle is dead – Dodgem
R Markdown 1.16 introduced a new feature that lets you insert a page break by adding a paragraph that contains only the command \pagebreak or \newpage:
Paragraph before page break.

\pagebreak

First paragraph on a new page.
See also the pagebreaks section in the R Markdown cookbook.
When updating to R 4.0.0, the <!---CHUNK_PAGEBREAK---> solution was not working any more for me.
Instead I could use the run_pagebreak() function from the officer package, still in combination with the officedown package:
---
output: word_document
---
```{r settings}
library(officedown)
library(officer)
```
Hello world on page 1
`r run_pagebreak()`
Hello world on page 2
It is not an automated solution, but I have been adding the text '#####page break' to my markdown document. Then, in MS Word, I use find-and-replace to replace the text "page break" with "^m" (manual page break).
Sungpil's article was close, but didn't quite work. This was the best solution I found for this: https://scriptsandstatistics.wordpress.com/2015/12/18/rmarkdown-how-to-inserts-page-breaks-in-a-ms-word-document/
Even better, the author included the Word template to make this work. The R-blogger's link to his template is broken, and the header is formatted wrong. Some notes I took:
1) You might need to include the whole path to the Word template in your Rmd header, like so:
output:
  word_document:
    reference_docx: C:/workspace/myproject/mystyles.docx
2) The template at the link above changed some of the default style settings, so you'll need to change them back.
My solution is not very robust but can work for some of us.
Assuming you need a page break before each level 1 title in your Word document, I defined this in the format template referenced by the YAML field reference_docx.
In this document you modify the Heading 1 format (or equivalent) to insert a page break before the title. Do not forget to create your template from a docx first rendered with knitr (pandoc) in RStudio.
In the reference Word document, modify the style for the Table of Contents as follows:
- Select TOC
- Select "styles"
- Under the styles, select "Modify"
- Under modify style, select "Format"
- From the format, select "Paragraph"
- Within the Paragraph "Line and Page Breaks" section, check/select "Page break before"
- Click OK, save the reference document (word_styles.docx), and reference it in the YAML header.
---
output:
  word_document:
    reference_docx: "word_styles.docx"
---
Ok, I found this in the markdown docs.
Horizontal Rule / Page Break
Three or more asterisks *** or dashes ---.