Knitr & Rmarkdown docx tables

When using knitr and rmarkdown together to create a Word document, you can use an existing document to style the output.

For example in my yaml header:

output: 
  word_document:
    reference_docx: style.docx
    fig_caption: TRUE

Within this style document I have created a default table style. The goal is to have the kable table output appear in that style.

When I knit the Word document using style.docx, the tables are not styled according to the table style.

Using the style inspector has not been helpful so far, and I am unsure whether the default table style is the right style to modify.

Example Code:

```{r kable}
n <- 100
x <- rnorm(n)
y <- 2*x + rnorm(n)
out <- lm(y ~ x)
library(knitr)
kable(summary(out)$coef, digits=2, caption = "Test Captions")
```

I do not have a stylized document I can upload for testing unfortunately.

TL;DR: I want to style table output from rmarkdown and knitr automatically (via kable).

Update: So far I have found that changing the 'compact' style in the docx alters the text of the table automatically, but this does not address overall table styling such as cell colour and alignment.

Update 2: After more research and experimenting with styles, I found that knitr seems to have no problem accessing paragraph styles. However, table styles are not in that style category and don't seem to be applied in my testing.

Update 3: Dabbled with the ReporteRs package. Whilst it was able to produce the tables as desired, the syntax required to do so is laborious. I would much rather the style be applied automatically.

Update 4: You cannot change the TableNormal style, nor does creating a new Table Normal style work. The XML approach is not what we are looking for. I have a VBA macro that will do the trick; I just want to remove that step if possible.

Curr answered 7/6, 2016 at 6:20 Comment(5)
This has nothing to do with kable, as pandoc does the conversion from markdown to docx. You can try creating a TableNormal table style in the reference docx file. - Schreib
To @daroczig's point, this is essentially a pandoc problem and possibly a duplicate of #17859098. That post is only concerned with borders, but I suspect the underlying xml could be hacked to get more elaborate formatting... - Aquarius
I had a similar problem with styles of the reference_docx in general. Perhaps this is the same issue? github.com/rstudio/rmarkdown/issues/668 - Theocrasy
@Curr this is after the fact, but if you are able to, I would highly recommend using knitr with pdflatex. LaTeX takes a bit more time to figure out, but there is a ton of information in questions already asked. Usually whatever you need to do has already been done and you can copy/paste. - Cherilyncherilynn
@Cherilyncherilynn this was specifically done in Word because clients will want that format. - Curr
F
23

This is essentially a combination of the answer that recommends TableNormal, this post on rmarkdown.rstudio.com and my own experiments to show how to use a TableNormal style to customize tables like those generated by kable:

RMD:

---
output:
  word_document
---

```{r}
knitr::kable(cars)
```
  • Click "Knit Word" in RStudio. → The document opens in Word, without any custom styles yet.
  • In that document (not in a new document), add the required styles. The key is to modify the styles themselves rather than apply direct formatting; see the article on Style basics in Word on support.office.com.
  • Specifically, to style a table you need to add a table style. My version of Word is non-English, but according to the article linked above table styles are available via "the Design tab, on the Table Tools contextual tab".
  • Choose TableNormal as style name and define the desired styles. In my experiments most styles worked, however some did not. (Adding a color to the first column and making the first row bold was no problem; highlighting every second row was ignored.) The last screenshot in this answer illustrates this step.
  • Save the document, e.g. as styles.docx.
  • Modify the header in the RMD file to use the reference DOCX (see here; mind the indentation, it took me 10 minutes to find this mistake):

    ---
    output:
      word_document:
        reference_docx: styles.docx
    ---
    
  • Knit to DOCX again – the style should now be applied.
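
The reference document can also be passed from the R console instead of the YAML header, which sidesteps the indentation pitfall. This is only a minimal sketch: `knit_with_reference` is a hypothetical helper name, and `doc.Rmd` and `styles.docx` are placeholder file names.

```r
# Hypothetical helper: render an Rmd to Word with a given reference document.
# rmarkdown::render() and rmarkdown::word_document(reference_docx = ) are the
# real rmarkdown functions; the helper name and file names are placeholders.
knit_with_reference <- function(rmd, reference) {
  stopifnot(file.exists(rmd), file.exists(reference))
  rmarkdown::render(
    rmd,
    output_format = rmarkdown::word_document(
      # absolute path, so it does not depend on the working directory
      reference_docx = normalizePath(reference)
    )
  )
}

# Usage (uncomment once both files exist):
# knit_with_reference("doc.Rmd", "styles.docx")
```

Passing the format object this way should behave like the YAML header above, assuming no other output options are needed.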

Following the steps I described above yields this output:

[Screenshot: Output]

And here is a screenshot of the table style dialog used to define TableNormal. Unfortunately it is in German, but maybe someone can provide an English version of it:

[Screenshot: Table Style dialog]


As this does not seem to work for most users (anyone but me …), I suggest we test this systematically. Essentially, there are 4 steps that can go wrong:

  • Wrong RMD (unlikely).
  • Differences in the initially generated DOCX.
  • Differences in how the TableNormal style is saved in the DOCX.
  • Differences in how the reference DOCX is used to format the final DOCX.

I therefore suggest using the same minimal RMD posted above (full code on pastebin) to find out where the results start to differ.

The three files are generated on the following system: Windows 7 / R 3.3.0 / RStudio 0.99.896 / pandoc 1.15.2 / Office 2010.

I get the same results on a system with Windows 7 / R 3.2.4 / RStudio 0.99.484 / pandoc 1.13.1 / Office 2010.

I suppose the most likely culprits are the pandoc and the Office versions. Unfortunately, I cannot test other configurations at the moment. Now it would be interesting to see the following: For users where it does not work, what happens …

  • … if you start from my initial.docx?
  • If that does not work, what if you use my reference.docx as reference document?
  • If nothing works, are there eye-catching differences in the generated XML files (inside the DOCX container)? Please share your files and exact version information.

With a number of users running these tests it should be possible to find out what is causing the problems.

Fugere answered 17/6, 2016 at 21:47 Comment(13)
This does not seem to work on English versions of Word. As noted in update 4, I had already tried exactly this. I've since tried it again with completely new documents, following the answer you provided, and it does not work. drive.google.com/file/d/0B9p9Rsa-Y-hRNVJDZDlCWDNDbWs/… - Curr
@Curr I noticed update 4, but assumed that you were doing something wrong. Interestingly, if I use the DOCX file you uploaded to Google Drive as reference DOCX, I get this output (red background in first column). So the reference document you created works on my system, which is very strange. I'll check this with a different version of Word as soon as I can; currently I'm using Office 2010. - Fugere
Yes, that is interesting. I have tried this on 4 systems so far and none have worked. All were using Office 365, however. - Curr
That does not work on my computer either (Office 2016). In case I don't receive any other answer, you will be awarded the bounty for effort and level of detail. - Edgardoedge
@EricLecoutre Hopefully, we'll be able to figure out what is causing these problems. If it's actually a matter of Office versions, then this could be worth a bug report to pandoc. - Fugere
I just tried to view the documents with LibreOffice and can confirm it still does not work. - Curr
@EricLecoutre and zacdav: Updated the answer. Let's see whether we can figure out what is causing the problems. Unfortunately, I will have only very limited time to support this process (my master's thesis is due in 10 days – this has priority …). - Fugere
Guy: I wish you the best for your master's thesis: endurance for the final rush, a good presentation and nice results! - Edgardoedge
Thanks a lot, Eric! - Fugere
None of your supplied documents generate the desired result on the computers I tried with Office 365 / English version. - Curr
@Curr What a pity ... Can you upload one of the non-working DOCX files? - Fugere
I've already uploaded mine, and you have all of yours. Try opening them on an English version. - Curr
I'd be curious to know if anyone has tried this with success and what language/system they used. I've yet to see this work. - Curr
M
7

This was actually a known issue. Fortunately, it was fixed in pandoc v2.0.

I tested a newer version and found that it adds a hidden style called "Table". Following @CL.'s suggestion, but modifying the "Table" style in reference.docx instead of TableNormal, now works.

In addition, look at this entry of pandoc's v2.0 release notes:

Use Table rather than Table Normal for table style (#3275). Table Normal is the default table style and can’t be modified.
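
Whether this fix applies depends on the pandoc version that rmarkdown picks up. Below is a small sketch for checking this from R. `table_style_for()` is a hypothetical helper that only encodes the release note quoted above, while `rmarkdown::pandoc_available()` and `rmarkdown::pandoc_version()` are the real lookups.

```r
# Hypothetical helper: name of the table style pandoc writes into the docx,
# based solely on the v2.0 release note quoted above.
table_style_for <- function(pandoc_version) {
  if (numeric_version(pandoc_version) >= "2.0") "Table" else "Table Normal"
}

# Look up the pandoc that rmarkdown would use, if rmarkdown and pandoc exist.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  v <- as.character(rmarkdown::pandoc_version())
  message("pandoc ", v, " uses the table style: ", table_style_for(v))
}
```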

Mink answered 11/12, 2017 at 12:32 Comment(0)
M
6

As of 2021, I could not get any of the other suggested answers to work.

However, I did discover the {officedown} package, which, amongst other things, supports the styling of tables in .docx documents. You can install {officedown} with remotes::install_github("davidgohel/officedown")

To use {officedown} to render .Rmd to .docx you must replace

output:
  word_document

in your document header with

output:
  officedown::rdocx_document

In addition to this the {officedown} package must be loaded in your .Rmd.

As with the word_document output format, {officedown} allows us to use styles and settings from template documents, again with the reference_docx parameter.

With a reference document styles.docx, a minimal example .Rmd may look like:

---
date: "2038-01-19"
author: "The Reasonabilists"
title: "The end of time as we know it"
output: 
  officedown::rdocx_document:
    reference_docx: styles.docx
---

```{r setup, include = FALSE}
# Don't forget about me: I'm important!
library("officedown")
```

{officedown} allows us to go one step further and specify the name of the table style to use in the document's front matter. This table style could be a custom style we created in styles.docx, or it could be one of Word's in-built styles you prefer.

Let's say we created a style My Table:

[Screenshot: the My Table style in Word]

We could tell {officedown} to use this table style in our front matter as:

output: 
  officedown::rdocx_document:
    reference_docx: styles.docx
    tables:
      style: My Table

Putting this all together, we knit the minimal .Rmd:

---
date: "2038-01-19"
author: "The Reasonabilists"
title: "The end of time as we know it"
output: 
  officedown::rdocx_document:
    reference_docx: styles.docx
    tables:
      style: My Table
---

```{r setup, include = FALSE}
# Don't forget about me: I'm important!
library(officedown)
```

```{r}
head(mtcars)
```

This results in a .docx document which looks like:

[Screenshot: the resulting .docx table]

Melvinamelvyn answered 10/3, 2021 at 22:26 Comment(1)
Does anyone know whether officedown can generate gt() formatted tables in Word? - Hales
C
2

TableNormal doesn't work for me either.

On my Dutch version of Word 2016 (Office 365), I found out that I could style tables with the Compact style.

Input (refdoc.docx contains the Compact style):

---
title: "Titel"
subtitle:  "Ondertitel"
author: "`r Sys.getenv('USERNAME')`"
output:
  word_document:
    toc: true
    toc_depth: 2
    fig_width: 6.5
    fig_height: 3.5
    fig_caption: true
    reference_docx: "refdoc.docx"
---

And RMarkdown:

# Methoden {#methoden}
```{r}
knitr::kable(cars)  # namespaced, so it works without library(knitr)
```

Output:

[Screenshot: output table]

Cascabel answered 23/10, 2017 at 11:25 Comment(2)
I can confirm that this also works in the English version of Word 2016. A small problem with this solution is that bullets are also formatted using the 'Compact' style. I have not found a way to format tables and bullets independently. - Imprudent
Hmm, maybe we should knit a new default document and merge these styles with our own reference style file. I'll have a look again. - Cascabel
A
2

You need a reference document (reference_docx: style.docx) that contains a "Table" style (see @Liang Zhang's explanation and links above).

  1. Create a base reference document using pandoc (source). On the command line (or cmd.exe on Windows), run: pandoc -o custom-reference.docx --print-default-data-file reference.docx
  2. In the newly created custom-reference.docx file, find the table (a basic one-row table with a caption).
  3. While the table is selected, click "Table Design" and find "Modify Table Style":

[Screenshot: the "Modify Table Style" option in Word]

  4. Modify the style of the table as you wish and use this reference document in your RMD document (see the first answer by @CL.).

Using this reference document, you can also change the table and figure caption styles.
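
Before knitting, it can help to confirm that the reference document really contains a table style with the ID `Table`. The sketch below uses only base R and a simple regular expression rather than a full XML parser, so treat it as a rough check. The helper names are hypothetical, and it assumes the styles are stored in `word/styles.xml` inside the docx, which is where Word normally keeps them.

```r
# Hypothetical helper: style IDs of all table styles found in the text of
# word/styles.xml. Regex-based, so a rough check rather than a real XML parse.
table_style_ids <- function(styles_xml) {
  # opening <w:style ...> tags; the space keeps the root <w:styles> out
  tags <- regmatches(styles_xml, gregexpr("<w:style [^>]*>", styles_xml))[[1]]
  # keep only the tags that declare a table style
  tags <- tags[grepl('w:type="table"', tags, fixed = TRUE)]
  # extract the value of the styleId attribute
  sub('.*w:styleId="([^"]*)".*', "\\1", tags)
}

# Hypothetical helper: read word/styles.xml straight out of the docx (a zip).
read_styles_xml <- function(docx) {
  con <- unz(docx, "word/styles.xml", encoding = "UTF-8")
  on.exit(close(con))
  paste(readLines(con, warn = FALSE), collapse = "\n")
}

# Usage (uncomment once the file exists):
# "Table" %in% table_style_ids(read_styles_xml("custom-reference.docx"))
```

If the check comes back FALSE, the style may have been saved under a different ID than expected, which would be worth inspecting in the XML directly.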

Anneliese answered 10/8, 2020 at 17:19 Comment(0)
N
0

I was able to get my word output to use a default table style that I defined in a reference .docx.

Instead of 'TableNormal', the table style it defaulted to was 'Table'.

I discovered this by knitting an rmarkdown with a kable.

---
date: "December 1, 2017"
output: 
  word_document:
    reference_docx: Template.docx
---
`r knitr::kable(source)`

Then I took a look at that generated document's XML to see what style it had defaulted to.

library(XML)

docx.file <- "generated_doc.docx"

## unzip the docx converted by pandoc (base R unzip() needs no external tool)
unzip(docx.file, exdir = "temp_dir")
document.xml <- file.path("temp_dir", "word", "document.xml")

doc <- xmlParse(document.xml)
## each table records its style in a <w:tblStyle w:val="..."/> node
tblStyle <- getNodeSet(xmlRoot(doc), "//w:tblStyle")

tblStyle

I defined the 'Table' style in the reference docx to add some color and borders. This works for one standard table style throughout the document; I haven't found a way to use different styles within the same document.

This stayed true even after I opened the reference doc and edited it.

Nakitanalani answered 1/12, 2017 at 21:0 Comment(0)
