Input html table from file into R Markdown, knit to Word?

I am working with an R Markdown file that we need to knit to both PDF and Word (for a co-author). We also have regression tables generated with stargazer. Because of the size of the data, the tables are computed separately and saved to two files: regression_table.tex and regression_table.html.

When knitting to PDF, I can easily add the table to R Markdown with the LaTeX command \input.

\input{"regression_table.tex"}

To knit to Word, though, I've not been able to find an easy equivalent to \input for the HTML file. One option is to manually insert the HTML table file within Word, which works fine as a low-tech backup. Another partial solution uses modified code from a related question. With the code chunk below, I am able to knit to HTML and then import the HTML document into Word. This preserves the table format, but other formatting, like headers and figures, gets messed up.

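```{r echo = FALSE, results = 'asis'}
# read the stargazer HTML file and emit it verbatim into the document
# (here::here() is namespaced so the chunk does not depend on library(here))
tmp <- paste(readLines(here::here("regression_table.html")), collapse = "\n")

cat(tmp)
```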

Is there a simple equivalent to \input for an HTML table stored in a file that works well when knitting to Word?

Wellchosen answered 23/8, 2020 at 20:30 Comment(2)
Save the table as an Rdata file and load that one? - Amphibole
That's a good suggestion, but it turns out the stargazer object is just the same string of HTML text that gets saved into the HTML file. Your idea also helped me realize that even in the simplest possible case, a stargazer table created in HTML format within an Rmd and knit to Word (i.e., no external file), the table still does not knit to Word smoothly. In short, a standalone stargazer HTML file saved with the out option opens smoothly in Word, but stargazer output generated within an Rmd or imported into an Rmd does not knit to Word well. - Wellchosen

This is not an ideal solution but, using the webshot package, it's easy to convert an HTML file to an image file that can then be imported into R Markdown with knitr::include_graphics. Three advantages of this approach are (1) it works automatically; (2) it preserves formatting well; and (3) it could work with other table-making packages or, for that matter, any external HTML file (or webpage). In addition, I've added some code at the top so the Rmd automatically incorporates the right external file (.tex or .html) depending on whether I knit to PDF or Word.

Note: if you haven't used webshot before, you first need to run webshot::install_phantomjs() once to install PhantomJS (my thanks to JacobG for pointing this out).
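
As a rough sketch, that one-time setup could be wrapped in a small guard so it only downloads PhantomJS when it is missing. The helper name `ensure_phantomjs` is hypothetical, and the sketch assumes a webshot version that exports `is_phantomjs_installed()`:

```r
# Hypothetical helper: install PhantomJS only if webshot cannot find it.
# Assumes the installed webshot version exports is_phantomjs_installed().
ensure_phantomjs <- function() {
  if (!requireNamespace("webshot", quietly = TRUE)) {
    stop("Install the webshot package first: install.packages('webshot')")
  }
  if (!webshot::is_phantomjs_installed()) {
    webshot::install_phantomjs()
  }
  # return (invisibly) whether PhantomJS is now available
  invisible(webshot::is_phantomjs_installed())
}
```

Calling `ensure_phantomjs()` once from the console before knitting should be enough; it does not need to live in the Rmd itself.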

```{r create_output_logicals, include = FALSE}
# https://mcmap.net/q/507220/-knitr-is_word_output-to-check-if-the-current-output-type-is-word-just-like-knitr-is_latex_output-and-knitr-is_html_output

is_word_output <- function(fmt = knitr:::pandoc_to()) {
  length(fmt) == 1 && fmt == "docx"
}

# create logical variables that indicate knitting output format 
latex_lgl <- knitr::is_latex_output()
html_lgl  <- knitr::is_html_output()
word_lgl  <- is_word_output()
```
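
To illustrate how the `is_word_output()` helper above behaves, here is a small sketch that passes the format in explicitly instead of relying on the default. My understanding is that `knitr:::pandoc_to()` yields `NULL` when the code is run outside of knitting, which is why the `length(fmt) == 1` check matters; treat that as an assumption rather than documented behavior.

```r
# Same helper as in the chunk above; the default argument is only
# evaluated when fmt is not supplied, so these calls do not need knitr.
is_word_output <- function(fmt = knitr:::pandoc_to()) {
  length(fmt) == 1 && fmt == "docx"
}

word_case  <- is_word_output("docx")   # knitting to Word
latex_case <- is_word_output("latex")  # knitting to PDF
null_case  <- is_word_output(NULL)     # e.g. running interactively (assumed)

print(c(word = word_case, latex = latex_case, null = null_case))
```

Only the `"docx"` case comes back `TRUE`, so the Word-only chunk is skipped for every other format.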

```{r load_packages, include = FALSE}
library(stargazer)
library(webshot)
```

```{r create_table, include = FALSE}
lm1 <- lm(mpg ~ wt,       data = mtcars)
lm2 <- lm(mpg ~ wt + cyl, data = mtcars)

stargazer(
  lm1, lm2, 
  type   = 'html', 
  header = FALSE, 
  out    = 'regression_table.html'
)

stargazer(
  lm1, lm2, 
  type   = 'latex', 
  header = FALSE, 
  out    = 'regression_table.tex'
)
```

```{r regression_table_word, echo = FALSE, eval = word_lgl}

webshot(
    url  = "regression_table.html", 
    file = "regression_table.png",
    zoom = 2   # doubles the resolution
)

knitr::include_graphics("regression_table.png")

```

```{r regression_tables_tex, results = 'asis', echo = FALSE, eval = latex_lgl}
# when knitting to PDF, pull in the LaTeX table with \input
# the spacing commands assume the YAML header loads the setspace package, e.g.
# header-includes: |
#   \usepackage{setspace}\doublespacing

cat(
'\\singlespacing
 \\input{"regression_table.tex"}
 \\doublespacing'
)
```

Note that the table image will not be centered in Word, because the image created by webshot is padded with whitespace. If centering is important, you'll need to trim the image, either with the cliprect option in webshot() or with something like magick::image_trim from the magick package. You would probably also need to create a Word template.
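
A minimal sketch of the trimming step with magick, assuming the magick package is installed. The helper name `trim_png` is hypothetical; it overwrites the file in place unless a second path is given:

```r
# Hypothetical helper: strip the uniform border (e.g. white padding)
# from a PNG such as the one produced by webshot().
trim_png <- function(path_in, path_out = path_in) {
  img <- magick::image_read(path_in)
  img <- magick::image_trim(img)    # crop away the surrounding background
  img <- magick::image_repage(img)  # reset the canvas to the cropped size
  magick::image_write(img, path = path_out, format = "png")
  invisible(path_out)
}
```

In the Word chunk this would be called between `webshot()` and `knitr::include_graphics()`, for example `trim_png("regression_table.png")`.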

Wellchosen answered 24/8, 2020 at 15:9
