In `knitr`, how can I test whether the output will be PDF or Word?

I would like to include specific content based on which format is being created. In this specific example, my tables look terrible in MS word output, but great in HTML. I would like to add in some test to leave out the table depending on the output.

Here's some pseudocode:

output.format <- opts_chunk$get("output")

if (output.format != "MS word") {
  print(table1)
}

I'm sure this is not the correct way to use opts_chunk, but this is the limit of my understanding of how knitr works under the hood. What would be the correct way to test for this?

Lissie answered 2/2, 2016 at 2:42 Comment(3)
What's your source document? RMarkdown? And your functions to generate the final document? Why not create a new variable yourself that describes the output format and condition on that? - Psychosurgery
Yes, RMarkdown is my source. I'm using RStudio as well. I can make my own variable and then just change the code each time I run it, but I was hoping to have it detect what the output is so I can just use the dropdown menu in RStudio. - Lissie
This article has some good ideas: jmablog.com/post/bookdown-not-bookout - Lissie

Short answer

In most cases, opts_knit$get("rmarkdown.pandoc.to") delivers the required information.

Otherwise, query rmarkdown::all_output_formats(knitr::current_input()) and check if the return value contains word_document:

if ("word_document" %in% rmarkdown::all_output_formats(knitr::current_input())) {
  # Word output
}
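
Applied to the table from the question, the first check could be wrapped as follows. This is a minimal sketch, assuming the document is rendered through rmarkdown::render() (for example with RStudio's Knit button). should_print_table is a hypothetical helper name, and table1 is the object from the question.

```r
# Hypothetical helper: TRUE unless pandoc is converting to Word (docx).
# If knit() is called directly, the option is NULL and the table is kept.
should_print_table <- function(to = knitr::opts_knit$get("rmarkdown.pandoc.to")) {
  !identical(to, "docx")
}

# Inside a chunk of the Rmd file:
# if (should_print_table()) print(table1)
```

Passing the target as an argument keeps the helper easy to try out in the console, e.g. should_print_table("docx").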

Long answer

I assume that the source document is RMD because this is the usual/most common input format for knitting to different output formats such as MS Word, PDF and HTML.

In this case, knitr options cannot be used to determine the final output format because it doesn't matter from the perspective of knitr: For all output formats, knitr's job is to knit the input RMD file to a MD file. The conversion of the MD file to the output format specified in the YAML header is done in the next stage, by pandoc.

Therefore, we cannot use the package option knitr::opts_knit$get("out.format") to learn about the final output format but we need to parse the YAML header instead.

So much for the theory. Reality is a little bit different. RStudio's "Knit PDF"/"Knit HTML" button calls rmarkdown::render, which in turn calls knit. Before this happens, render sets an (undocumented?) package option rmarkdown.pandoc.to to the actual output format. The value will be html, latex or docx respectively, depending on the output format.

Therefore, if (and only if) RStudio's "Knit PDF"/"Knit HTML" button is used, knitr::opts_knit$get("rmarkdown.pandoc.to") can be used to determine the output format. This is also described in this answer and that blog post.
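
The three values mentioned above can be mapped to readable labels. This is a sketch only: describe_pandoc_target is a hypothetical name, the mapping covers just the values named in this answer, and any other pandoc target is passed through unchanged.

```r
# Hypothetical helper mapping the pandoc target to a readable label.
describe_pandoc_target <- function(to = knitr::opts_knit$get("rmarkdown.pandoc.to")) {
  # The option is NULL when knit() is called directly rather than via render().
  if (is.null(to)) {
    return("unknown")
  }
  # The last, unnamed argument is the fallback for targets not listed here.
  switch(to,
    html  = "HTML",
    latex = "PDF",
    docx  = "MS Word",
    to
  )
}
```

Inside a chunk, calling describe_pandoc_target() without arguments reads the option set by render.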

The problem remains unsolved for the case of calling knit directly because then rmarkdown.pandoc.to is not set. In this case we can exploit the (unexported) function parse_yaml_front_matter from the rmarkdown package to parse the YAML header.

[Update: As of rmarkdown 0.9.6, the function all_output_formats has been added (thanks to Bill Denney for pointing this out). It makes the custom function developed below obsolete – for production, use rmarkdown::all_output_formats! I leave the remainder of this answer as originally written for educational purposes.]

---
output: html_document
---
```{r}
knitr::opts_knit$get("out.format") # Not informative.

knitr::opts_knit$get("rmarkdown.pandoc.to") # Works only if knit() is called via render(), i.e. when using the button in RStudio.

rmarkdown:::parse_yaml_front_matter(
    readLines(knitr::current_input())
    )$output
```

The example above demonstrates the usefulness of opts_knit$get("rmarkdown.pandoc.to") and the uselessness of opts_knit$get("out.format"), while the line employing parse_yaml_front_matter returns the format specified in the "output" field of the YAML header.

The input of parse_yaml_front_matter is the source file as character vector, as returned by readLines. To determine the name of the file currently being knitted, current_input() as suggested in this answer is used.

Before parse_yaml_front_matter can be used in a simple if statement to implement behavior that is conditional on the output format, a small refinement is required: The statement shown above may return a list if there are additional YAML parameters for the output like in this example:

---
output: 
  html_document: 
    keep_md: yes
---

The following helper function should resolve this issue:

getOutputFormat <- function() {
  output <- rmarkdown:::parse_yaml_front_matter(
    readLines(knitr::current_input())
    )$output
  if (is.list(output)){
    return(names(output)[1])
  } else {
    return(output[1])
  }
}

It can be used in constructs such as

if(getOutputFormat() == 'html_document') {
   # do something
}

Note that getOutputFormat uses only the first output format specified, so with the following header only html_document is returned:

---
output:
  html_document: default
  pdf_document:
    keep_tex: yes
---

However, this is not very restrictive. When RStudio's "Knit HTML"/"Knit PDF" button is used (along with the dropdown menu next to it to select the output type), RStudio rearranges the YAML header such that the selected output format will be the first format in the list. Multiple output formats are (AFAIK) only relevant when using rmarkdown::render with output_format = "all". In both of these cases, rmarkdown.pandoc.to can be used, which is easier anyway.

Meuser answered 2/2, 2016 at 9:10 Comment(11)
Many thanks @Meuser for this very informative explanation. You are correct in your assumptions that this is an RMD source. You mention at the end that "only the first output format" is returned. Does this mean that this solution will always return the first item in the YAML header and not necessarily the output format I'm currently running? So in your example, if I'm knitting a PDF, the getOutputFormat function will still return "html_document"? - Lissie
@Lissie I updated my answer. The last paragraph should answer your question: The current output format will be the first in the list (usually). But probably you are more interested in the "short solution" - see the update. - Meuser
Got it, that makes sense. Thank you! - Lissie
It looks like something very similar to this has been made official in the current rmarkdown package. See the all_output_formats function added in March. - Detector
@BillDenney Thank you for pointing this out. Indeed, rmarkdown::all_output_formats makes manually parsing the YAML header obsolete. I updated the answer. - Meuser
I have independently stepped through the current bookdown and rmarkdown and come to the same conclusion. opts_knit$get("rmarkdown.pandoc.to") gives the value that was retrieved from calling the very output function (like bookdown::gitbook or bookdown::pdf_book) used to render the text. - Azide
@CL Based on your explanation about RStudio rearranging the YAML header, I would instead change your test to document <- rmarkdown::all_output_formats(knitr::current_input())[1]; if (grepl("word", document)) { ... } - Mangle
@greendiod Not sure; what you suggest would check if the first output format is Word, the current solution checks if any output format is Word (Correct? I may have forgotten some details in the meantime). It depends on the use case whether the former or the latter is more informative. - Meuser
@CL It does indeed depend on the use case. But still, can you actually output many formats at the same time? I thought you could only output one format at a time (at least, that's what the buttons in RStudio seem to suggest). In this case, you're interested in the first format in the list of output formats. - Mangle
@greendiod Multiple output formats are possible, but only with rmarkdown::render (see last paragraph of the answer). - Meuser
If you're changing the root.dir for the document, you have to add the option dir = TRUE to the call to knitr::current_input() (see this gist: gist.github.com/Hugovdberg/57aac048f4bb5e863819767df4d23867) - Doughnut

Since knitr 1.18, you can use the two functions

knitr::is_html_output()

and

knitr::is_latex_output()
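
Both functions return TRUE or FALSE, so they can be combined into a single label. The following is a rough sketch: output_family is a hypothetical helper, and since this answer names no dedicated Word test, Word output ends up in the "other" branch here.

```r
# Hypothetical helper combining the two knitr checks into one label.
output_family <- function(is_html = knitr::is_html_output(),
                          is_latex = knitr::is_latex_output()) {
  if (is_html) {
    "html"
  } else if (is_latex) {
    "latex"
  } else {
    "other"  # e.g. Word output
  }
}

# Inside a chunk of the Rmd file:
# if (output_family() != "other") print(table1)
```
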
Blossomblot answered 26/1, 2018 at 8:53 Comment(0)

Just want to add a bit of clarification here. I often render the same R Markdown file (*.Rmd) into multiple formats (*.html, *.pdf, *.docx), so rather than wanting to know whether the format of interest is listed among those specified in the YAML front matter (i.e. "word_document" %in% rmarkdown::all_output_formats(knitr::current_input())), I want to know which format is currently being rendered. To do this you can either:

  1. Get the first element of the formats listed in the front matter: rmarkdown::all_output_formats(knitr::current_input())[1]; or

  2. Get default output format name: rmarkdown::default_output_format(knitr::current_input())$name

For example...

---
title: "check format"
output:
  html_document: default
  pdf_document: default
  word_document: default
---

```{r}
rmarkdown::all_output_formats(knitr::current_input())[1]
```

```{r}
rmarkdown::default_output_format(knitr::current_input())$name
```

```{r}
fmt <- rmarkdown::default_output_format(knitr::current_input())$name

if (fmt == "pdf_document"){
  #...
}

if (fmt == "word_document"){
  #...
}
```
Dichroscope answered 23/12, 2017 at 0:40 Comment(0)

One additional point: the above answers don't work for an html_notebook, since code is being executed directly there and knitr::current_input() doesn't respond. If you know the document name you can call all_output_formats as above, specifying the name explicitly. I don't know if there's another way to do this.
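
Specifying the name explicitly could look like the sketch below. Both declares_format and the file name "my_notebook.Rmd" are placeholders for illustration, not names from the answers above.

```r
# Hypothetical helper: does the YAML header of the file at `path`
# list the given output format?
declares_format <- function(path, format) {
  format %in% rmarkdown::all_output_formats(path)
}

# In a notebook chunk, with the file name spelled out by hand:
# if (declares_format("my_notebook.Rmd", "word_document")) { ... }
```

Note that this only tells you whether the format is listed in the header, not which format is currently being rendered.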

Strickle answered 17/3, 2017 at 16:30 Comment(0)

This is what I use

library(stringr)
first_output_format <-
  names(rmarkdown::metadata[["output"]])[1]
if (!is.null(first_output_format)) {
  my_output <- str_split(first_output_format,"_")[[1]][1]
} else {
  my_output <- "unknown"
}
Timorous answered 20/6, 2021 at 22:39 Comment(0)
