rstudio/rmarkdown-cookbook

Document that ONLY `bibliography` key implicitly triggers citeproc

fkohrt opened this issue · 3 comments

Only providing a bibliography via pandoc_args does not trigger citeproc:

---
title: Test
output:
  html_document:
    pandoc_args: --bibliography=bibliography.bib
---

```{cat, engine.opts=list(file = "bibliography.bib")}
@Manual{Xie2020,
  title = {rmarkdown: Dynamic Documents for R},
  author = {JJ Allaire and Yihui Xie and Jonathan McPherson and
    Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley
    Wickham and Joe Cheng and Winston Chang and Richard Iannone},
  year = {2020},
  note = {R package version 2.1},
  url = {https://github.com/rstudio/rmarkdown},
}
```

@Xie2020

It works only after the `bibliography` key is also provided in the metadata:

---
title: Test
output:
  html_document:
    pandoc_args: --bibliography=bibliography.bib
bibliography: ""
---

```{cat, engine.opts=list(file = "bibliography.bib")}
@Manual{Xie2020,
  title = {rmarkdown: Dynamic Documents for R},
  author = {JJ Allaire and Yihui Xie and Jonathan McPherson and
    Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley
    Wickham and Joe Cheng and Winston Chang and Richard Iannone},
  year = {2020},
  note = {R package version 2.1},
  url = {https://github.com/rstudio/rmarkdown},
}
```

@Xie2020

Alternatively (and more precisely), one could also provide --citeproc via pandoc_args.
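A minimal sketch of that alternative, assuming Pandoc 2.11 or later (where `--citeproc` is a built-in option; older versions used the separate `pandoc-citeproc` filter instead). The header of the first example would become:

```yaml
---
title: Test
output:
  html_document:
    # Both flags are passed straight through to Pandoc;
    # --citeproc is requested explicitly instead of being implied by a bibliography key.
    pandoc_args: ["--bibliography=bibliography.bib", "--citeproc"]
---
```

When rendering from R, `rmarkdown::pandoc_citeproc_args()` should supply the appropriate argument for the installed Pandoc version, so the flag does not need to be hard-coded.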

I don't necessarily consider the described behaviour a bug, but documenting the fact that certain methods of providing a bibliography like --bibliography (or putting the bibliography key inside an external --metadata-file) do not (implicitly) trigger --citeproc might be helpful. Maybe in the section "Bibliographies and citations" of the RMarkdown Cookbook.

You may ask why anyone might not just use the bibliography key. When working with the targets package (and without tarchetypes) it is necessary to declare the dependency within the _targets.R file (and hence, within rmarkdown::render):

# _targets.R
# setup...
list(
  # existing targets...
  tar_target(
    bibliography,
    "bibliography.bib",
    format = "file"
  ),
  tar_target(
    document,
    {
      # Return relative paths to keep the project portable.
      fs::path_rel(
        # Need to return/track all input/output files.
        c( 
          rmarkdown::render(
            input = "document.Rmd",
            output_options = list(pandoc_args = c(paste0("--bibliography=\"",
                                                         bibliography, "\""),
                                                  rmarkdown::pandoc_citeproc_args())),
            # Always run from the project root
            # so the report can find _targets/.
            knit_root_dir = getwd(),
            quiet = TRUE
          ),
          "document.Rmd"
        )
      )
    },
    format = "file"
  )
)

When using tarchetypes, one can just use tarchetypes::tar_render(...) (inside _targets.R) and then add the following to the document's YAML metadata:

bibliography: "`r targets::tar_read(bibliography)`"

cderv commented

Thanks. I understand the use case here. We'll see how we can improve the documentation for this.

I need to think about this because the R Markdown Cookbook doesn't aim to document how to use Pandoc directly. We document explicitly that using citations requires the bibliography to be set in YAML.

When a user is using pandoc_args, this means the aim is to use Pandoc directly, without benefiting from rmarkdown features. In this case, it seems expected that --citeproc should be passed as an argument, as documented in the Pandoc MANUAL.

However, I understand the use case with targets, and the tarchetypes usage seems the right way to me.

I fear that if we start documenting every edge case involving Pandoc flags, we'll end up documenting every Pandoc feature a second time.

I'll see the best way to add a sentence about this though.

(And about adding the --citeproc flag in more detection cases maybe... 🤔 )

Thanks a lot for your feedback!

> When a user is using pandoc_args, this means the aim is to use Pandoc directly, without benefiting from rmarkdown features. In this case, it seems expected that --citeproc should be passed as an argument, as documented in the Pandoc MANUAL.

This makes sense to me. Feel free to close this; it was the sudden lack of automation that confused me for a moment. One could add half a sentence: "The basic usage requires us to specify a bibliography file using the `bibliography` metadata field in YAML, which triggers citation processing." Or just leave it as it is.