R script to extract pandoc citations from markdown text and create local .bib file from a master BibTeX library.

Primary language: R. License: MIT.

getcitations.R: A simple Rscript for localized .bib files

getcitations.R is an R script for creating a local .bib file from a master BibTeX library based on the citations in a markdown document.

The script searches through Pandoc-styled markdown documents for citation keys and pulls the corresponding citation from a central .bib file.

In my case, the central .bib file is generated by Mendeley and automatically contains every article in my library. Mendeley adds many unnecessary fields to each entry, so getcitations.R keeps only the minimum required fields. Add keywords to the keep variable to suit your needs.
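As a rough illustration of the keep idea, the sketch below drops every field whose name is not listed. It assumes that keep holds BibTeX field names and that each field sits on its own line (as in a typical Mendeley export). The function name filter_fields and the example field names are hypothetical, not taken from the script, and the sketch uses base R only.

```r
# Hypothetical whitelist; the real `keep` variable in getcitations.R may differ.
keep <- c("author", "title", "journal", "year", "volume", "pages")

# Keep structural lines (entry header, closing brace) and whitelisted fields.
# Assumes one "name = value" field per line.
filter_fields <- function(entry_lines, keep) {
  field_pattern <- "^[[:space:]]*([[:alnum:]_-]+)[[:space:]]*=.*$"
  is_field <- grepl(field_pattern, entry_lines)
  field_name <- tolower(sub(field_pattern, "\\1", entry_lines))
  entry_lines[!is_field | field_name %in% keep]
}
```

Lines that do not look like a field (the @type{key, header and the closing brace) are always retained, so the entry stays well-formed.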

I made this script so that documents produced with R, knitr, and pandoc can be entirely self-contained, even though my BibTeX reference file is centrally maintained.

Usage

$ Rscript getcitations.R <in doc file> <out bib file>

or use getcitations.sh:

$ ./getcitations.sh <name>.md
# returns <name>.bib

WARNING! I assume all synchronization occurs through the central file, so the output file is blindly overwritten.
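A minimal sketch of how the two arguments and the overwrite behaviour could look in R. The helper names parse_args and write_bib are hypothetical and not taken from the script; the point is only that writeLines() replaces an existing file without asking.

```r
# Hypothetical helper: validate the two positional arguments from the Usage line.
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) != 2) {
    stop("usage: Rscript getcitations.R <in doc file> <out bib file>")
  }
  list(doc = args[1], bib = args[2])
}

# Hypothetical helper: writeLines() replaces any existing file at `path`,
# which is why manual edits to the local .bib file would be lost.
write_bib <- function(entries, path) {
  writeLines(entries, path)
  invisible(path)
}
```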

Setup

To set up this script, I recommend the following:

  • Copy getcitations.* to ~/.pandoc/.
  • Add an alias for getcitations.sh to your ~/.bash_profile.
  • Add a symlink called library.bib pointing to your central BibTeX file by running ln -s /path/to/central/library.bib from ~/.pandoc.

Requires stringr, Hadley Wickham's string processing package for R.

Behind the scenes

Following the pandoc citation style, the script looks for citations like: @CitationKey

A citation starts with [, a space, or the beginning of a line, followed by an @ and the citation key, which may contain alphanumeric characters and punctuation. It ends with a space, ], ;, a comma, or the end of the line.
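The matching rule can be sketched with a base R regular expression. This is an approximation rather than the script's actual pattern (the script itself relies on stringr): the function name extract_citation_keys is hypothetical, and the set of punctuation allowed inside a key is assumed to follow pandoc's documented key characters.

```r
# Hypothetical helper: return the unique citation keys found in markdown lines.
extract_citation_keys <- function(lines) {
  # "@" must sit at the start of the line or directly after "[" or whitespace,
  # so e-mail addresses such as user@example.com are skipped.
  pattern <- "(?:^|(?<=[\\[\\s]))@[[:alnum:]_][[:alnum:]_:.#$%&+?<>~/-]*"
  hits <- unlist(regmatches(lines, gregexpr(pattern, lines, perl = TRUE)))
  keys <- sub("^@", "", hits)
  # Drop trailing punctuation, e.g. the sentence-ending period in "see @lee_2015."
  keys <- sub("[^[:alnum:]_]+$", "", keys)
  unique(keys)
}

md <- c("@doe2001 showed this [@smith2010; @Jones:1999, p. 3].",
        "Email me at user@example.com or see @lee_2015.")
extract_citation_keys(md)
```

For the two sample lines, the call returns the keys doe2001, smith2010, Jones:1999 and lee_2015, in that order.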

BibTeX entries start on a line of the form @<type>{CitationKey, and end on a line containing only a closing brace }.
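Given those entry boundaries, pulling the matching records out of the central file could look like the sketch below. It is written in base R under the stated assumptions (the header and the lone closing brace each occupy their own line); extract_bib_entries is a hypothetical name, not a function from the script.

```r
# Hypothetical helper: copy the entries whose key is in `keys` from the
# lines of a master .bib file.
extract_bib_entries <- function(bib_lines, keys) {
  starts <- grep("^@[[:alpha:]]+\\{", bib_lines)     # "@<type>{CitationKey,"
  ends <- grep("^\\}[[:space:]]*$", bib_lines)       # a line holding only "}"
  out <- character(0)
  for (s in starts) {
    key <- sub("^@[[:alpha:]]+\\{[[:space:]]*([^,]+),.*$", "\\1", bib_lines[s])
    if (!(key %in% keys)) next
    e <- ends[ends > s][1]                           # first closing line after the header
    if (is.na(e)) next                               # unterminated entry: skip it
    out <- c(out, bib_lines[s:e], "")                # blank line between entries
  }
  out
}
```

In practice bib_lines would come from readLines() on the central library.bib, and keys from the citation keys found in the markdown document.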