!! functional but experimental !!
doi2bibtex
This project is a proof-of-principle implementation of automatic conversion of a list of DOIs into a useful bibtex file. This means that:
- Bibtex keys need to be nice and unique.
- Journal titles need to be consistently abbreviated.
- Unicode needs to be consistently encoded as latex commands.
Prerequisites
- Python 3 with BeautifulSoup, bibtexparser and python-slugify. The modules can be installed with
  pip3 install beautifulsoup4 bibtexparser python-slugify
- GNU realpath
Installation
make
This builds a couple of tools within the repository directory.
make test
This should print
Files test.bib.ref and test.bib are identical
Usage
doi2bibtex.sh <dois.txt >references.bib
dois.txt is a text file with one DOI per line.
Notes
The conversion from DOI to bibtex consists of five steps:
1. The DOI is converted to raw bibtex using an HTTP resolver:
   curl -LH 'Accept: application/x-bibtex' http://data.crossref.org/10.1002/qua.20315
2. The fields author, title and journal in the raw bibtex are converted from latex to Unicode using the definitions from W3C [large XML].
3. Journal names are abbreviated using the List of Title Word Abbreviations, which was scraped and is stored locally.
4. Pseudo-unique, ASCII-converted bibtex keys are generated in the format
   <last name of first author><initials of journal><last 2 digits of year>*<last two digits of SHA1 hash of DOI>
5. The fields from (2) are converted from Unicode back to latex.
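
The key format from the key-generation step can be illustrated with a minimal Python sketch. This is not the project's actual code. It makes several assumptions: the function names `ascii_fold` and `make_key` are hypothetical, the journal initials are taken as the first letter of each word of the journal name, the "last two digits" of the hash are the last two hexadecimal digits of the SHA1 of the UTF-8 encoded DOI, and the separator is taken literally from the format string above. ASCII conversion is done here with the standard library rather than python-slugify, so edge cases may differ.

```python
import hashlib
import re
import unicodedata


def ascii_fold(text):
    """Strip diacritics and drop any remaining non-ASCII characters."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def make_key(first_author_last_name, journal, year, doi, sep="*"):
    """Build <last name><journal initials><yy><sep><last 2 hex digits of SHA1(doi)>.

    All parameter names and the default separator are assumptions of this sketch.
    """
    # Keep only ASCII letters of the last name (e.g. "Müller" becomes "Muller").
    name = re.sub(r"[^A-Za-z]", "", ascii_fold(first_author_last_name))
    # First letter of every word in the journal name.
    words = re.findall(r"[A-Za-z]+", ascii_fold(journal))
    initials = "".join(word[0] for word in words)
    # The hash suffix makes collisions between similar entries less likely.
    digest = hashlib.sha1(doi.encode("utf-8")).hexdigest()
    return f"{name}{initials}{str(year)[-2:]}{sep}{digest[-2:]}"


if __name__ == "__main__":
    # The author name is made up for illustration; the DOI is the one used above.
    print(make_key("Müller", "Int. J. Quantum Chem.", 2005, "10.1002/qua.20315"))
```

With only two hash digits the keys are pseudo-unique rather than unique: two entries sharing author, journal and year still collide with probability 1/256 under these assumptions.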
Developer notes
- Whenever pushing changes to download_abbrv.py or process.sql, also update the abbrv.db file if it changes.
TODO
- Add caching of responses from the DOI resolver.
- Convert all bibtex fields to latex.
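
One possible shape for the caching item, given only as a hedged sketch and not as a planned implementation: store each resolver response in a file named after the SHA1 of the DOI, and contact the resolver only on a cache miss. The function names `fetch_bibtex` and `cached_bibtex`, the cache directory layout and the assumption that the resolver answers in UTF-8 are all hypothetical. The request mirrors the curl call in the Notes section.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_bibtex(doi):
    """Resolve a DOI to raw bibtex via HTTP content negotiation.

    urllib follows redirects by default, similar to `curl -L`.
    """
    request = urllib.request.Request(
        f"http://data.crossref.org/{doi}",
        headers={"Accept": "application/x-bibtex"},
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the resolver returns UTF-8 encoded text.
        return response.read().decode("utf-8")


def cached_bibtex(doi, cache_dir, fetch=fetch_bibtex):
    """Return bibtex for a DOI, calling `fetch` only if no cached copy exists."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hashing the DOI gives a file name that is safe on any file system.
    name = hashlib.sha1(doi.encode("utf-8")).hexdigest() + ".bib"
    entry = cache_dir / name
    if entry.exists():
        return entry.read_text(encoding="utf-8")
    bibtex = fetch(doi)
    entry.write_text(bibtex, encoding="utf-8")
    return bibtex
```

Passing the fetcher as a parameter keeps the cache logic testable without network access. This sketch never expires entries, so a stale or failed response would have to be removed by hand.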