/sortbibtex

Clean/sort/handle a bibtex file (in python)

Primary LanguageTeXMIT LicenseMIT

sortbibtex

Python2 script that loads a bibtex file, validates its content (see Bibtex style), and saves it in a clean version.
When used with option '--subset' (see Usage and Example_subset), it creates a new bibtex file only including a subset of the original items (this may be useful to create a project-specific file out of a master bibtex file).

Usage

usage: sortbibtex.py [-h] [--subset SUBSET] [-d] refs

Clean and sort a bibtex file.

positional arguments:
  refs             bibtex file to be cleaned/sorted

optional arguments:
  -h, --help       show this help message and exit
  --subset SUBSET  only use items listed in the file SUBSET (default: use all)
  -d               dry run (only parse, do not write any file)

Bibtex style

The script requires enforcement of a set of style properties in the bibtex file:

  • Titles should be enclosed in double curly brackets.
  • Item types (article, book, ..) should be in lower case.
  • Do not break long lines (e.g. a long authors list).
  • Comments should start with '#'.
  • 'month' field should not be included ('mmonth', which is not recognized by bibtex, is accepted).
  • Tabs should be replaced by spaces.
  • arXiv ID should be in the 'note' field.

These choices are as of my taste. To modify them, look at this code block:

print 'Parsing %s - start' % file_input
...
...
print 'Parsing %s - end (total number of keys: %i)' % (file_input, num_items)