/east-biblio

Bibliographic data for EAST

Primary LanguageTeXCreative Commons Attribution Share Alike 4.0 InternationalCC-BY-SA-4.0

README for EAST’s bibliographic records

The repository east-biblio contains the bibliographic data for *EAST*. Short for “Epistemology and Argumentation in South Asia and Tibet”, EAST systematically collects information on works of epistemology and argumentanion in South Asia and Tibet, their authors, and relevant modern literature.

The bibliographic records are kept in a BibLaTeX file, file:./bib/east.bib. The records are encoded according to the general BibLaTeX rules (see the section “Database guide”), and a set of conventions specific to EAST.

Encoding Guidelines for east-biblio

The bibliographic database in east-biblio (file:./bib/east.bib) is supposed validate against the datamodel (file:./bib/biber.conf). Running make verify in this project’s root directory should not report any errors.

This does not mean that the encoding is “ideal” or even recommended in all cases. It is still work in progress. The reason for this is that much of east-biblio was derived from MODS records, which do not have a 1:1 mapping to the fields that biblatex employs.

Here follow a few explanations on the encoding that is used in east-biblio.

Entry types

EAST-biblio currently supports the following types of bibliographic entries:

  • @article
  • @book
  • @collection
  • @inbook
  • @incollection
  • @misc
  • @thesis
  • @unpublished
  • (and @xdata, which is just a shortcut for storing sets of data for other bibliographic entries)

These types are a small subset of what biblatex supports. Keeping this permitted set small is helpful for maintaining the bibliography.

If other types should be added, the datamodel (file:./bib/biber.conf, <entrytypes/>) needs to be changed.

Hints on particular tools

Jabref hints

Here are some useful things to know about jabref:

  1. Use the search/filter function: read https://help.jabref.org/en/Search. It’s somewhat obvious, but remember you can easily search for fields:
    1. journal = chizan, or keywords=canonical, etc.
    2. Combine them: keywords=canonical and title = tshad
    3. For “spaces” in the search string, use quotes: keywords = canonical and translator = "zha ma"
    4. Empty fields: author != .+
    5. Finding Tibetan canonical entries where something is missing:
      • No co ne: language = bo and keywords = canonical and series != "co ne"
      • No Peking: language = bo and keywords = canonical and series != Peking
      • etc.
  2. Linking to east: Take the numerical part of the bib_id (e.g., 27507 in east:27507), and add it to https://east.ikga.oeaw.ac.at/bib/ (e.g.: https://east.ikga.oeaw.ac.at/bib/27507)

Layout of this repository

The most important files are:

  1. file:./README.org (this file)
  2. file:./bib/east.bib: A single BibLaTeX file, to be edited by a program of choice (e.g., http://www.jabref.org/)
    • Whichever program you use, make sure that you use a “biblatex” editing mode (Jabref: Options -> Preferences -> General -> Default bibliography mode -> “BiBLaTeX”, for example)

The other directories contain files that should not be edited normally. They contain automatically generated content (pulled from EAST’s database), and their content is subject to change.

  1. file:./archived: contains all the bibliographic data in a normalized (mods) xml format
  2. file:./archived/sources: this directory contains all the bibliographic sources, in the raw format (they are not all MODS files).

General workflow

Before starting to edit anything, always make sure your repository is up to date. You can do this with

  • git status: checks for local changes
  • git pull: gets updates from the remote repositories

After you have finished your work, adding or changing entries, you will usually want to share your changes, typically:

  1. git commit: add a short message describing what you did
  2. git push: send changes to central repository

The overall aim is then to add or update an entry in the main bibliographic file, file:./bib/east.bib

Guidelines for adding/updating entries

  • We follow the BibLaTeX standards, as described in Chapter 2 (“Database Guide”) of http://tug.ctan.org/macros/latex/exptl/biblatex/doc/biblatex.pdf, with the following additional rule(s):
    • Bibliographic entries that contain data fields in multiple languages or scripts need to have the following format: transcribed-title-or-name [*translated title or name]. For example:
      title = {Ninshikiron: chikaku no riron to sono tenkai [*Epistemology: the Theory of Perception and its Development]},
              

      Note that in such cases we do not use Biblatex’s subtitle field!

  • We use unicode encoding in the bib file.

Git useful commands

Check status

  1. To see what has changed: git status
  2. To see differences: git diff

To save local changes

  1. git commit -a (commits all changes locally)

To share things

  1. Get latest version: git pull
  2. Upload local version (if committed): git push

2019.03.08

Bibliography processing

Formatting the bibliography

EAST formats its bibliographic records with file:styles/chicago-author-date-east.csl (see https://github.com/ea-east/styles/commits/master/chicago-author-date-east.csl).

echo """---
title: EAST Bibliography (formatted)
nocite: |
 @*
...
""" | pandoc \
          --standalone \
          --csl=styles/chicago-author-date-east.csl \
          --bibliography=bib/east.bib  \
          -o /tmp/east-bib.html

Or to see just a single record:

echo """---
title: EAST Bibliography (formatted)
nocite: |
 @east:27507
...
""" | pandoc \
          --standalone \
          --csl=styles/chicago-author-date-east.csl \
          --bibliography=bib/east.bib  \
          -o /tmp/east-bib.html

Conversions

For conversions between formats, first install bibutils (https://sourceforge.net/p/bibutils/home/Bibutils/).

MODS -> bib(la)tex

parallel xml2bib \
         --output-encoding unicode \
         --no-bom \
         --whitespace \
         --strictkey \
         --finalcomma \
         --brackets \
         --no-latex \
         ::: *xml > /tmp/east-bibs.bib

biblatex -> MODS

parallel biblatex2xml \
         --input-encoding unicode \
         --unicode-characters \
         --unicode-no-bom \
         --no-latex \
         ::: *.bib > /tmp/east-bibs.mods

Database/Django commands

To generate initial data from what’s in the Django backend, do something like this:

import os
import shutil
from biblio.models import BibliographicEntry
from biblio.stuff import *
from django.utils.text import slugify
from lxml import etree

outdir = "/tmp/east-biblio-exports"

if os.path.exists(outdir):
    shutil.rmtree(outdir)

os.makedirs(outdir)

def write_bibs(bibs, subdir):
    """Write useful formats (mods, source, bib) of every bib in bibs (a
query object) into outputdir/subdir."""
    suboutdir = os.path.join(outdir, subdir)

    if os.path.exists(suboutdir):
        shutil.rmtree(suboutdir)
    os.makedirs(suboutdir)
  

    for bib in bibs:
        basename = "%s__%s" % (bib.id,
                               slugify(bib.pretty_short)[:40])
        modsoutfile = open(
            os.path.join(suboutdir,
                         "%s.mods.xml" % (basename)),
            "w")

        print("Writing %s" % modsoutfile)
        modsoutfile.write(
            etree.tostring(
                etree.fromstring(bib.get_mods()),
                encoding=str,
                pretty_print=True
            ))
        modsoutfile.close()

        biboutfile = open(
            os.path.join(suboutdir,
                         "%s.bib" % (basename)),
            "w")

        print("Writing %s" % biboutfile)
        biboutfile.write(bib.get_bibtex(putids=True))
        biboutfile.close()

        sourceoutfile = open(
            os.path.join(suboutdir,
                         "%s.src" % (basename)),
            "w")
        print("Writing %s" % sourceoutfile)
        sourceoutfile.write(bib.source)
        sourceoutfile.close()



write_bibs(BibliographicEntry.objects.filter(repository="TAMB"), "tamboti")

Notes

  • [ ] Look for date/year fields containing “X” or “?” (something was unclear).
  • [ ] Correct all occurrences of the macro “unknown” (see @string{unknown).
  • [ ] Check the misc and unpublished items.
  • [ ] Some information was lost by ignoring //relatedItem[@type="otherVersion"] (e.g., https://east.ikga.oeaw.ac.at/bib/5472/).
  • [ ] Some information was lost due to preceding/following relations in MODS.
  • Except for Śaṅkaranandana, EAST records very few manuscripts. If more manuscripts are to be added, we will probably want a dedicated type.