jgm/citeproc

Detection of English-language entries only permits ISO 639-1 codes

The old pandoc-citeproc filter was liberal in detecting English bibliography entries when deciding whether to put titles into title case, but the new citeproc appears to accept only ISO 639-1 codes. Every entry in the example below would previously have been rendered in title case.

(The englat value is what Zotero produces when it imports Library of Congress data for a bilingual book; logically, that entry should probably stay in sentence case.)

Minimal example

pandoc -C -t plain << EOT

---
references:
- id: "en"
  title: "A test"
  language: "en"
- id: "en-GB"
  title: "A test"
  language: "en-GB"
- id: "eng"
  title: "A test"
  language: "eng"
- id: "englat"
  title: "A test"
  language: "englat"
- id: "english"
  title: "A test"
  language: "English"
---

@en; @en-GB; @eng; @englat; @english

EOT

Result (pandoc 2.19.2)

“A Test” (n.d.a); “A Test” (n.d.b); “A test” (n.d.c); “A test” (n.d.d);
“A test” (n.d.e)

All these would have been title-cased by the pandoc-citeproc filter.

jgm commented

We're just going by the CSL spec here, which says:

language
The language of the item;
Should be entered as an ISO 639-1 two-letter language code (e.g. “en”, “zh”), optionally with a two-letter locale code (e.g. “de-DE”, “de-AT”)

I think obeying the spec is the thing to do.
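To make the quoted rule concrete, here is a minimal sketch of a check for the form the spec text describes: a two-letter language code, optionally followed by a hyphen and a two-letter region code. The function name `is_english_per_spec` and the strict case handling are assumptions made for illustration; this is not citeproc's actual parser, which may accept more forms.

```python
import re

# Form described in the quoted spec text: two lowercase letters, optionally
# followed by "-" and a two-letter uppercase region code (e.g. "en", "de-AT").
# Assumption: case is matched strictly, as in the spec's own examples.
_SPEC_TAG = re.compile(r"([a-z]{2})(?:-([A-Z]{2}))?")


def is_english_per_spec(tag: str) -> bool:
    """Return True if `tag` has the spec's form and its language part is "en"."""
    match = _SPEC_TAG.fullmatch(tag)
    return match is not None and match.group(1) == "en"


# The five language values from the minimal example above.
results = {tag: is_english_per_spec(tag)
           for tag in ["en", "en-GB", "eng", "englat", "English"]}
```

Under this check only "en" and "en-GB" count as English, which agrees with the output reported for pandoc 2.19.2, where just the first two titles were title-cased.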

Fair enough! For anyone else who might find this issue, I normalized the language tags in Zotero using batch replacement with the Zutilo plugin.
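For readers who would rather clean up an exported bibliography than edit the Zotero library, the same kind of normalization could be sketched as follows. This is not how Zutilo works; the `ALIASES` table and the `normalize_language` helper are hypothetical and cover only the values that appear in this issue. Values with no obvious single-language equivalent, such as "englat", are left unchanged.

```python
import re

# Hypothetical alias table: extend it with whatever values your library contains.
ALIASES = {"eng": "en", "english": "en"}

# Two-letter language code with an optional two-letter region, in any case,
# separated by "-" or "_".
_TAG = re.compile(r"([A-Za-z]{2})(?:[-_]([A-Za-z]{2}))?")


def normalize_language(tag: str) -> str:
    """Map a loosely written language value to the "xx" or "xx-YY" form."""
    cleaned = tag.strip()
    match = _TAG.fullmatch(cleaned)
    if match:
        lang, region = match.group(1).lower(), match.group(2)
        return f"{lang}-{region.upper()}" if region else lang
    # Fall back to the alias table; unknown values pass through untouched.
    return ALIASES.get(cleaned.lower(), cleaned)


# Entries shaped like the YAML references in the minimal example.
references = [
    {"id": "eng", "title": "A test", "language": "eng"},
    {"id": "englat", "title": "A test", "language": "englat"},
    {"id": "english", "title": "A test", "language": "English"},
]
for ref in references:
    ref["language"] = normalize_language(ref["language"])
```

After the loop, the "eng" and "English" entries carry the language "en", so they would be treated as English under the spec's rule, while "englat" keeps its original value and stays in sentence case.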