ISO-TC211/XML

Codelists for language and characterset

smrgeoinfo opened this issue · 6 comments

Hi, I noticed this issue is still open. Does anyone know which code lists people are using instead?

Hello,

The Library of Congress is the official source for the ISO 639-2 and ISO 639-5 language codes (see https://www.loc.gov/standards/iso639-2/ and https://www.loc.gov/standards/iso639-5/). That page also provides a link to the official source of ISO 639-3 language codes.

Unfortunately, the code lists are not always available for download in an XML format (the ISO 639-2 codes, for example, can be downloaded in a CSV-based format), let alone as an ISO 19139 code list dictionary. In a recent project we implemented a ShapeChange transformation that loads the ISO 639-2 codes into an application schema with a code list LanguageCode and derives an ISO 19139 code list dictionary from it. The downside of this approach is that you need to keep the dictionary in sync with the official source. That transformation will be part of the next release of ShapeChange, which is planned for the end of summer this year.
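
For anyone who wants to try the same idea without ShapeChange, here is a rough Python sketch of the conversion step. It is not the ShapeChange transformation described above, only an illustration. Assumptions on my side: the downloaded file is pipe-delimited with the columns alpha3-b|alpha3-t|alpha2|English name|French name, the target is a gmx:CodeListDictionary, and the GML namespace is the 3.2 one (older ISO 19139 schema versions use a different GML namespace). The function name and the gml:id pattern are made up for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions; check them against the schema version you target.
GMX = "http://www.isotc211.org/2005/gmx"
GML = "http://www.opengis.net/gml/3.2"


def build_codelist_dictionary(pipe_text, list_id="LanguageCode"):
    """Build a gmx:CodeListDictionary element from pipe-delimited rows.

    Assumed column order: alpha3-b|alpha3-t|alpha2|English name|French name.
    """
    ET.register_namespace("gmx", GMX)
    ET.register_namespace("gml", GML)

    root = ET.Element(f"{{{GMX}}}CodeListDictionary", {f"{{{GML}}}id": list_id})
    ET.SubElement(root, f"{{{GML}}}description").text = "ISO 639-2 language codes"
    ident = ET.SubElement(root, f"{{{GML}}}identifier", {"codeSpace": "ISO 639-2"})
    ident.text = list_id

    for row in csv.reader(io.StringIO(pipe_text), delimiter="|"):
        # Skip blank or incomplete lines.
        if len(row) < 4 or not row[0].strip():
            continue
        # Drop a leading byte order mark if the file starts with one.
        code = row[0].strip().lstrip("\ufeff")
        english_name = row[3].strip()

        entry = ET.SubElement(root, f"{{{GMX}}}codeEntry")
        definition = ET.SubElement(
            entry,
            f"{{{GMX}}}CodeDefinition",
            {f"{{{GML}}}id": f"{list_id}_{code}"},  # hypothetical id pattern
        )
        ET.SubElement(definition, f"{{{GML}}}description").text = english_name
        code_ident = ET.SubElement(
            definition, f"{{{GML}}}identifier", {"codeSpace": "ISO 639-2"}
        )
        code_ident.text = code
    return root


if __name__ == "__main__":
    # Two sample rows in the assumed layout, not the full official list.
    sample = "eng||en|English|anglais\nfre|fra|fr|French|français\n"
    print(ET.tostring(build_codelist_dictionary(sample), encoding="unicode"))
```

The sync problem mentioned above remains: the script would have to be re-run whenever the official list changes.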

Lexvo.org is a nice resource in this space. It provides machine-readable multilingual descriptions, e.g.
http://lexvo.org/id/iso639-3/eng

See http://www.lexvo.org/linkeddata/resources.html

And the LoC/MARC linked data service is good too, e.g.
https://id.loc.gov/vocabulary/languages/eng
https://id.loc.gov/vocabulary/countries/xxu

The full set is here: https://id.loc.gov/
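
If you want to build these URIs from codes in your metadata, a small Python sketch could look like the following. The URI patterns are taken from the examples above. The function names are made up, and the fetch helper assumes the services answer HTTP content negotiation with RDF/XML, which I have not verified here. Note that the two services use different code sets: the Lexvo pattern takes ISO 639-3 codes, while the LoC vocabulary uses MARC language codes, and these differ for some languages (French is fra in ISO 639-3 but fre in MARC).

```python
import urllib.request

LEXVO_BASE = "http://lexvo.org/id/iso639-3/"
LOC_LANGUAGES_BASE = "https://id.loc.gov/vocabulary/languages/"


def lexvo_uri(iso639_3_code):
    """Return the Lexvo.org URI for an ISO 639-3 code such as 'eng'."""
    return LEXVO_BASE + iso639_3_code.strip().lower()


def loc_language_uri(marc_code):
    """Return the id.loc.gov URI for a MARC language code such as 'eng'."""
    return LOC_LANGUAGES_BASE + marc_code.strip().lower()


def fetch_description(uri, media_type="application/rdf+xml", timeout=10):
    """Fetch a machine-readable description of the resource.

    Assumption: the server honours the Accept header. Requires network access.
    """
    request = urllib.request.Request(uri, headers={"Accept": media_type})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(lexvo_uri("eng"))
    print(loc_language_uri("eng"))
```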

@dr-shorthair Thanks for the reference to https://id.loc.gov/! I was not aware of it.

@jechterhoff @dr-shorthair Thank you for your responses!