/glottolog

Collaborative data curation for Glottolog

Primary LanguageTeXOtherNOASSERTION

The Glottolog Data Repository

The Glottolog data repository is the place where the data served by the Glottolog web application is curated. But the repository also provides an alternative way to access Glottolog's data locally, and possibly even locally customized data by forking glottolog/glottolog.

Accessing Glottolog data

  • This repository is the place where Glottolog data is curated. So it's the right place to open issues about errors you identified and to propose changes. A clone of this repository is also the right thing if you need access to all of Glottolog's data, possibly including older versions and the history of changes. Since the format of the data here is tailored towards maintainability - and not towards accessibility - you might want to use the Python package pyglottolog to access it programmatically.
  • glottolog.org - the Glottolog website - may be the most convenient place to inspect and browse the latest released version of Glottolog data. It also provides access to various download formats, tailored towards various re-use scenarios.
  • glottolog as CLDF dataset is probably the best option for accessing all of Glottolog's languoid data. Due to the format being CLDF, it can be used from all kinds of programming environments such as spreadsheet programs, programming languages like R or python, or the UNIX shell. A description of the files in this datasets is available in the README.

How-to cite

Only released versions of the Glottolog data should be cited. These releases are archived with and available from ZENODO at https://doi.org/10.5281/zenodo.596479

Types of data in Glottolog

Languoids

Data about Glottolog languoids (languages, dialects or sub-groups, aka families) is stored in text files (one per languoid) formatted as INI files in the languoids/tree subdirectory. The directory tree mirrors the Glottolog classification of languages.

References

The Glottolog bibliography is curated as a set of BibTeX files in the references/bibtex subdirectory, which are merged into a single reference database for each release/edition.

Metadata

Metadata - e.g. controlled vocabularies for some of the languoid data - are stored as INI files in the config subdirectory.