Collection Descriptions

Primary language: Python. License: Creative Commons Attribution 4.0 International (CC-BY-4.0).

On October 19, 2023

  • The cd repository's review branch was merged with main and the repository was archived (#498)
  • The official repository of the TDWG Latimer Core standard moved to https://github.com/tdwg/ltc.

Collection Descriptions Interest Group

This was the original repository for the Collection Descriptions Interest Group during the development phase of the Latimer Core collection description standard. If you need to consult this repository for any reason, note that the latest version of its contents is in the review branch.

About the group

The Collection Descriptions Interest Group is the parent interest group that is chartering a task group to develop a Collection Descriptions metadata standard.

The new Collection Descriptions standard will be a successor to the unratified draft Natural Collections Description data standard, whose development has been discontinued.

The day-to-day operations of the Interest Group are documented in this repository. You can also track and participate in the work of the group by watching this repository and monitoring the group's issue tracker.

Quick reference index

For quick reference, an index of classes and properties, and summaries of the current data model can be found in this Google sheet.

Members

Conveners

| Name | Affiliation | GitHub username or Email |
| --- | --- | --- |
| Sharon Grant | Field Museum, Chicago | @rondlg |
| Janeen Jones | Field Museum, Chicago | @fmjjones |
| Kate Webbink | Field Museum, Chicago | @magpiedin |
| Matt Woodburn | Natural History Museum London | @mswoodburn |

Core contributors

| Name | Affiliation | GitHub username or Email |
| --- | --- | --- |
| Jutta Buschbom | Statistical Genetics | @jbstatgen |
| Sarah Vincent | Natural History Museum London | @essvee |
| Maarten Trekels | Meise Botanic Garden/Synthesys+ | @mtrekels |
| Quentin Groom | Meise Botanic Garden/TDWG/Synthesys+ | @qgroom |

Expert Review Team

| Name | Affiliation | GitHub username or Email |
| --- | --- | --- |
| Ben Norton | North Carolina Museum of Natural Sciences | @ben-norton |
| Rob Sanderson | Yale | |
| Ian Engelbrecht | University of Pretoria | |
| Steve Baskauf | Vanderbilt | @baskaufs |

Contributors

| Name | Affiliation | GitHub username or Email |
| --- | --- | --- |
| David Bloom | VertNet | @dbloom |
| Gabi Droege | Botanic Garden and Botanical Museum Berlin / Global Genome Biodiversity Network | @gdadade |
| Deborah Paul (past Co-convenor) | Species File Group, INHS | @debpaul |
| Niels Raes | Naturalis Biodiversity Center | niels.raes AT naturalis.nl |
| Mike Trizna | Smithsonian Institution | @MikeTrizna |
| William Ulate | Missouri Botanical Garden / Centro de Investigación en Informática de la Biodiversidad (CRBio.org) | @WUlate |

Members

| Name | Affiliation | GitHub username or Email |
| --- | --- | --- |
| Wouter Addink | Naturalis Biodiversity Center | @wouteraddink |
| James Beach | University of Kansas | beach53 AT gmail.com |
| Allison Becker | Smithsonian Institution / National Museum of Natural History | |
| Joana Beja | Flanders Marine Institute | |
| Stan Blum | TDWG | @stanblum |
| Ana Casino | CETAF | ana.casino AT cetaf.org |
| Terry Catapano | UCB | @tcatapano |
| Arthur Chapman | Australian Biodiversity Information Services | |
| Cat Chapman | iDigBio | |
| Heather Cole | Ag Canada | Heather.Cole AT AGR.GC.CA |
| Johanna Eder | State Museum of Natural History Stuttgart | |
| Dag Endresen | University of Oslo Natural History Museum | @dagendresen |
| Falko Glöckler | MfN Berlin | falko.gloeckler AT mfn-berlin.de |
| Andrea Hahn | GBIF | @ahahn-gbif |
| Jean-Marc Herpers | RBINS | |
| Olle Hints | Tallinn University of Technology | |
| Donald Hobern | GBIF | @dhobern |
| Jana Hoffman | MfN Berlin | jana.hoffmann AT mfn-berlin.de |
| Morten Høfft | GBIF | |
| Sharif Islam | Naturalis Biodiversity Center | @sharifX |
| Natalya Ivanova | Institute of Mathematical Problems of Biology, Russian Academy of Sciences | |
| Shelley James | Department of Biodiversity, Conservation & Attractions, Western Australia; TDWG | @grungle |
| Gail Kampmeier | TDWG | |
| Talia Karim | Museum of Natural History, Museum of Colorado | |
| Niels Klazenga | Royal Botanic Gardens Victoria | |
| Dimitris Koureas | Naturalis | @dkoureas |
| Erica Krimmel | iDigBio / Florida State University | |
| Kerstin Lehnert | Columbia University | @klehnert55 |
| Holly Little | Smithsonian Institution / National Museum of Natural History | @hollyel |
| Tina Loo | Naturalis Biodiversity Center | |
| Anissa Lybaert | Ag Canada | Anissa.lybaert AT agr.gc.ca |
| James Macklin | Ag Canada | @jmacklin |
| Patricia Mergen | Meise Botanic Garden | |
| Giles Miller | Natural History Museum London | |
| Gil Nelson | iDigBio | |
| Raoul Palese | Conservatoire et Jardin botaniques de la Ville de Genève | |
| Mareike Petersen | MfN | Mareike.Petersen AT mfn.berlin |
| Judith Price | CMN (retired) | |
| Joel Ramirez | NYBG | @jlramirez |
| Isabel Reu | Consejo Superior de Investigaciones Científicas, CSIC | |
| Connie Rinaldo | Harvard University | crinaldo AT oeb.harvard.edu |
| Tim Robertson | GBIF | |
| Hanieh Saeedi | Senckenberg Research Institute and Natural History Museum | |
| Íris Sampaio | University of the Azores / Senckenberg am Meer | |
| Celia Santos | Consejo Superior de Investigaciones Científicas, CSIC | |
| Dave Smith | Natural History Museum London | d.a.smith AT nhm.ac.uk |
| Rebecca Snyder | Smithsonian Institution / National Museum of Natural History | |
| Barbara Thiers | NYBG | bthiers AT nybg.org |
| Caitlin Thorn | MfN Berlin | |
| Laura Tilley | CETAF | |
| Mike Trizna | Smithsonian Institution | |
| Pascal Tschudin | University of Basel | |
| Melissa Tulig | NYBG | mtulig AT nybg.org |
| William Ulate | Missouri Botanical Garden | |
| Wim van Dongen | Picturae | @cannedit |
| Sabine Von Mering | MfN Berlin | |
| Wiebke Walbaum | State Museum of Natural History Stuttgart | |
| Ramona Walls | CyVerse | @ramonawalls |
| Karin Wiltschke | Natural History Museum Vienna | |
| Paula Zermoglio | Universidad de Buenos Aires | |

Collection Descriptions Standard (CD) Repository Navigation

This README.md explains how to contribute and where to find materials related to the development of the collection descriptions data standard. Where needed, a very brief description accompanies each link shared below. This group manages development using GitHub as much as possible.

A (not so) brief description of our group

A detailed description of our rationale and goals, motivation, tasks, and strategy. This document outlines the goals and objectives of the task group and the plan for reaching them.

The community is asked to review these and add to them if they see a missing use case.

This document gathers some of the key known issues to keep in mind as the CD standard is developed. It is meant to help guide and structure both design and implementation considerations of CD and resulting products that CD enables.

CD Way of Work

As much as possible, each group takes on a self-selected task and manages its delivery as it chooses (meeting as needed). Groups may keep working documents wherever they choose (Google Docs or elsewhere), but should upload summary and completed documents directly to GitHub in the appropriate folder (e.g. meetings and documents) for that task. Where possible, links to external working documents should be added to the document links page to make them easily findable by TG members.

The CD TG as a whole meets once a month, on the 4th Wednesday of each month (2019), except where holidays require the date or time to change. Meetings are held twice on that day (one Eastern-time friendly, one Western).
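
The 4th-Wednesday rule can be worked out with a few lines of Python. This is only an illustrative sketch: the function name `fourth_wednesday` is hypothetical, and it does not model the holiday exceptions mentioned above or the two meeting times.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday() numbering: Monday is 0, Sunday is 6


def fourth_wednesday(year, month):
    """Return the date of the 4th Wednesday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first Wednesday (0 to 6).
    days_to_first_wednesday = (WEDNESDAY - first.weekday()) % 7
    # The 4th Wednesday falls three weeks after the first one.
    return first + timedelta(days=days_to_first_wednesday + 21)


if __name__ == "__main__":
    # List the nominal 2019 meeting dates, before any holiday adjustments.
    for month in range(1, 13):
        print(fourth_wednesday(2019, month).isoformat())
```

The result always falls between the 22nd and the 28th of the month, so the rule never collides with a possible 5th Wednesday.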

We started with a spreadsheet acting as a template for a Gantt-style chart of all our envisioned tasks with dependencies. From this chart, we created GitHub milestones where each group can manage tracking the issues and timelines related to that task. These tasks are now each grouped into GitHub projects.

  1. Landscape and requirements analysis
  2. Communication plan
  3. Data model
  4. Data standards
  5. Documentation
  6. Reference examples
  7. Develop extensions

To manage group activities in more detail, TG members can add new issues and allocate them to the appropriate project and milestone on the right-hand side of the form. This will mean that issues are displayed on the appropriate project page, and their statuses can be easily monitored.

CD Events

  • 2016 met at TDWG
  • 2017 met at TDWG
  • 2018 met at SPNHC-TDWGNZ, with some online meetings
  • 2019 plans to meet in-person and at Biodiversity Next
  • 2020 deliver a standard with implementations
  • 2020-2022 something about a pandemic.

Reference and Historical Materials

This current effort evolves from work started over 10 years ago by the Natural Collections Description Standard IG/TG group (NCD). Here we attempt to link to materials resulting from their efforts. These documents provide a foundation for the CD Group. Some have been copied over into this CD Repo to ensure they do not get lost.

Old NCD repository

NCD Repo wiki
NCD Code page
NCD standard versions
NCD Draft Specification
NCD cross-walks
NCD TG Charter
NCD use cases on NCD Repo wiki

Other historical docs

2016 and 2017 interest group abstracts
Google Doc with 2016-2017 meeting notes

Glossary

  • CETAF - Consortium of European Taxonomic Facilities. A European network of Natural Science Museums, Natural History Museums, Botanical Gardens and Biodiversity Research Centres with their associated biological collections and research expertise.
  • EML - Ecological Metadata Language. EML is a metadata standard developed by the ecology discipline and for the ecology discipline. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications). EML is implemented as a series of XML document types that can be used in a modular and extensible manner to document ecological data. Each EML module is designed to describe one logical part of the total metadata that should be included with any ecological dataset.
  • DiSSCo - Distributed System of Scientific Collections. DiSSCo is a new pan-European Research Infrastructure initiative of 21 European countries with a vision to position European natural science collections at the centre of data-driven scientific excellence and innovation in environmental research, climate change, food security, one health and the bioeconomy.
  • GBIF - Global Biodiversity Information Facility. GBIF—the Global Biodiversity Information Facility—is an international network and research infrastructure funded by the world’s governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth.
  • GRBio - Global Repository of Biodiversity Repositories. GRBio is now shepherded by GBIF. Developments underway include implementing the CD standard to support sharing of collection-level metadata worldwide.
  • ICEDIG - Innovation and consolidation for large scale digitisation of natural heritage. An EU-funded project that aims to support the implementation phase of the new Research Infrastructure DiSSCo (“Distributed System of Scientific Collections”) by designing and addressing the technical, financial, policy and governance aspects necessary to operate such a large distributed initiative for natural sciences collections across Europe.
  • iDigBio - Integrated Digitized Biocollections. An NSF-funded initiative to provide access and capacity/community support for digitization, data mobilization, and use of scientific collections both neontological and paleontological.
  • MOBILISE - Mobilising Data, Policies and Experts in Scientific Collections. European Natural Science Collections host approximately 1.5 billion biological and geological collection objects, which represent about 80% of the known current and past biological and geological diversity on earth. The scope of MOBILISE is to foster a cooperative network in Europe to support excellent research activities, and facilitate knowledge and technology transfer around natural science collections. This will prepare the ground for a future pan-European Distributed System of Scientific Collections (DiSSCo).
  • RDF - Resource Description Framework. From the W3C Semantic Web: RDF is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed. Also see the TDWG Beginner's Guide to RDF.
  • SPNHC - The Society for the Preservation of Natural History Collections [SPNHC] is an international society whose mission is to improve the preservation, conservation and management of natural history collections to ensure their continuing value to society.
  • SYNTHESYS - Synthesis of Systematic Resources. SYNTHESYS+ is a European Commission-funded project creating an integrated European infrastructure for natural history collections.

Repo structure

The current repository structure is described below.

├── README.md                   : Description of this repository
├── LICENSE                     : Repository license
│
├── charters                    : Interest Group and Task Group charters
│   └── draft                   : Draft charters and historical versions
│
├── documents                    
│   ├── draft                   : Working folder for draft documents
│   ├── final                   : Final versions of group documents
│   ├── historical              : Historical and deprecated documents, and snapshots of external drafts in Google Docs etc
│   └── DOCUMENT_LINKS.md       : Links to working documents in Google Docs, Office 365 etc
│
├── meetings                    : Agendas and minutes of IG and TG meetings
│
├── reference
│   ├── crosswalks              : Crosswalks of existing and previous collection descriptions standards and initiatives
│   ├── use_cases               : Documented use cases for a collection descriptions standard
│   └── REFERENCE_LINKS.md      : Links to relevant information resources (publications, sites etc)
│
├── standard
│   ├── data_model              : Data model definitions, schemata and diagrams
│   └── vocabularies            : Controlled vocabularies, ontologies etc relevant to the standard
│
├── tools                       : Ad hoc tools used to support the development of the standard and data model
└── .gitignore                  : Files and directories to be ignored by git
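
As a convenience, a local clone can be checked against the layout above with a short script. This is a minimal sketch, not part of the repository: the names `EXPECTED_PATHS` and `missing_paths` are hypothetical, and the path list is simply transcribed from the tree above.

```python
from pathlib import Path

# Paths transcribed from the tree above, relative to the repository root.
EXPECTED_PATHS = [
    "README.md",
    "LICENSE",
    "charters/draft",
    "documents/draft",
    "documents/final",
    "documents/historical",
    "documents/DOCUMENT_LINKS.md",
    "meetings",
    "reference/crosswalks",
    "reference/use_cases",
    "reference/REFERENCE_LINKS.md",
    "standard/data_model",
    "standard/vocabularies",
    "tools",
    ".gitignore",
]


def missing_paths(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    # "." assumes the script is run from the root of a local clone.
    for path in missing_paths("."):
        print(f"missing: {path}")
```

An empty result means every file and folder listed in the tree is present. The check only tests for existence and says nothing about folder contents.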

Preferred citation

Collections Descriptions interest group. 2019. Collection Descriptions (CD), in development. Biodiversity Information Standards (TDWG) http://www.tdwg.org/standards/