Collection Descriptions

License: Creative Commons Attribution 4.0 International (CC-BY-4.0)

Collection Descriptions Interest Group

This is the repository for the Collection Descriptions Interest Group.

About the group

The Collection Descriptions Interest Group is the parent interest group that is chartering a task group to develop a Collection Descriptions metadata standard.

The new Collection Descriptions standard will be a successor to the unratified draft Natural Collections Description data standard, whose development has been discontinued.

The day-to-day operations of the Interest Group are documented in this repository. You can also track and participate in the work of the group by watching this repository and monitoring the group's issue tracker.

Members

Conveners

| Name | Affiliation | Email |
| --- | --- | --- |
| Deborah Paul | iDigBio | dpaul@fsu.edu |
| Matt Woodburn | Natural History Museum London | m.woodburn@nhm.ac.uk |

Core Members

| Name | Affiliation | Email |
| --- | --- | --- |
| Wouter Addink | Naturalis Biodiversity Center | wouter.addink@naturalis.nl |
| Mike Trizna | Smithsonian Institution | triznam@si.edu |
| Janeen Jones | Field Museum | jjones@fieldmuseum.org |
| Sharon Grant | Field Museum | sgrant@fieldmuseum.org |
| Kate Webbink | Field Museum | kwebbink@fieldmuseum.org |
| Connie Rinaldo | Harvard University | crinaldo@oeb.harvard.edu |
| Carolyn Sheffield | Smithsonian Libraries / BHL | sheffieldc@si.edu |
| Dag Endresen | University of Oslo Natural History Museum | dag.endresen@nhm.uio.no |
| Holly Little | Smithsonian Institution / National Museum of Natural History | littleh@si.edu |
| Ramona Walls | CyVerse | rwalls@cyverse.org |
| Kerstin Lehnert | Columbia University | lehnert@ldeo.columbia.edu |
| Niels Raes | Naturalis Biodiversity Center | niels.raes@naturalis.nl |
| Dave Smith | Natural History Museum London | d.a.smith@nhm.ac.uk |
| Mareike Petersen | Museum für Naturkunde | mareike.petersen@mfn.berlin |
| William Ulate | Missouri Botanical Garden | william_ulate_r@yahoo.com |
| Donald Hobern | GBIF | dhobern@gbif.org |
| Barbara Thiers | NYBG | bthiers@nybg.org |
| Kevin Love | iDigBio | klove@flmnh.ufl.edu |
| Andrea Hahn | GBIF | ahahn@gbif.org |
| James Macklin | Ag Canada | james.macklin@agr.gc.ca |
| Anissa Lybaert | Ag Canada | Anissa.lybaert@agr.gc.ca |
| Joel Ramirez | NYBG | jramirez@nybg.org |
| Melissa Tulig | NYBG | mtulig@nybg.org |
| Falko Glöckler | MfN Berlin | falko.gloeckler@mfn-berlin.de |
| Jana Hoffman | MfN Berlin | jana.hoffmann@mfn-berlin.de |
| David Bloom | VertNet | dbloom@vertnet.org |
| Steve Baskauf | Vanderbilt | steve.baskauf@vanderbilt.edu |
| Mareike Petersen | MfN | Mareike.Petersen@mfn.berlin |
| James Beach | University of Kansas | beach53@gmail.com |
| Terry Catapano | UCB | catapanoth@gmail.com |
| Stan Blum | TDWG | stanblum@gmail.com |
| Dimitris Koureas | Naturalis | dimitris.koureas@naturalis.nl |
| Judith Price | CMN (retired) | |
| Sarah Vincent | Natural History Museum London | s.vincent@nhm.ac.uk |
| Heather Cole | Ag Canada | Heather.Cole@AGR.GC.CA |
| Shelley James | RBGS | Shelley.James@rbgsyd.nsw.gov.au |
| Quentin Groom | Meise Botanic Garden/TDWG/Synthesys+ | quentin.groom@plantentuinmeise.be |
| Ana Casino | CETAF | ana.casino@cetaf.org |
| Wim van Dongen | Picturae | w.vandongen@picturae.com |
| Sharif Islam | Naturalis Biodiversity Center | sharif.islam@naturalis.nl |

Collection Descriptions Standard (CD) Repository Navigation

This README.md page explains how to contribute and where to find materials related to the development of the collection descriptions data standard. Where needed, each link below comes with a very brief description of its contents. The group manages development on GitHub as much as possible.

A (not so) brief description of our group

A detailed description of our rationale and goals, motivation, tasks, and strategy. This document outlines the goals and objectives of the task group and the plan for reaching them.

The community is asked to review these and add to them if they see a missing use case.

This document gathers some of the key known issues to keep in mind as the CD standard is developed. It is meant to help guide and structure both design and implementation considerations of CD and resulting products that CD enables.

CD Way of Work

As much as possible, each group takes on a self-selected task and manages its delivery as it chooses (meeting as needed). Groups may keep working documents wherever they choose (Google Docs or elsewhere) but will upload summary and completed documents directly to GitHub in the appropriate folder (e.g. meetings and documents) for that task. Where possible, links to external working documents should be added to the document links page so that TG members can find them easily.

The CD TG as a whole meets once a month, on the 4th Wednesday of each month (2019), except where holidays require the date or time to change. Meetings are held twice on that day (one Eastern-time friendly, one Western).
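For anyone who wants to put the schedule into a calendar, the "4th Wednesday of each month" rule can be computed with the Python standard library. This is only a minimal sketch: the function name `fourth_wednesday` is made up for illustration, and it does not know about holiday exceptions or meeting times, which the group sets by hand.

```python
import calendar
from datetime import date


def fourth_wednesday(year: int, month: int) -> date:
    """Return the date of the 4th Wednesday of the given month."""
    # monthcalendar() yields one list per week (Monday first by default);
    # days that fall outside the month are reported as 0.
    weeks = calendar.monthcalendar(year, month)
    wednesdays = [
        week[calendar.WEDNESDAY] for week in weeks if week[calendar.WEDNESDAY] != 0
    ]
    # Every month has at least four Wednesdays, so index 3 always exists.
    return date(year, month, wednesdays[3])


# Nominal meeting dates for 2019, before any holiday adjustments.
meeting_dates = [fourth_wednesday(2019, month) for month in range(1, 13)]
for meeting in meeting_dates:
    print(meeting.isoformat())
```

The December 2019 result is 25 December, which is the kind of date the holiday exception above is meant to cover.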

We started with a spreadsheet acting as a template for a Gantt-style chart of all our envisioned tasks with dependencies. From this chart, we created GitHub milestones where each group can manage tracking the issues and timelines related to that task. These tasks are now each grouped into GitHub projects.

  1. Landscape and requirements analysis
  2. Communication plan
  3. Data model
  4. Data standards
  5. Documentation
  6. Reference examples
  7. Develop extensions

To manage group activities in more detail, TG members can add new issues and allocate them to the appropriate project and milestone on the right-hand side of the form. This will mean that issues are displayed on the appropriate project page, and their statuses can be easily monitored.

CD Events

  • 2016 met at TDWG
  • 2017 met at TDWG
  • 2018 met at SPNHC-TDWGNZ, with some online meetings
  • 2019 plans to meet in-person and at Biodiversity Next
  • 2020 deliver a standard with implementations

Reference and Historical Materials

The current effort evolves from work started over 10 years ago by the Natural Collections Description Standard IG/TG (NCD). Here we attempt to link to materials resulting from their efforts. These documents provide a foundation for the CD Group. Some have been copied into this CD Repo to ensure they do not get lost.

Old NCD repository

NCD Repo wiki
NCD Code page
NCD standard versions
NCD Draft Specification
NCD cross-walks
NCD TG Charter
NCD use cases on NCD Repo wiki

Other historical docs

2016 and 2017 interest group abstracts
Google Doc with 2016-2017 meeting notes

Glossary

  • CETAF - Consortium of European Taxonomic Facilities. A European network of Natural Science Museums, Natural History Museums, Botanical Gardens and Biodiversity Research Centres with their associated biological collections and research expertise.
  • EML - Ecological Metadata Language. A metadata standard developed by and for the ecology discipline. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications). EML is implemented as a series of XML document types that can be used in a modular and extensible manner to document ecological data. Each EML module is designed to describe one logical part of the total metadata that should be included with any ecological dataset.
  • DiSSCo - Distributed System of Scientific Collections. DiSSCo is a new pan-European Research Infrastructure initiative of 21 European countries with a vision to position European natural science collections at the centre of data-driven scientific excellence and innovation in environmental research, climate change, food security, one health and the bioeconomy.
  • GBIF - Global Biodiversity Information Facility. An international network and research infrastructure funded by the world’s governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth.
  • GRBio - Global Repository of Biodiversity Repositories. GRBio is now shepherded by GBIF. Developments underway include implementing the CD standard to support sharing of collection-level metadata worldwide.
  • ICEDIG - Innovation and consolidation for large scale digitisation of natural heritage. An EU-funded project that aims to support the implementation phase of the new Research Infrastructure DiSSCo (“Distributed System of Scientific Collections”) by designing and addressing the technical, financial, policy and governance aspects necessary to operate such a large distributed initiative for natural sciences collections across Europe.
  • iDigBio - Integrated Digitized Biocollections. An NSF-funded initiative to provide access and capacity/community support for digitization, data mobilization, and use of scientific collections both neontological and paleontological.
  • MOBILISE - Mobilising Data, Policies and Experts in Scientific Collections. European Natural Science Collections host approximately 1.5 billion biological and geological collection objects, which represent about 80% of the known current and past biological and geological diversity on earth. The scope of MOBILISE is to foster a cooperative network in Europe to support excellent research activities, and facilitate knowledge and technology transfer around natural science collections. This will prepare the ground for a future pan-European Distributed System of Scientific Collections (DiSSCo).
  • RDF - Resource Description Framework. From the W3C Semantic Web: RDF is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed. Also see the TDWG Beginner's Guide to RDF.
  • SPNHC - Society for the Preservation of Natural History Collections. An international society whose mission is to improve the preservation, conservation and management of natural history collections to ensure their continuing value to society.
  • SYNTHESYS - Synthesis of Systematic Resources. SYNTHESYS+ is a European Commission-funded project, creating an integrated European infrastructure for natural history collections.

Repo structure

The current repository structure is described below.

├── README.md                   : Description of this repository
├── LICENSE                     : Repository license
│
├── charters                    : Interest Group and Task Group charters
│   └── draft                   : Draft charters and historical versions
│
├── documents                    
│   ├── draft                   : Working folder for draft documents
│   ├── final                   : Final versions of group documents
│   ├── historical              : Historical and deprecated documents, and snapshots of external drafts in Google Docs etc
│   └── DOCUMENT_LINKS.md       : Links to working documents in Google Docs, Office 365 etc
│
├── meetings                    : Agendas and minutes of IG and TG meetings
│
├── reference
│   ├── crosswalks              : Crosswalks of existing and previous collection descriptions standards and initiatives
│   ├── use_cases               : Documented use cases for a collection descriptions standard
│   └── REFERENCE_LINKS.md      : Links to relevant information resources (publications, sites etc)
│
├── standard
│   ├── data_model              : Data model definitions, schemata and diagrams
│   └── vocabularies            : Controlled vocabularies, ontologies etc relevant to the standard
│
└── .gitignore                  : Files and directories to be ignored by git
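
A local checkout can be compared against the layout above with a short script. This is a sketch, not a tool shipped with the repository: the names `EXPECTED_PATHS` and `missing_paths` are made up for illustration, and the list is simply transcribed from the tree as described here, so it would need updating if the structure changes.

```python
from pathlib import Path

# Paths transcribed from the repository tree above (relative to the repo root).
EXPECTED_PATHS = [
    "README.md",
    "LICENSE",
    "charters/draft",
    "documents/draft",
    "documents/final",
    "documents/historical",
    "documents/DOCUMENT_LINKS.md",
    "meetings",
    "reference/crosswalks",
    "reference/use_cases",
    "reference/REFERENCE_LINKS.md",
    "standard/data_model",
    "standard/vocabularies",
    ".gitignore",
]


def missing_paths(repo_root: str) -> list[str]:
    """Return the expected files and folders that are absent under repo_root."""
    root = Path(repo_root)
    return [path for path in EXPECTED_PATHS if not (root / path).exists()]


# Report anything missing from the current working directory.
for path in missing_paths("."):
    print(f"missing: {path}")
```

Note that git does not track empty directories, so a fresh clone may legitimately report some folders as missing until they contain at least one file.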

Preferred citation

Collection Descriptions Interest Group. 2019. Collection Descriptions (CD), in development. Biodiversity Information Standards (TDWG) http://www.tdwg.org/standards/