/proteomics-metadata-standard

The Proteomics Experimental Design file format: Standard for experimental design annotation

Primary LanguageShell

Proteomics Experiment Design Project

1. Improving metadata annotation of Proteomics datasets

The Proteomics Experiment Design project aims to define a set of guidelines and file formats to support the annotation of experiment design in Proteomics submissions. The Proteomics Team advocates for open access and reuse of proteomics datasets and works towards providing concrete solutions to achieve this. Our goal with the Experiment Design format is to enssure maximum reusability of the deposited data. Our work aims to define the minimum information required to report the proteomics experiment design, enabling the use and reuse of the deposited data by the proteomics community.

The following Use Cases should be considered to design the Proteomic Experiment design data format:

  • The experiment design file format will complement the proteomeXchange.xml file format implemented by ProteomeXchange to capture the minimum metadata on a Proteomics dataset. The ProteomeXchange submission XML file format is detailed here.

  • The experiment design format SHOULD enable submitters and curators to annotate the proteomics dataset from the sample metadata (e.g. Organism) to the experiment protocol (e.g. Instrument model).

  • The Experiment design format SHOULD facilitate the automatic reanalysis of public proteomics data. The experiment design aims to enable the reanalysis of quantitative data and provide a better representation of quantitative data in public repositories.

2. Notational conventions

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in RFC-2119 (Bradner 1997).

3. Ontologies

The Experiment Design format should be based on ontology terms (e.g. UNIMOD-35). An ontology encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities that substantiate one, many or all domains of discourse. All Ontologies used in the Proteomic Experiment Design format MUST be indexed in the Ontology Lockup Service. The current ontologies supported in the format are:

⚠️
If you you are contributing with the following guidelines and file format, and WOULD like to add another ontology; please modify the list with a Pull Request.

4. Experiment design structure

5. Core contributors and collaborators

The project is run by different groups:

  • Yasset Perez-Riverol (PRIDE Team, European Bioinformatics Institute - EMBL-EBI, U.K.)

  • Timo Sachsenberg (OpenMS Team, Tübingen University, Germany)

  • Anja Fullgrabe (ExpressionAtlas Team, European Bioinformatics Institute - EMBL-EBI, U.K.)

  • Nancy George (ExpressionAtlas Team, European Bioinformatics Institute - EMBL-EBI, U.K.)

  • Mathias Walzer (PRIDE Team, European Bioinformatics Institute - EMBL-EBI, U.K.)

  • Pablo Moreno (ExpressionAtlas Team, European Bioinformatics Institute - EMBL-EBI, U.K.)

  • Oliver Alka (OpenMS Team, Tübingen University, Germany)

  • Julianus Pfeuffer (OpenMS Team, Tübingen University, Germany)

  • Marc Vaudel (University of Bergen, Norway)

  • Harald Barsnes (University of Bergen, Norway)

  • Niels Hulstaert (Compomics, University of Gent, Belgium)

  • Lennart Martens (Compomics, University of Gent, Belgium)

  • Expression Atlas Team (European Bioinformatics Institute - EMBL-EBI, U.K.)

If you contribute with the following specification, please make sure to add your name to the list of contributors.

6. Code of Conduct

As part of our efforts toward delivering open and inclusive science, we follow the [Contributor Convenant](https://www.contributor-covenant.org) [Code of Conduct for Open Source Projects](docs/CODE_OF_CONDUCT.md).