microbiomedata/DataHarmonizer

align section composition and ordering with @mslarae13's Example Use tab

Closed this issue · 1 comment

  • from SNTC (Soil-NMDC-Template_Compiled) to LinkML
    • slots within the soil_biosample class in the annotated soil biosample YAML file
    • slot/column sections as LinkML annotations
    • slot/column order as LinkML annotations
    • section definitions and section ordering in LinkML ???
  • from LinkML to DH data.tsv
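A minimal sketch of the LinkML-to-DH ordering step, assuming each slot carries its section and its within-section position as annotations. The annotation keys `section` and `column_order`, the function name `order_columns`, and all slot and section names are hypothetical placeholders, not names taken from the NMDC schema or from linkml_to_dh_light.py.

```python
def order_columns(slots, section_order):
    """Return slot names sorted left to right for a DH template.

    slots: dict of slot name -> {"annotations": {"section": str, "column_order": int}}
           (hypothetical shape, loosely mirroring LinkML slot annotations)
    section_order: dict of section name -> integer rank
    """
    def sort_key(name):
        ann = slots[name]["annotations"]
        # Sort first by the rank of the slot's section, then by position in that section.
        return (section_order[ann["section"]], ann["column_order"])

    return sorted(slots, key=sort_key)


# Placeholder data for illustration only.
section_order = {"section_x": 1, "section_y": 2}
slots = {
    "slot_c": {"annotations": {"section": "section_y", "column_order": 1}},
    "slot_b": {"annotations": {"section": "section_x", "column_order": 2}},
    "slot_a": {"annotations": {"section": "section_x", "column_order": 1}},
}

ordered = order_columns(slots, section_order)
print(ordered)
```

A slot whose section has no entry in `section_order` raises a `KeyError` here, which may be the desired behavior if unknown sections should stop template generation.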

Implement within linkml_to_dh_light.py for the nmdc_biosample_slots and mixs_packages_x_slots tabs

Implement within inject_supplementary for all other tabs

Trying to write a clearer specification

  • Implement in linkml_round_trips/modular_gd.py
  • Write tests for this?

Background:

  • DataHarmonizer columns are grouped into sections
  • @mslarae13 knows
    • what the sections should be called
    • how the sections should be ordered left to right
    • which columns should appear in each section
    • how the columns should be ordered left to right within the sections
  • Each tab in Soil-NMDC-Template_Compiled that defines columns to appear in DH should
    • be tagged as input for soil DH template generation in tabSheetIdentification
    • have a column called section
  • There should be a tab called Sections_order that lists the section names with their orders, using columns named order and section
  • The values in Sections_order.section must match the section column values found in any tab that is declared as input for soil DH template generation in tabSheetIdentification
    • all Sections_order.section values must match at least one section column value in an input for soil DH template generation tab
    • all section column values in input for soil DH template generation tabs must match one of the Sections_order.section values
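The two-way matching rule above could be checked along these lines. This is only a sketch: it assumes the Sections_order tab and the input tabs have already been loaded as lists of row dictionaries, and the function name `check_sections`, the result keys, and the tab and section names in the example are hypothetical.

```python
def check_sections(sections_order_rows, input_tabs):
    """Compare Sections_order.section against the section columns of the input tabs.

    sections_order_rows: list of dicts with keys "order" and "section"
    input_tabs: dict of tab name -> list of row dicts, each with a "section" key
    Returns the mismatches in both directions; both lists are empty when the rule holds.
    """
    declared = {row["section"] for row in sections_order_rows}
    used = {row["section"] for rows in input_tabs.values() for row in rows}
    return {
        # Listed in Sections_order but not used by any input tab.
        "unused_in_sections_order": sorted(declared - used),
        # Used by an input tab but missing from Sections_order.
        "missing_from_sections_order": sorted(used - declared),
    }


# Placeholder data for illustration only.
sections_order_rows = [
    {"order": 1, "section": "section_x"},
    {"order": 2, "section": "section_y"},
]
input_tabs = {
    "tab_one": [{"section": "section_x"}, {"section": "section_z"}],
}

problems = check_sections(sections_order_rows, input_tabs)
print(problems)
```

Returning both lists, rather than raising on the first mismatch, lets a test report every inconsistent section name in one run.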

It is assumed that the section names and orders will remain the same for all NMDC DH templates.