DH Section review

Question

DataHarmonizer groups columns into sections. We should review:

Appearance (determined mostly by the Sections_order tab in Soil-NMDC-Template_Compiled)
- Names of sections
  - What happens to the user experience when the section names are especially long?
  - In the past, we talked about including the contents of category columns in the section names when they add something, for example biogeochemistry for the mixs_packages_x_slots tab.
  - Do we want to include all of these facets in section names?
    - MIxS vs MIxS modified
    - required, recommended, optional?
- Ordering of sections
- Assignment of columns to sections
- Ordering of columns within sections
Implementation
- The section modelling should be injected into the LinkML schema, not created on the fly when the DH data.tsv is being created. Solution started in #123
- Is it worth having a short name for each section and a more descriptive title? For now, I have replaced every occurrence of short names like 'biosample_id' with titles like 'Biosample Identification' across the whole Google Sheet. The one exception is a short_name column kept in the Sections_order tab, in case we want to revert.
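
To make the two implementation points concrete, here is a minimal sketch of what "sections in the schema" could look like. It assumes sections are expressed with LinkML's `slot_group` and `rank` metaslots, and that each section carries both a short name (the key) and a descriptive `title`. The schema fragment is written as a plain Python dict rather than LinkML YAML, and the section names, slot names, and the `sections_from_schema` helper are all hypothetical illustrations. This is not necessarily how #123 does it.

```python
from collections import defaultdict

# Hypothetical schema fragment (illustrative names only).
# Section slots have a short name (the key), a descriptive title, and a rank.
# Column slots point at their section through slot_group and carry their own rank.
SCHEMA = {
    "slots": {
        "sample_id_section": {"title": "Sample ID", "rank": 1},
        "biogeochemistry_section": {"title": "Biogeochemistry", "rank": 2},
        "samp_name": {"slot_group": "sample_id_section", "rank": 1},
        "source_mat_id": {"slot_group": "sample_id_section", "rank": 2},
        "tot_org_carb": {"slot_group": "biogeochemistry_section", "rank": 2},
        "ph": {"slot_group": "biogeochemistry_section", "rank": 1},
    }
}


def sections_from_schema(schema):
    """Return [(section title, [column names])] ordered by rank.

    Sections are ordered by the rank of the section slot, and columns
    within a section by the rank of the column slot.
    """
    slots = schema["slots"]
    grouped = defaultdict(list)
    for name, slot in slots.items():
        group = slot.get("slot_group")
        if group is not None:  # only column slots declare a slot_group
            grouped[group].append((slot.get("rank", 0), name))
    ordered_groups = sorted(grouped, key=lambda g: slots[g].get("rank", 0))
    return [
        (slots[g]["title"], [name for _, name in sorted(grouped[g])])
        for g in ordered_groups
    ]


for title, columns in sections_from_schema(SCHEMA):
    print(title, columns)
```

With this shape, renaming a section for display (say, shortening a title that spills past the column freeze) only touches `title`, and the short name stays stable for reverting or cross-referencing.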

Answer 1 · 2022-02-04T15:39:19.000Z

I changed Biosample Identification to Sample ID because the longer form spills over the gray line on the left that indicates the column freeze.

Answer 2 · 2024-07-12T20:01:02.000Z

I don't believe any of this is relevant anymore. If there are problems with the way slots are grouped into sections, please open an issue in the submission-schema repo if it is specifically about NMDC's schema, or in the main DataHarmonizer repo if it is about how DataHarmonizer interprets a schema.