
Primary language: Python. License: Creative Commons Attribution 4.0 International (CC-BY-4.0).

NHANES-metadata

This repository contains code and (meta)data components to be included in the NHANES database and tools.

Code

The code folder contains the programs that generate the metadata components described in the next subsections.

  • get_nhanes_metadata.R extracts and saves NHANES metadata from CDC using nhanesA.
  • generate_ontology_mappings.py generates mappings of the metadata to a select set of ontologies.
  • generate_ontology_tables.py downloads SemanticSQL ontology databases and exports tables needed to support ontology-based querying.
  • generate_nhanes_mapping_report.py uses the generic module generate_mapping_report.py to compute counts of direct and inherited mappings and add them to the ontology labels table.
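The direct and inherited counts mentioned for the mapping report can be illustrated with a minimal sketch. This is not the code in generate_mapping_report.py: the function name, the input shapes, the `EX:` identifiers, and the choice to have the inherited count include direct mappings are all assumptions made for illustration.

```python
from collections import defaultdict


def count_mappings(mappings, entailed_edges):
    """Count direct and inherited mappings per ontology term.

    mappings: iterable of (source_id, term) pairs (hypothetical shape).
    entailed_edges: iterable of (subject, object) pairs, where subject is a
        direct or indirect subclass of object (hypothetical shape).
    Returns {term: (direct_count, inherited_count)}. In this sketch the
    inherited count includes the term's own direct mappings (an assumption).
    """
    direct = defaultdict(set)
    for source_id, term in mappings:
        direct[term].add(source_id)

    # Start each term's inherited set from its direct mappings.
    inherited = defaultdict(set)
    for term, sources in direct.items():
        inherited[term] |= sources

    # Propagate the mappings of every subclass up to its ancestors.
    for subject, obj in entailed_edges:
        if subject in direct:
            inherited[obj] |= direct[subject]

    terms = set(direct) | set(inherited)
    return {t: (len(direct.get(t, ())), len(inherited[t])) for t in terms}


# Illustrative data only: identifiers are made up.
mappings = [
    ("VAR_A", "EX:systolic"),
    ("VAR_B", "EX:diastolic"),
    ("VAR_C", "EX:blood_pressure"),
]
entailed_edges = [
    ("EX:systolic", "EX:blood_pressure"),
    ("EX:diastolic", "EX:blood_pressure"),
]
counts = count_mappings(mappings, entailed_edges)
print(counts["EX:blood_pressure"])
```

Using sets of source identifiers rather than plain counters avoids counting a data point twice when it reaches an ancestor through more than one path.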

Metadata

The metadata folder contains three files with the extracted metadata:

  • nhanes_tables.tsv contains table identifiers, descriptions, years, etc.
  • nhanes_variables.tsv contains variable identifiers, labels, full text of questions, etc.
  • nhanes_variables_codebooks.tsv contains codebooks, which specify possible responses to survey questions (represented by survey variables).

Ontology Mappings

The ontology-mappings folder contains the output of running the text2term ontology mapping tool on the labels used to describe NHANES tables and variables.

  • nhanes_tables_mappings.tsv contains mappings of the table names that are specified in the TableName column of the nhanes_tables.tsv table.
  • nhanes_variables_mappings.tsv contains mappings of the variable labels that are specified in the SASLabel column of the nhanes_variables.tsv table.
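Because the variable mappings are keyed by the label text in SASLabel, they can be linked back to variables by joining on that column. The sketch below shows one way to do this with the standard library. Only `SASLabel` comes from the description above; the `Variable`, `Source Term`, `Mapped Term CURIE`, and `Mapping Score` column names, the sample rows, and the `min_score` parameter are assumptions, so check the actual file headers before reusing it.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-ins for the two TSV files (made-up rows and,
# apart from SASLabel, assumed column names).
VARIABLES_TSV = (
    "Variable\tSASLabel\n"
    "VAR_A\tSystolic blood pressure\n"
    "VAR_B\tBody weight\n"
)
MAPPINGS_TSV = (
    "Source Term\tMapped Term CURIE\tMapping Score\n"
    "Systolic blood pressure\tEX:0001\t0.95\n"
    "Body weight\tEX:0002\t0.40\n"
)


def read_tsv(text):
    """Parse TSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def mappings_by_variable(variables, mappings, min_score=0.0):
    """Return {variable: [mapped term, ...]} by joining on the label text."""
    by_label = defaultdict(list)
    for row in mappings:
        if float(row["Mapping Score"]) >= min_score:
            by_label[row["Source Term"]].append(row["Mapped Term CURIE"])
    return {v["Variable"]: by_label.get(v["SASLabel"], []) for v in variables}


result = mappings_by_variable(
    read_tsv(VARIABLES_TSV), read_tsv(MAPPINGS_TSV), min_score=0.5
)
print(result)
```

For the real files, replace the in-memory strings with `open(path, newline="")` handles passed to `csv.DictReader`.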

Ontology Tables

The ontology-tables folder contains table representations of ontology class hierarchies. The tables are extracted from readily available SemanticSQL builds of the ontologies, which are SQL databases:

  • ontology_labels.tsv contains the labels of all ontology terms.
  • ontology_edges.tsv contains the asserted relationships between ontology terms.
  • ontology_entailed_edges.tsv contains the entailed relationships between ontology terms, that is, both the asserted and the inferred ones.
  • ontology_synonyms.tsv contains the (exact) synonyms of all ontology terms.
  • ontology_dbxrefs.tsv contains database cross-references that relate ontology terms to other ontologies or databases.
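Exporting such a table from a SemanticSQL build amounts to running a query against a SQLite database and writing the rows as TSV. The following is a rough sketch of that step, not the logic of generate_ontology_tables.py. The `entailed_edge` table with `subject`, `predicate`, and `object` columns is assumed here, and the output header names, the filter on `rdfs:subClassOf`, and the in-memory stand-in database are illustrative choices.

```python
import csv
import io
import sqlite3


def export_table(conn, query, out, header):
    """Run a query and write its rows to `out` as TSV; return the row count."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    rows = conn.execute(query).fetchall()
    writer.writerows(rows)
    return len(rows)


# In-memory stand-in for a downloaded ontology database (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entailed_edge (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO entailed_edge VALUES (?, ?, ?)",
    [
        ("EX:systolic", "rdfs:subClassOf", "EX:blood_pressure"),
        ("EX:systolic", "rdfs:subClassOf", "EX:measurement"),
        ("EX:systolic", "EX:part_of", "EX:exam"),
    ],
)

buffer = io.StringIO()
n_rows = export_table(
    conn,
    "SELECT subject, object FROM entailed_edge "
    "WHERE predicate = 'rdfs:subClassOf' ORDER BY subject, object",
    buffer,
    ["Subject", "Object"],
)
print(buffer.getvalue())
```

With a real build, `sqlite3.connect` would point at the downloaded database file and `out` would be an open file handle instead of a `StringIO` buffer.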

These tables are combined with the ontology mappings to enable ontology-based search of mapped data points.
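As a rough illustration of what ontology-based search means here, the sketch below finds every variable mapped to a given term or to any of its subclasses. The function name, the input shapes, and all identifiers are hypothetical; in practice the inputs would be loaded from ontology_entailed_edges.tsv and the mapping files.

```python
def search_variables(term, variable_mappings, entailed_edges):
    """Return variables mapped to `term` or to any of its subclasses.

    variable_mappings: {variable: [mapped term, ...]} (hypothetical shape).
    entailed_edges: iterable of (subject, object) pairs, where subject is a
        direct or indirect subclass of object (hypothetical shape).
    """
    # The query term plus every term entailed to be a subclass of it.
    matching_terms = {term} | {s for s, o in entailed_edges if o == term}
    return sorted(
        variable
        for variable, terms in variable_mappings.items()
        if matching_terms & set(terms)
    )


# Illustrative data only: identifiers are made up.
variable_mappings = {
    "VAR_A": ["EX:systolic"],
    "VAR_B": ["EX:body_weight"],
    "VAR_C": ["EX:blood_pressure"],
}
entailed_edges = [
    ("EX:systolic", "EX:blood_pressure"),
    ("EX:diastolic", "EX:blood_pressure"),
]
hits = search_variables("EX:blood_pressure", variable_mappings, entailed_edges)
print(hits)
```

A query for the broader term also returns the variable mapped to the narrower one, which a plain label match on the mappings alone would miss.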