Additional metadata needed for complete SPASE descriptions

cdawmeta-spase

Metadata needed, beyond that in CDF Masters and all.xml, for the generation of SPASE NumericalData descriptions for all CDAWeb datasets.

With this information and the spase_auto code in cdawmeta, SPASE NumericalData records that are correct and consistent (both internally and with the Masters) can be created for all CDAWeb datasets.

Completing the DOI.json and ResourceID.json files is the minimum requirement for having CDAWeb SPASE NumericalData records for all CDAWeb datasets.

Files in this repository are generated by report.py, which extracts (and corrects) certain attributes from existing SPASE records in the hpde.io git repository. See the log files in reports for additional details on their method of generation, errors and inconsistencies encountered, and counts of attributes.

  • AccessInformation.json - is a template for AccessInformation nodes. Not all content applies to each dataset, and spase_auto.py determines what parts to include based on information in Master CDFs.

  • DOI.json - contains a list of DOI/CDAWeb dataset ID pairs.

  • Epoch.json - contains metadata that is important for interpretation of parameters derived from data in a CDF with a time DataType.

  • InformationURL.json - contains keys of URLs found in InformationURL nodes and an array of associated CDAWeb dataset IDs. In some cases, a pattern (e.g., ^BAR_) is used when we are certain that the InformationURL/URL applies to all CDAWeb dataset IDs matching that pattern.

  • InstrumentID.json - contains SPASE InstrumentID/CDAWeb dataset ID pairs. There are ~20 CDAWeb datasets without SPASE InstrumentIDs.

  • ObservedRegion.json - Each key is the "spacecraft" part of a CDAWeb dataset ID (the part before the first _) and the values are all unique ObservedRegion values. There were many instances where different instruments on the same spacecraft had different ObservedRegion elements and we assumed that this was an error; the content of this file was created assuming all instruments were active at one time while the spacecraft was in each region. If this is not the case, extra nodes in the file can be added to indicate this.

  • ResourceID.json - is a list of SPASE ResourceID/CDAWeb dataset ID pairs. There are many CDAWeb dataset IDs without SPASE ResourceIDs. Also, the naming is inconsistent (e.g., SWEPAM data products are under both spase://VSPO and spase://NASA).

  • Rights.json - contains a placeholder template for anticipated additions to the SPASE metadata model to support FAIR.

  • Units.json - has an object with keys of UNIT strings found in CDF Masters that are not all whitespace. (We do not start with what was found in SPASE due to the mistranslation issues discussed in the Units section of the cdawmeta repository README.) If not null, values are the VOUnit equivalent. If null, a translation is needed. To find the variables that have a given unit string, search the CDAWeb variable table.

    For completion, the following is required:

    1. Determine the VOUnit representation of all unique units (~1000), when possible, in Units.json.

    2. Determine the VOUnit for all variables that have neither a UNITS nor a UNIT_PTR attribute, or whose unit value is all whitespace (~20,000). We label these as "missing"; see the log file. Although the ISTP conventions require units for variables with VAR_TYPE = data and support_data, ~20% of variables have "missing" UNITS.

    3. Validate that the determinations made for 1. and 2. are correct. This could be done in two ways: (a) have two people independently make the determinations, and (b) for case 1., use AstroPy to compute the SI conversion and compare it with the SI_{conversion,conv,CONVERSION} attribute (all three versions are found in CDF Masters and the ISTP convention documentation).
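The AstroPy comparison in step 3(b) could be sketched as follows. This is a minimal illustration, not code from this repository: the function names are hypothetical, and it assumes the SI conversion attribute has the form "factor>unit" (e.g., "1.0e-9>T"), which should be checked against the ISTP documentation and the values actually found in the Masters.

```python
import math


def parse_si_conversion(value):
    """Split a string such as "1.0e-9>T" into (factor, SI unit string).

    Assumes the "factor>unit" layout; other layouts are not handled.
    """
    factor, _, unit = value.partition(">")
    return float(factor), unit.strip()


def vounit_factor(vounit, si_unit):
    """Return the factor that converts 1 `vounit` into `si_unit` using AstroPy."""
    # Imported here so the parsing helper can be used without AstroPy installed.
    from astropy import units as u

    # The candidate translation is parsed strictly as a VOUnit string; the SI
    # unit is parsed with AstroPy's default (more permissive) format.
    return u.Unit(vounit, format="vounit").to(u.Unit(si_unit))


def is_consistent(vounit, si_conversion, factor_fn=vounit_factor, rel_tol=1e-6):
    """True if the VOUnit's SI factor agrees with the SI conversion attribute."""
    expected, si_unit = parse_si_conversion(si_conversion)
    return math.isclose(factor_fn(vounit, si_unit), expected, rel_tol=rel_tol)
```

With AstroPy installed, is_consistent("nT", "1.0e-9>T") would be expected to agree. AstroPy raises an exception when the two units are not convertible or the VOUnit string does not parse, and either case also flags a translation that needs review.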

    Finally, we think that the correct source of the updated units is not SPASE; it should be the CDF Masters, and SPASE records should draw this information from them. Many people use CDF Masters for metadata, and if the VOUnits existed only in SPASE, they would not have access to them. (For example, CDAWeb links to the Master file in its Metadata links, and Autoplot, HAPI, etc. use Master CDF metadata.)
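To make the file layouts in the list above more concrete, here is a rough sketch of how three of them might be consulted. It is an illustration under assumptions, not the logic used by spase_auto.py: the function names and sample URLs are made up, the in-memory dictionaries stand in for the parsed JSON files, and entries in InformationURL.json are assumed to be regular expressions when they begin with "^".

```python
import re


def spacecraft_key(dataset_id):
    """Return the part of a CDAWeb dataset ID before the first underscore.

    This is the key form described for ObservedRegion.json.
    """
    return dataset_id.split("_", 1)[0]


def information_urls(dataset_id, url_map):
    """Return the URLs whose entry list names dataset_id or a matching pattern.

    url_map mirrors the described InformationURL.json layout:
    {URL: [dataset ID or pattern, ...]}.
    """
    urls = []
    for url, entries in url_map.items():
        for entry in entries:
            # Assumption: a leading "^" marks a regular expression.
            if entry.startswith("^"):
                matched = re.search(entry, dataset_id) is not None
            else:
                matched = entry == dataset_id
            if matched:
                urls.append(url)
                break  # one match per URL is enough
    return urls


def untranslated_units(units_map):
    """Return unit strings whose VOUnit value is null (None after JSON parsing)."""
    return sorted(unit for unit, vounit in units_map.items() if vounit is None)


# Made-up sample content, for illustration only.
url_map = {
    "https://example.org/barrel": ["^BAR_"],
    "https://example.org/ace": ["AC_H0_SWE"],
}
units_map = {"nT": "nT", "#/cc": None}

print(spacecraft_key("AC_H0_SWE"))
print(information_urls("BAR_1A_L2_MAGN", url_map))
print(untranslated_units(units_map))
```

In practice the dictionaries would come from json.load on the corresponding files, and the real files may contain structure not covered by this sketch.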