Metadata needed beyond that in CDF Masters and `all.xml` for the generation of SPASE `NumericalData` descriptions for all CDAWeb datasets.

With this information and the `spase_auto` code in `cdawmeta`, SPASE `NumericalData` records that are correct and consistent (with the Masters and with themselves) can be created for all CDAWeb datasets.

Completing the `DOI.json` and `ResourceID.json` files is the minimum requirement for having CDAWeb SPASE `NumericalData` records for all CDAWeb datasets.

Files in this repository are generated by `report.py`, which extracts (and corrects) certain attributes from existing SPASE records in the `hpde.io` git repository. See the log files in `reports` for additional details on their method of generation, errors and inconsistencies encountered, and counts of attributes.
- `AccessInformation.json` - is a template for `AccessInformation` nodes. Not all content applies to each dataset, and `spase_auto.py` determines what parts to include based on information in Master CDFs.
- `DOI.json` - contains a list of DOI/CDAWeb dataset ID pairs.
- `Epoch.json` - contains metadata that is important for interpretation of parameters derived from data in a CDF with a time `DataType`.
- `InformationURL.json` - contains keys of URLs found in `InformationURL` nodes and an array of associated CDAWeb dataset IDs. In some cases, a pattern (e.g., `^BAR_`) is used when we are certain that the `InformationURL/URL` applies to all CDAWeb dataset IDs matching that pattern.
- `InstrumentID.json` - contains SPASE `InstrumentID`/CDAWeb dataset ID pairs. There are ~20 CDAWeb datasets without SPASE Instrument IDs.
- `ObservedRegion.json` - Each key is the "spacecraft" part of a CDAWeb dataset ID (the part before the first `_`) and the values are all unique `ObservedRegion` values. There were many instances where different instruments on the same spacecraft had different `ObservedRegion` elements and we assumed that this was an error; the content of this file was created assuming all instruments were active at one time while the spacecraft was in each region. If this is not the case, extra nodes in the file can be added to indicate this.
- `ResourceID.json` - is a list of SPASE `ResourceID`/CDAWeb dataset ID pairs. There are many CDAWeb dataset IDs without SPASE `ResourceID`s. Also, there is inconsistency in the naming (e.g., `SWEPAM` data products are under `spase://VSPO` and `spase://NASA`).
- `Rights.json` - contains a placeholder template for anticipated additions to the SPASE metadata model to support FAIR.
- `Units.json` - has an object with keys of `UNIT` strings found in CDF Masters that are not all whitespace. (We do not start with what was found in SPASE due to the mistranslation issues discussed in the `Units` section of the `cdawmeta` repository README.) If not `null`, values are the VOUnit equivalent. If `null`, a translation is needed. To see the variables that have a unit string, search the CDAWeb variable table.

For completion, the following is required:
1. Determine the VOUnit representation of all unique units (~1000), when possible, in `Units.json`.
2. Determine the VOUnit for all variables that do not have a `UNITS` or `UNIT_PTR` attribute or that have a unit value that is all whitespace (~20,000), which we label as "missing"; see the log file. Although the ISTP conventions require units for variables with `VAR_TYPE = data` and `support_data`, ~20% of variables have "missing" `UNITS`.
3. Validate that determinations made for 1. and 2. are correct. This could be done in two ways: (a) have two people independently make the determinations and (b) for case 1., use AstroPy to compute the SI conversion and compare with the `SI_{conversion,conv,CONVERSION}` attribute (all three versions are found in CDF Masters and the ISTP convention documentation).
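
The AstroPy comparison mentioned for case 1. could look roughly like the following. This is a minimal sketch, not the code used in `cdawmeta`. It assumes that the `SI_conversion` value has the form `factor>unit` (for example `1.0e-9>T`) and that the SI unit part can be parsed as a VOUnit string; values that use other syntax would need normalization first. The function names `parse_si_conversion`, `astropy_factor`, and `is_consistent` are hypothetical.

```python
import math
import re

# Assumed ISTP-style form "<factor>><SI unit>", e.g. "1.0e-9>T".
_SI_CONVERSION = re.compile(r"^\s*([^>\s]+)\s*>\s*(.*?)\s*$")


def parse_si_conversion(value):
    """Split an SI_conversion string into (factor, SI unit string).

    Returns None when the string does not have the assumed form.
    """
    match = _SI_CONVERSION.match(value)
    if match is None:
        return None
    try:
        factor = float(match.group(1))
    except ValueError:
        return None
    return factor, match.group(2)


def astropy_factor(vounit, si_unit):
    """Factor that converts `vounit` to `si_unit`, computed by AstroPy."""
    from astropy import units as u  # third-party; imported only when called

    source = u.Unit(vounit, format="vounit")
    target = u.Unit(si_unit, format="vounit")
    return source.to(target)


def is_consistent(vounit, si_conversion, factor_fn=astropy_factor, rel_tol=1e-9):
    """True if the proposed VOUnit agrees with the Master's SI_conversion."""
    parsed = parse_si_conversion(si_conversion)
    if parsed is None:
        return False
    factor, si_unit = parsed
    return math.isclose(factor_fn(vounit, si_unit), factor, rel_tol=rel_tol)
```

With AstroPy installed, `is_consistent("nT", "1.0e-9>T")` compares the factor AstroPy computes for nT to T against the factor stored in the Master. The `factor_fn` argument exists only so the comparison logic can be exercised without AstroPy.
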
Finally, we think that the correct source of the updated units is not SPASE; it should be the CDF Masters, and SPASE records should draw this information from the CDF Masters. Many people use CDF Masters for metadata, and if the VOUnits only existed in SPASE, they would not have access to them. (For example, CDAWeb links to the Master file in the Metadata links, and Autoplot, HAPI, etc. use Master CDF metadata.)
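
To make the role of `Units.json` concrete, the following is a minimal sketch of how a translation table of the shape described above (unit string keys, VOUnit or `null` values) might be applied to unit strings read from a Master. The function name `classify_units`, the labels other than "missing", and the sample entries are hypothetical; the real file contents and the lookup used by `spase_auto.py` may differ.

```python
import json
from collections import Counter


def classify_units(units, translations):
    """Return (status, vounit) for one UNITS value from a Master CDF.

    status is "missing" (no value or all whitespace), "untranslated"
    (no VOUnit recorded yet), or "translated".
    """
    if units is None or units.strip() == "":
        return "missing", None
    vounit = translations.get(units)
    if vounit is None:
        # Key absent, or present with a null value: translation still needed.
        return "untranslated", None
    return "translated", vounit


# Hypothetical excerpt in the shape described for Units.json.
translations = json.loads('{"nT": "nT", "#/cc": null}')

# Hypothetical UNITS values; None stands for a variable with no UNITS attribute.
samples = ["nT", "#/cc", "   ", None]
counts = Counter(classify_units(s, translations)[0] for s in samples)
```

Counting the statuses over all variables in this way gives the kind of tallies (unique units still needing translation, variables with "missing" units) that the two completion steps refer to.
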