This repository is no longer in use; see https://github.com/rweigel/cdaweb-metadata.
See also https://github.com/spase-group/adapt/tree/main/CDAWEB
This repository was developed to improve the HAPI metadata served at https://cdaweb.gsfc.nasa.gov/hapi. Metadata-related code is discussed in Section 2. Output files are available at mag.gmu.edu.
This repository also contains a discussion of alternative approaches for using CDAWeb to serve HAPI data streams. Data-related code is discussed in Section 3.
Three developers have written software to produce HAPI metadata using different methods. Each approach has limitations, and the metadata each produces differs. This repository can be used to compare the metadata generated by each method.
This repository contains

- scripts for a fourth method that uses the CDAS REST service. The script that generates metadata is `CDAS2HAPIinfo.js` (~1000 lines). The output files generated by this script are placed locally in `hapi/bw`, and the input files used to create them are stored locally in `cache/bw`. The output files are also available at http://mag.gmu.edu/git-data/cdaweb-hapi-metadata/hapi/bw.
- a script, `compare/compare-meta.js`, that compares the metadata results. The file `compare/meta/compare-meta.json` contains the content from the four `all` files, with keys that indicate the method. The keys are `bw`, `nl`, `bh`, and `jf`, which are the initials of the person who developed the software that generates the HAPI metadata. See below for additional details.
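Based on this description, one plausible (hypothetical) layout of `compare/meta/compare-meta.json` is shown below, with the method initials as top-level keys; the dataset id and values are illustrative only:

```json
{
  "bw": [{"id": "AC_H0_MFI", "info": {"startDate": "1997-09-02T00:00:12Z"}}],
  "nl": [{"id": "AC_H0_MFI", "info": {"startDate": "1997-09-02T00:00:12Z"}}],
  "bh": [{"id": "AC_H0_MFI", "info": {}}],
  "jf": [{"id": "AC_H0_MFI", "info": {}}]
}
```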
The four methods that produce HAPI metadata are
- `bw`, which uses the new code in this repository to generate HAPI metadata (stored at mag.gmu.edu). The steps, sketched in code after this list, are:

  1. Extract dataset ids and their start and stop dates from https://spdf.gsfc.nasa.gov/pub/catalogs/all.xml.
  2. For each dataset, make a `/variables` request to https://cdaweb.gsfc.nasa.gov/WebServices/REST/ to get the list of variables in the dataset, which is needed for the next step.
  3. Make a `/data` request to https://cdaweb.gsfc.nasa.gov/WebServices/REST/ to obtain a sample data file. The time range used is that of the last file returned from a `/orig_data` request for all files over the time range given in `all.xml`.
  4. After step 3, all of the metadata needed to form a HAPI response is available. The final step is to generate HAPI `/info` responses for each dataset. There is one complication: HAPI mandates that all parameters in a dataset have the same time tags, but some CDAWeb datasets contain variables with different time tags. So, prior to creating `/info` responses, new HAPI datasets are formed. These new datasets have ids that match the CDAWeb ids but with "@0", "@1", ... appended, where the number indicates the index of the time-tag variable in the original dataset.

  The initial generation of the HAPI `all.json` file using `CDAS2HAPIinfo.js` can take up to 30 minutes, which is similar to the update time required daily by the `nl` server. In contrast, subsequent updates using `CDAS2HAPIinfo.js` take less than a second; on a daily basis, only the `startDate` and `stopDate` must be updated, which requires reading `all.xml` and updating `all-info.json`. When CDAWeb adds datasets or a master CDF changes, the process outlined above is only required for those datasets; it typically takes less than 10 seconds per dataset.
- `nl`, which uses an approach similar to the above for datasets with virtual variables and an approach similar to `jf` below otherwise. Output files are available at mag.gmu.edu. This code is used for the production CDAWeb HAPI server.

  The production HAPI server has many datasets for which the metadata or data responses are not valid; it appears the HAPI verifier was never run on all datasets. I have found that when randomly selecting datasets and parameters at https://hapi-server.org/servers, one frequently encounters issues.

  The production HAPI server also becomes unresponsive at 9 am daily due to a similar update that appears to block the main thread. However, in general, a full update is only needed when content other than the `startDate` and `stopDate` changes.
- `bh`, which uses SPASE records. Output files are available at mag.gmu.edu. This server is a prototype, and it serves only CDAWeb datasets for which a SPASE record is available.
- `jf`, which uses master CDFs, raw CDF files, and code from Autoplot. Output files are available at mag.gmu.edu. The code that produces HAPI metadata is also a prototype and is not intended for production use.
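To make the `bw` steps above concrete, below is a minimal sketch of steps 2 and 4; it is not the actual `CDAS2HAPIinfo.js`. The CDAS REST path and response shape are assumptions based on the endpoint names above, and the `depend0` map is assumed to have been read from the sample CDF obtained in step 3.

```javascript
// Minimal sketch, not the actual CDAS2HAPIinfo.js. Assumes Node.js 18+
// (global fetch); the REST path and response shape are assumptions.
const BASE = "https://cdaweb.gsfc.nasa.gov/WebServices/REST/1/dataviews/sp_phys";

// Step 2 (sketch): get the list of variables in a dataset.
async function variableNames(datasetID) {
  const res = await fetch(`${BASE}/datasets/${datasetID}/variables`,
                          { headers: { Accept: "application/json" } });
  const body = await res.json();
  return body.VariableDescription.map((v) => v.Name); // assumed schema
}

// Step 4 (sketch): split a CDAWeb dataset into one HAPI dataset per time-tag
// variable. depend0 maps each variable name to its DEPEND_0 (time) variable.
function splitByTimeTags(datasetID, depend0) {
  const groups = new Map(); // time variable -> data variables
  for (const [name, timeVar] of Object.entries(depend0)) {
    if (!groups.has(timeVar)) groups.set(timeVar, []);
    groups.get(timeVar).push(name);
  }
  return [...groups.entries()].map(([timeVar, names], n) => ({
    id: `${datasetID}@${n}`, // "@N" = index of the time-tag variable
    timeVariable: timeVar,
    parameters: names,
  }));
}

// Hypothetical example: two time-tag variables yield two HAPI datasets.
// splitByTimeTags("AC_H2_ULE", { flux: "Epoch", rate: "Epoch_2" })
//   -> [{ id: "AC_H2_ULE@0", ... }, { id: "AC_H2_ULE@1", ... }]
```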
This repository also contains code to compare HAPI CSV generated using N different methods. The script `bin/HAPIdata.js` is a wrapper that can be used to generate HAPI CSV using the N methods, given a HAPI dataset, parameter(s), and start and stop times.
The methods are

- `cdaweb-csv-using-js`
- `cdaweb-cdf-using-pycdf`
- `cdas-text-using-js`
- `cdas-cdf-using-pycdf`
- `cdas-cdf-using-pycdas`
- `nl-hapi`
- `bh-hapi`
- `apds`
Other options include

- `text-raw`
- `text-noheader`
- `nl-hapi`
- `cdas-cdf-using-pycdf`
This script, with pre-set choices of HAPI request inputs, can be executed using

```
cd compare
make compare-data1
make compare-data2
make compare-data3
```

Requires Node.js.
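As a rough illustration of what these data comparisons amount to (this is not the logic of `bin/HAPIdata.js`), the sketch below requests the same HAPI CSV from two servers and reports the first differing line. The second server URL and the dataset, parameter, and time values are placeholders.

```javascript
// Sketch only; assumes Node.js 18+ (global fetch). The second server URL and
// the dataset/parameter/time values below are placeholders.
async function hapiCSV(server, id, parameters, start, stop) {
  const url = `${server}/data?id=${id}&parameters=${parameters}` +
              `&time.min=${start}&time.max=${stop}&format=csv`;
  return (await (await fetch(url)).text()).split("\n");
}

async function compareCSV() {
  const args = ["AC_H0_MFI", "Magnitude",
                "1998-01-01T00:00:00Z", "1998-01-02T00:00:00Z"];
  const a = await hapiCSV("https://cdaweb.gsfc.nasa.gov/hapi", ...args);
  const b = await hapiCSV("https://example.org/hapi", ...args); // placeholder
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) {
      console.log(`First difference at line ${i + 1}:`);
      console.log(`  A: ${a[i]}`);
      console.log(`  B: ${b[i]}`);
      return;
    }
  }
  console.log("CSV responses are identical.");
}

compareCSV();
```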
For example,

```
node CDAS2HAPIinfo.js --keepids '^AC_'
```

generates `/info` responses for all datasets with ids starting with `AC_`. The data comparison can be restricted to the same datasets with

```
cd compare; make compare-data1 IDREGEX='^AC_'
```

and all comparison targets can be run with `make all`.
In reference to the four metadata methods above, to create metadata for all CDAWeb ids that start with `AC_`, use

- `node CDAS2HAPIinfo.js --keepids '^AC_'`, which creates HAPI metadata that is written to `hapi/bw`.
- `node HAPI2HAPIinfo.js --version 'nl' --keepids '^AC_'`, which creates `hapi/nl/all.json` and `hapi/nl/info/`.
- `node HAPI2HAPIinfo.js --version 'bh' --keepids '^AC_'`, which creates `hapi/bh/all.json`, which contains all of the info responses placed in `hapi/bh/info/`.
- `node HAPI2HAPIinfo.js --version 'jf' --keepids '^AC_'`, which creates `hapi/jf/all.json`, which contains all of the info responses placed in `hapi/jf/info/`.
After these files are created, the program `compare-meta.js` can be executed to generate the file `compare/compare-meta.json`, which shows the metadata created by the four approaches in a single file.
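For example (assuming the same invocation pattern as the other scripts in this repository):

```
cd compare
node compare-meta.js
```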