NASA CMR STAC

NASA's Common Metadata Repository (CMR) is a metadata catalog of NASA Earth Science data. STAC, or SpatioTemporal Asset Catalog, is a specification for describing geospatial data with JSON and GeoJSON. The related STAC-API specification defines an API for searching and browsing STAC catalogs.

CMR-STAC acts as a proxy between the CMR repository and STAC API queries. The goal is to expose CMR's vast collections of geospatial data as a STAC-compliant API. Although the core metadata remains the same, the proxy lets users work with the growing ecosystem of STAC software. Under the hood, STAC API queries are translated into CMR queries and sent to CMR, and the responses are translated into STAC Collections and Items. This translation happens dynamically at runtime, so responses always reflect the data currently stored in CMR. If a data provider deletes data from CMR, the deletion is reflected in CMR-STAC immediately.
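
The query translation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `StacSearch` and `CmrQuery` types and the `stacToCmrQuery` function are hypothetical, and the CMR parameter names (`bounding_box`, `temporal`, `page_size`) are assumed here for the sake of the example.

```typescript
// Hypothetical subset of STAC API search parameters.
interface StacSearch {
  bbox?: [number, number, number, number]; // [west, south, east, north]
  datetime?: string; // single instant, or "start/end" where ".." marks an open end
  limit?: number;
}

// Hypothetical subset of CMR query parameters (names are assumptions).
interface CmrQuery {
  bounding_box?: string;
  temporal?: string;
  page_size?: string;
}

// Translate STAC-style search parameters into CMR-style query parameters.
function stacToCmrQuery(search: StacSearch): CmrQuery {
  const query: CmrQuery = {};
  if (search.bbox) {
    // Both sides use west,south,east,north order in this sketch.
    query.bounding_box = search.bbox.join(",");
  }
  if (search.datetime) {
    const parts = search.datetime.split("/");
    const start = parts[0] ?? "";
    // A single instant is treated as an interval with identical ends.
    const end = parts.length > 1 ? (parts[1] ?? "") : start;
    // STAC marks an open end with ".."; here it becomes an empty string.
    const open = (s: string): string => (s === ".." ? "" : s);
    query.temporal = `${open(start)},${open(end)}`;
  }
  if (search.limit !== undefined) {
    query.page_size = String(search.limit);
  }
  return query;
}
```

For example, a search with `datetime: "2020-01-01T00:00:00Z/.."` and `limit: 5` would yield a `temporal` value with an empty end and a `page_size` of `"5"`. The reverse direction, turning CMR responses into STAC Items, follows the same pattern of mapping field by field.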

CMR-STAC follows the STAC API 1.0.0-beta.1 specification; see the OpenAPI documentation for details.

Usage

Most users will be interested in the deployed versions of CMR-STAC:

  • CMR-STAC: The entire catalog of NASA CMR data, organized by provider.
  • CMR-CLOUDSTAC: Also organized by provider, this API contains only STAC Collections whose Item Assets are available "in the cloud" (i.e., on S3).

See the Usage documentation for how to use available STAC software to browse and use the API.

Development

CMR-STAC is written in NodeJS using the Express.js framework and deployed as an AWS serverless application using API Gateway + Lambda.

The remainder of this README is documentation for developing, testing, and deploying CMR-STAC. See the Usage documentation if you are interested in using the CMR-STAC API.

Repository Structure

Directory Description
bin Scripts used for installation and setup
docs Documentation on usage of the CMR-STAC endpoint(s)
search The CMR-STAC application
search/docs Contains the combined specification document built from the STAC and WFS3 specification documents. Paths and component schemas are defined here. The generated STAC documentation file is also located in this directory.
search/lib Contains the main logic of the application, broken down into modules by area of responsibility. A summary of those modules can be found below.
search/tests Contains all of the unit tests for the application, with one directory for each subdirectory of lib. This documentation does not include examples of how the modules work; however, the tests are written so that they show how each function or module is used.
scripts Utility (Python) scripts for validating and crawling CMR-STAC

lib modules

  • api/: Houses the API routing logic for the application.
  • application.js: The main Express application.
  • cmr.js: Contains logic to query CMR, including searching for collections and granules, getting collections and granules, and building CMR search URLs.
  • convert/: Functions that convert CMR data fields into their corresponding STAC/WFS3 fields.
  • settings.js: Contains settings and controls fetching of settings from environment variables.
  • stac/: Contains utility functions used in creating the STAC API endpoints and the links between endpoints. This includes logic to dynamically create or display catalogs during a search.
  • util/: Houses utility functions used throughout the application, such as for building URLs.
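
To give a feel for what a conversion function in convert/ might do, here is a minimal sketch of turning a CMR granule into a STAC Item. It is not taken from the repository: the `CmrGranule` shape, its field names, and the `granuleToItem` function are hypothetical and heavily simplified, and the south/west/north/east ordering of the input box is an assumption.

```typescript
// Hypothetical, simplified granule record (real CMR responses carry more fields).
interface CmrGranule {
  id: string;
  collectionId: string;
  timeStart: string;
  box: [number, number, number, number]; // assumed order: south, west, north, east
  dataUrl: string;
}

// Minimal STAC Item shape used by this sketch.
interface StacItem {
  type: "Feature";
  stac_version: string;
  id: string;
  collection: string;
  bbox: [number, number, number, number]; // west, south, east, north
  geometry: { type: "Polygon"; coordinates: number[][][] };
  properties: { datetime: string };
  assets: { [key: string]: { href: string } };
  links: { rel: string; href: string }[];
}

// Map granule fields onto their STAC counterparts.
function granuleToItem(granule: CmrGranule, stacVersion: string): StacItem {
  const [south, west, north, east] = granule.box;
  return {
    type: "Feature",
    stac_version: stacVersion,
    id: granule.id,
    collection: granule.collectionId,
    // STAC (GeoJSON) orders a bbox as west, south, east, north.
    bbox: [west, south, east, north],
    geometry: {
      type: "Polygon",
      // Closed ring, counterclockwise, starting at the south-west corner.
      coordinates: [[
        [west, south], [east, south], [east, north], [west, north], [west, south],
      ]],
    },
    properties: { datetime: granule.timeStart },
    assets: { data: { href: granule.dataUrl } },
    links: [],
  };
}
```

The unit tests in search/tests remain the authoritative examples of how the real conversion functions behave.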

Setup

Use nvm (recommended for managing NodeJS versions) to select the NodeJS version required by CMR-STAC, which is specified in .nvmrc:

nvm use

Then install dependencies with npm:

npm install

To run the CMR-STAC server locally:

npm start

This runs the process in the current terminal session. The local server will be available at:

http://localhost:3000/dev/stac
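
Once the local server is running, a search request URL can be assembled along these lines. This is only a sketch: the `buildSearchUrl` helper is hypothetical, the `/{provider}/search` path layout is an assumption based on the catalog being organized by provider, and `EXAMPLE` is a placeholder provider ID.

```typescript
// Build a STAC search URL for one provider under the given base URL.
// The path layout (<base>/<provider>/search) is an assumption.
function buildSearchUrl(
  baseUrl: string,
  provider: string,
  params: { [key: string]: string | number }
): string {
  // Sort keys so the resulting URL is deterministic.
  const query = Object.keys(params)
    .sort()
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(params[key]))}`)
    .join("&");
  // Drop trailing slashes so the joined path has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  const path = `${root}/${encodeURIComponent(provider)}/search`;
  return query ? `${path}?${query}` : path;
}

// Example: search a placeholder provider on the local development server.
const localSearchUrl = buildSearchUrl("http://localhost:3000/dev/stac", "EXAMPLE", {
  bbox: "-10,20,10,40",
  limit: 2,
});
```

The resulting string can be passed to any HTTP client or pasted into a browser to inspect the response.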

Deploying

The deployment is handled via the Serverless Framework. Each service has a separate configuration file (serverless.yml).

You will need to set up AWS credentials for the account to which the application is being deployed. The account requires the following permissions:

  • manage CloudFormation
  • manage S3 buckets
  • manage Lambda functions
  • manage API Gateway

The serverless.yml file for the search function includes several environment variables. They have default values, but when deploying, set them appropriately for the target environment (e.g., SIT, UAT, or PROD):

  • LOG_LEVEL: info
  • LOG_DISABLED: false
  • STAC_BASE_URL: http://localhost:3000
  • STAC_VERSION: 1.0.0
  • STAGE: ${self:provider.stage}

STAGE is the AWS API Gateway stage to which the application is deployed. By default, this environment variable references the stage setting in the serverless.yml file.
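
On the application side, reading these variables with fallbacks could look something like the sketch below. It is not the contents of settings.js: the `Settings` shape and `loadSettings` function are hypothetical, and the `dev` fallback for STAGE is an assumption based on the local server URL.

```typescript
// Hypothetical settings object built from environment variables.
interface Settings {
  logLevel: string;
  logDisabled: boolean;
  stacBaseUrl: string;
  stacVersion: string;
  stage: string;
}

// A plain map of environment variables, e.g. process.env in NodeJS.
type Env = { [key: string]: string | undefined };

// Read each variable, falling back to the default value when it is unset.
function loadSettings(env: Env): Settings {
  return {
    logLevel: env["LOG_LEVEL"] ?? "info",
    // Environment variables are strings, so parse the flag explicitly.
    logDisabled: (env["LOG_DISABLED"] ?? "false").toLowerCase() === "true",
    stacBaseUrl: env["STAC_BASE_URL"] ?? "http://localhost:3000",
    stacVersion: env["STAC_VERSION"] ?? "1.0.0",
    stage: env["STAGE"] ?? "dev", // assumed fallback
  };
}
```

In a NodeJS process this would typically be called as `loadSettings(process.env)`. Passing the environment in as an argument, rather than reading it globally, keeps the function easy to unit test.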

Use the npm script deploy to deploy the CMR-STAC application to AWS:

cd search
npm run deploy

This will use the default AWS credentials on the system to deploy. If using profiles, use the aws-profile switch:

npm run deploy -- --aws-profile <profile-name>

To override the environment variables, specify them on the command line:

npm run deploy -- --stage <sit|uat|prod> --cmr-search-host <cmr-search-host> --cmr-search-protocol <http|https>

License

CMR-STAC is published under the Apache License, Version 2.0. See LICENSE.txt.