Benchmarking Common Data Library Approaches

The SCINet Geospatial Common Data Library (GeoCDL) effort is focused on establishing a centralized library of commonly used geospatial data (climate, Earth observation (EO) products such as Landsat, Sentinel, and MODIS, land-use change, etc.) in association with the USDA ARS SCINet initiative. The objective is to make this library accessible from USDA ARS HPC systems and to provide high-performance, high-throughput data delivery to computing infrastructure.

If you are involved with this effort, please feel free to raise questions in the Discussions section or, if there is an error or issue with the code, in the Issues section.


Currently the focus of this effort is on benchmarking different approaches to building, hosting, and accessing this library. This work can be found under ./throughput_benchmarks.


If there are small changes or suggestions, please feel free to raise them in the Issues or Discussions sections. For more substantial contributions, the following Git/GitHub workflow is suggested:

  1. Fork this repository to your own GitHub account
  2. Clone the forked repository (typically on Ceres/Atlas/etc...)
  3. Make desired changes / additions
  4. Stage / Commit / Push changes back to your forked repository
  5. Open a pull request against this repository to incorporate your changes / additions
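
Steps 2 through 4 above can be sketched on the command line roughly as follows. This is a minimal illustration, not a prescribed procedure: the helper names `clone_fork` and `publish_changes`, the fork URL, and the directory name are all hypothetical placeholders. Steps 1 and 5 (forking and opening the pull request) are done in the GitHub web interface and are not shown.

```shell
# Hypothetical helpers illustrating steps 2-4 of the workflow above.
# Nothing runs until the functions are called.

# Step 2: clone your fork.
#   $1 = URL of your fork (placeholder, supply your own)
#   $2 = local directory to clone into
clone_fork() {
    git clone "$1" "$2"
}

# Step 4: stage, commit, and push everything changed in the clone.
#   $1 = local clone directory
#   $2 = commit message
publish_changes() {
    git -C "$1" add --all &&
    git -C "$1" commit -m "$2" &&
    git -C "$1" push origin HEAD   # push the current branch to your fork
}

# Example usage (placeholders in angle brackets are yours to fill in):
#   clone_fork "https://github.com/<your-username>/<repository>.git" my-fork
#   ... step 3: edit files inside my-fork ...
#   publish_changes my-fork "Describe the change"
```

Once the push succeeds, the changes are on your fork and can be proposed to this repository through a pull request (step 5).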