List of significant datasets from the U.S. Department of Energy, categorized by the Offices, National Laboratories, and/or functionality.
For the first Energy Open Data Roundtable on April 29, 2015, the Center for Open Data Enterprise created a Microsoft Word document of DOE datasets relevant to the conference discussion.
In addition to providing metadata about the datasets as a Word document, we felt that it was important to also deliver it as machine-readable data in both CSV and JSON formats.
Thus, this project was formed.
We then realized that if the data is in a machine-readable format, there is no need to maintain a human-readable version in parallel. Instead, we can use the latest JSON data to render our index page dynamically. This way, the human-readable form never gets out of sync with the machine-readable form.
If you wish to contribute to the categories and datasets described in this repo, it's useful to know how the categories and datasets relate to each other. It's also useful to know where the data is stored and managed so that changes propagate to both the machine-readable and human-readable formats.
The data model consists of two "tables": categories and datasets. The categories table has these fields:
Name | Type | Required | Example |
---|---|---|---|
id | string | yes | doe-explorer |
name | string | yes | DOE Explorer |
url | string | no | http://www.osti.gov/dataexplorer/ |
description | string | no, but recommended | This portal, launched in 2013 by DOE’s Office of Science, provides science, technology, and engineering research and data collections from DOE. |
The datasets table has these fields:
Name | Type | Required | Example |
---|---|---|---|
category_id | string | yes, relates to id in categories table | doe-explorer |
name | string | yes | DOE Global Energy Storage Database |
url | string | yes | http://www.osti.gov/dataexplorer/biblio/1134061 |
description | string | no, but recommended | The DOE International Energy Storage Database has more than 400 documented energy storage projects from 34 countries around the world. The database provides free, up-to-date information on grid-connected energy storage projects and relevant state and federal policies. |
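The relationship between the two tables can be sketched in Ruby as follows. This is only an illustration that assumes each row is loaded as a hash keyed by the field names above. The helper names `nest_datasets` and `orphaned_datasets` are hypothetical and are not part of this repo.

```ruby
# Attach each dataset to the category whose id matches its category_id.
# Categories without datasets get an empty list.
def nest_datasets(categories, datasets)
  by_category = datasets.group_by { |d| d["category_id"] }
  categories.map { |c| c.merge("datasets" => by_category.fetch(c["id"], [])) }
end

# Return datasets whose category_id does not match any category id.
# Useful as a sanity check before publishing changes.
def orphaned_datasets(categories, datasets)
  ids = categories.map { |c| c["id"] }
  datasets.reject { |d| ids.include?(d["category_id"]) }
end

# Sample rows taken from the Example columns of the tables above.
categories = [
  { "id" => "doe-explorer", "name" => "DOE Explorer",
    "url" => "http://www.osti.gov/dataexplorer/" }
]
datasets = [
  { "category_id" => "doe-explorer",
    "name" => "DOE Global Energy Storage Database",
    "url" => "http://www.osti.gov/dataexplorer/biblio/1134061" }
]

nested = nest_datasets(categories, datasets)
puts nested.first["datasets"].map { |d| d["name"] }
```

A dataset row with a category_id that matches no category would show up in `orphaned_datasets`, which is one way to catch a mistyped ID early.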
The IDs are constructed by lowercasing all the alphabetic characters and converting each run of non-alphabetic characters into one hyphen. For example, "DOE Explorer" becomes doe-explorer.
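A minimal Ruby sketch of this ID rule, assuming ASCII names and assuming that a run of several non-alphabetic characters collapses into a single hyphen. The function name `slugify` is hypothetical. The rule above does not mention trimming leading or trailing hyphens, so the sketch leaves them in place.

```ruby
# Build an ID from a display name: lowercase the letters, then replace
# every run of characters other than a-z with a single hyphen.
# Assumption: names are ASCII; digits and punctuation count as non-alphabetic.
def slugify(name)
  name.downcase.gsub(/[^a-z]+/, "-")
end

puts slugify("DOE Explorer")  # => doe-explorer
```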
The categories and datasets tables are stored in a Google Spreadsheet and are exported to CSV files in data/ with the command-line utilities bin/categories_csv.sh and bin/datasets_csv.sh.
Once you have valid CSV files, you can convert them into the JSON that drives index.html by using the command-line utility bin/csv_to_json.rb. This utility reads either piped input on STDIN or one or more filenames given as arguments. It writes its output to STDOUT, which should be redirected to a file.
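The general shape of such a conversion can be sketched with Ruby's standard csv and json libraries. This is not the code of bin/csv_to_json.rb; the function name `csv_to_json` is hypothetical, and the sketch assumes the first CSV row holds the field names and that the output is a JSON array with one object per row.

```ruby
require "csv"
require "json"

# Convert CSV text whose first row is a header into a JSON array of objects.
# Empty CSV fields become null in the JSON output.
def csv_to_json(csv_text)
  rows = CSV.parse(csv_text, headers: true).map(&:to_h)
  JSON.pretty_generate(rows)
end

# Sample input using the example category from the data model.
sample = <<~CSV
  id,name,url
  doe-explorer,DOE Explorer,http://www.osti.gov/dataexplorer/
CSV

puts csv_to_json(sample)
```

To mirror the STDIN-or-filenames behavior described above, a script built on this sketch could pass `ARGF.read` to the function, since Ruby's `ARGF` reads from the files named as arguments and falls back to STDIN when none are given.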
bin/update_json_files.sh is a convenience script that performs the following in one shot:
- downloads categories and datasets from the Google Spreadsheet
- saves them as CSVs
- converts and saves them as JSONs
If you inspect index.html and js/main.js, you will notice that the HTML file contains no content, and that the JavaScript loads the data from the JSON files and renders it dynamically.
The rendering is performed by the Handlebars.js templating engine. In index.html you can see the template inside the script tag with ID category-template and type text/x-handlebars-template.