Primary language: Python. License: MIT.

OpenAQ Toolkit

Collection of user guides, tools, and links to resources for working with OpenAQ data.

Resources

User Guides

Tools

Links

  • OpenAQ on AWS - Information about OpenAQ's publicly available S3 bucket and SNS topic.

Download OpenAQ archive data from S3 using awscli

[Image: openaq-fetches bucket in S3 Explorer]

OpenAQ stores metric data in an S3 bucket that is publicly available. One way to download from the archive is with the aws s3 command.

Prerequisites: You need a free AWS account, and awscli must be installed and configured.

Download a single file:

aws s3 cp s3://openaq-fetches/realtime-gzipped/2020-06-06/1591476667.ndjson .

Download all files for one day:

aws s3 cp s3://openaq-fetches/realtime-gzipped/2020-06-06/ . --recursive

You can go up one level in the path and download the entire archive if you wish.

If you prefer not to use awscli, take a look at barronh/scrapenaq, a tool that takes a scraping approach instead.

How big is the OpenAQ S3 bucket?

aws s3 ls --summarize --human-readable --recursive s3://openaq-fetches

As of June 2020, it's 323 GB.

Convert ndjson to InfluxDB line protocol format

The archive files in the S3 bucket are formatted as ndjson, or newline-delimited JSON. That means each line of a file is a separate, complete JSON object.
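To make the format concrete, here is a minimal sketch of reading ndjson with only the Python standard library. The helper name iter_ndjson and the two sample records are made up for illustration; real archive records carry more fields.

```python
import io
import json


def iter_ndjson(lines):
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Two made-up records standing in for a downloaded archive file.
sample = io.StringIO(
    '{"parameter": "pm25", "value": 12.5}\n'
    '{"parameter": "o3", "value": 0.031}\n'
)
records = list(iter_ndjson(sample))
print(len(records), records[0]["parameter"])
```

Because every line parses on its own, a file can be processed as a stream without loading it fully into memory. Passing an open file object instead of the StringIO sample works the same way.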

To convert these files to InfluxDB's line protocol, use the ndjson2lineprotocol.py script found in this repo.

cat *.ndjson | ./ndjson2lineprotocol.py

The script outputs to standard output, so you may want to redirect it to a file.
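For readers unfamiliar with line protocol, the sketch below shows roughly what such a conversion involves. It is not the code of ndjson2lineprotocol.py, which may work differently. The measurement name "openaq", the choice of tags, and the record keys (parameter, location, value, date.utc) are assumptions about the data layout, so adjust them to match your files.

```python
from datetime import datetime, timezone


def escape_tag(text):
    """Backslash-escape commas, equals signs, and spaces in a tag value."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(record, measurement="openaq"):
    """Build one line: measurement,tags fields timestamp (nanoseconds)."""
    # Assumed record layout; these keys are illustrative, not guaranteed.
    tags = {
        "parameter": record["parameter"],
        "location": record["location"],
    }
    tag_part = ",".join(
        f"{key}={escape_tag(str(val))}" for key, val in sorted(tags.items())
    )
    when = datetime.strptime(
        record["date"]["utc"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    # Whole seconds only; sub-second precision is dropped in this sketch.
    nanos = int(when.timestamp()) * 1_000_000_000
    return f"{measurement},{tag_part} value={float(record['value'])} {nanos}"


# A made-up record used only to exercise the function.
example = {
    "parameter": "pm25",
    "location": "Example Station",
    "value": 12.5,
    "date": {"utc": "2020-06-06T20:00:00.000Z"},
}
line = to_line(example)
print(line)
```

Sorting the tags by key keeps the output deterministic, which also matches InfluxDB's recommendation for write performance.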

Convert CSV to InfluxDB line protocol format

In addition to the S3 option, you can filter and download data as CSV from the openaq.org website.

[Image: openaq.org's CSV download page]

After downloading the CSV, feed the file to csv2lineprotocol.py like so:

cat openaq.csv | ./csv2lineprotocol.py
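If you want to inspect the CSV before converting it, the following sketch reads rows with the standard csv module. The column names (location, parameter, value, unit) are assumptions for illustration; check the header row of your own download, and note that csv2lineprotocol.py may read the file differently.

```python
import csv
import io


def read_measurements(csv_file):
    """Yield (parameter, value) pairs from a CSV file with a header row."""
    for row in csv.DictReader(csv_file):
        # CSV values arrive as strings, so convert the reading to a float.
        yield row["parameter"], float(row["value"])


# Made-up rows standing in for a file downloaded from openaq.org.
sample = io.StringIO(
    "location,parameter,value,unit\n"
    "Example Station,pm25,12.5,ug/m3\n"
    "Example Station,o3,0.031,ppm\n"
)
pairs = list(read_measurements(sample))
print(pairs)
```

To read from a pipe as in the command above, pass sys.stdin in place of the StringIO sample.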

Contributing

Is something missing or in need of fixing? Please use the issues page to submit requests and ask questions. You can also create a pull request with your changes.