CityofEdmonton/IFTTT-Edmonton

Add a controller for getting open data sets dynamically

Closed this issue · 4 comments

We have quite a bit of work related to open data and IFTTT popping up. We need a controller that pulls a list of open data sets. Look here to see how we dynamically pull the list of cities that have air quality stations.

This would involve dynamically pulling this set of data. Check out this SO post for more information.

We also need to let users set which column in a dataset they want to watch, so we need a second controller that takes the data set identifier and returns a list of columns for that dataset.

Check out this post on getting data types.
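As a rough sketch of what the column controller could return, the snippet below pulls column names and types out of a dataset's metadata. It assumes the metadata looks like what Socrata serves from `/api/views/<dataset id>.json`, i.e. a `columns` array whose entries carry `fieldName` and `dataTypeName`. The function name and the sample payload are made up for illustration.

```python
from typing import Any


def extract_columns(metadata: dict[str, Any]) -> list[dict[str, str]]:
    """Return [{'field': ..., 'type': ...}] for each column in the metadata.

    Assumes a Socrata-style metadata dict with a 'columns' list; entries
    missing a field name are skipped instead of raising.
    """
    columns = []
    for col in metadata.get("columns", []):
        field = col.get("fieldName")
        if not field:
            continue
        columns.append({"field": field, "type": col.get("dataTypeName", "unknown")})
    return columns


# Illustrative payload only, not a real Edmonton dataset.
sample = {
    "name": "Example Dataset",
    "columns": [
        {"name": "Station", "fieldName": "station", "dataTypeName": "text"},
        {"name": "Reading", "fieldName": "reading", "dataTypeName": "number"},
    ],
}
print(extract_columns(sample))
```

Returning the type alongside the name would let the trigger UI offer only columns whose type makes sense to watch.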

In terms of pulling the columns from open data, we have ~1000 open data sets, so we will need to make about 1000 calls to pull all that info. Luckily for us, we have Redis!

To start, hit each data set one at a time. Use a Redis key of the form opendata/sets/"hashed data set name here". Collect all the columns and rpush them onto a list stored at that key, then set an expiry time on the key.
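The caching step could look something like the sketch below. The function name, the choice of SHA-1 for the hash, and the one-day expiry are assumptions, not decisions made in this issue. `client` is expected to behave like a `redis.Redis` instance from redis-py (only `delete`, `rpush`, and `expire` are used).

```python
import hashlib

ONE_DAY_SECONDS = 24 * 60 * 60  # assumed expiry; tune as needed


def cache_columns(client, dataset_name, columns, ttl_seconds=ONE_DAY_SECONDS):
    """Store a dataset's column names in a Redis list and return the key used."""
    # Hash the name so the key stays short and free of odd characters.
    digest = hashlib.sha1(dataset_name.encode("utf-8")).hexdigest()
    key = f"opendata/sets/{digest}"

    # Drop any stale list first so repeated refreshes don't append duplicates.
    client.delete(key)
    if columns:
        client.rpush(key, *columns)
        client.expire(key, ttl_seconds)
    return key
```

Reading the list back would then be an `lrange(key, 0, -1)` on the same key, and a missing key signals that the dataset needs to be fetched again.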

Assuming ~1000 datasets, 15 columns per dataset, 20-character column names, and 2 bytes per character, we should see roughly 600KB-1MB of data usage for this service. That won't be too bad.
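For anyone checking the estimate, the lower bound works out as below. The numbers are the same guesses as above, and the result counts only the raw column names, not Redis keys or per-entry overhead.

```python
DATASETS = 1000
COLUMNS_PER_DATASET = 15
CHARS_PER_COLUMN_NAME = 20
BYTES_PER_CHAR = 2

# Raw bytes for the column names alone.
payload_bytes = DATASETS * COLUMNS_PER_DATASET * CHARS_PER_COLUMN_NAME * BYTES_PER_CHAR
print(payload_bytes)  # 600000 bytes, i.e. about 600 KB
```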

Here are the docs on the Socrata API endpoints. You can pass in parameters to filter the data, using either simple filters or SoQL.
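To illustrate the SoQL side, here is a minimal sketch that builds a query URL using the `$select`, `$where`, and `$limit` parameters. The helper name and the dataset identifier `abcd-1234` are placeholders, and the column names are invented.

```python
from urllib.parse import urlencode


def build_soda_url(domain, dataset_id, select=None, where=None, limit=None):
    """Build a Socrata resource URL with optional SoQL parameters."""
    params = {}
    if select:
        params["$select"] = ",".join(select)
    if where:
        params["$where"] = where
    if limit is not None:
        params["$limit"] = limit

    url = f"https://{domain}/resource/{dataset_id}.json"
    if params:
        # Keep '$' and ',' literal so the query string stays readable.
        url += "?" + urlencode(params, safe="$,")
    return url


# 'abcd-1234' is a placeholder identifier, not a real dataset.
print(build_soda_url("data.edmonton.ca", "abcd-1234",
                     select=["station", "reading"], where="reading > 5", limit=10))
```

A simple equality filter would instead go straight into the query string as `column=value`, without the `$` prefix.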