Note: Requires requests.
An example script that sets up and periodically updates a CKAN DataStore table with earthquake data from the NGDS.
This example demonstrates how to push data directly to DataStore tables rather than relying on the DataPusher to import tabular files automatically. It can easily be adapted to different data sources.
See it in action at http://demo.ckan.org/dataset/ngds-earthquakes-data
-
Create a virtualenv and install requests:
virtualenv pyenv
cd pyenv && source bin/activate
pip install requests
-
Clone this repository:
mkdir src && cd src
git clone https://github.com/ckan/example-earthquake-datastore.git
-
Define your CKAN URL and API key in the config.ini file.
-
Run the setup command, and write the resulting resource id in your config.ini file:
python datastore_update.py setup
-
Run the update command:
python datastore_update.py update
You probably want to set up this command to run hourly, e.g. with a cron job:
crontab -e
Add a line like this:
0 * * * * /path/to/your/pyenv/bin/python /path/to/your/pyenv/src/example-earthquake-datastore/datastore_updater.py update
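As a rough illustration of how a script could pick up the values stored in config.ini, here is a minimal Python 3 sketch. The section name `main` and the keys `ckan_url`, `api_key` and `resource_id` are assumptions made for this example; check the config.ini shipped with the repository for the real names.

```python
import configparser


def load_settings(path="config.ini", section="main"):
    """Return the settings of one INI section as a plain dict.

    The section and key names are hypothetical, not taken from the repository.
    """
    parser = configparser.ConfigParser()
    # read() returns the list of files it managed to parse.
    if not parser.read(path):
        raise FileNotFoundError("Could not read {0}".format(path))
    settings = dict(parser[section])
    # The resource id only exists after `setup` has run, so treat it as optional.
    settings.setdefault("resource_id", "")
    return settings
```

Keeping the resource id optional reflects the workflow above: the setup command produces it, and only the update command needs it.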
When running the setup command we are doing the following things:
-
Creating a new dataset in the remote CKAN instance using the package_create API action.
-
Getting a dump of the remote earthquake data for the past day and extracting the records we want to push to the DataStore.
-
Preparing a mapping of the table fields with the correct field types to ensure they are handled correctly by the DataStore.
-
Pushing the prepared records and the field mapping to a new DataStore resource on the previously created dataset, using the datastore_create API action. Note how we use the id of the previously created dataset. The new resource will be of type datastore and will offer a CSV dump of the data stored in the DataStore.
Once we have this initial setup we can use the update command to periodically request updated earthquake data and push it to our DataStore table using the datastore_upsert API action.
As we defined a primary key when creating the DataStore table we can use the upsert method, which will update existing records and insert any new ones.
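To make the upsert behaviour concrete, the following Python 3 sketch builds a datastore_upsert request body and models, purely locally, what upserting does to a table keyed on its primary key. Both helpers are hypothetical; `simulate_upsert` is only an in-memory illustration of the semantics and is not how CKAN implements them.

```python
def build_upsert_payload(resource_id, records):
    """Build the request body for the datastore_upsert action."""
    return {
        "resource_id": resource_id,
        "records": records,
        "method": "upsert",  # update rows whose key exists, insert the rest
    }


def simulate_upsert(table, records, key="id"):
    """Local model of upsert: merge `records` into `table` by primary key."""
    merged = {row[key]: dict(row) for row in table}
    for record in records:
        # Existing keys are updated in place, unknown keys create a new row.
        merged.setdefault(record[key], {}).update(record)
    return list(merged.values())
```

For example, upserting a revised magnitude for a known earthquake plus one new earthquake leaves the table with two rows: the first one updated, the second one inserted. The payload itself would be POSTed to `/api/3/action/datastore_upsert` with the API key in the `Authorization` header.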
When accessed via the CKAN frontend, the data can be explored in the grid and map previews powered by Recline, and of course it can be accessed programmatically from other applications using the datastore_search API action.
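As an illustration of programmatic access, here is a minimal Python 3 sketch that queries the table through datastore_search. The helper names are made up for this example, and the resource id must be replaced with the one returned by the setup command; a public resource is assumed, so no API key is sent.

```python
from urllib.parse import urlencode


def build_search_url(ckan_url, resource_id, limit=5, query=None):
    """Build a datastore_search URL for a resource."""
    params = [("resource_id", resource_id), ("limit", limit)]
    if query:
        params.append(("q", query))  # optional full-text search
    return "{0}/api/3/action/datastore_search?{1}".format(
        ckan_url.rstrip("/"), urlencode(params))


def search(ckan_url, resource_id, **kwargs):
    """Fetch matching records from the DataStore."""
    import requests  # imported lazily so the URL helper works without it

    response = requests.get(build_search_url(ckan_url, resource_id, **kwargs))
    response.raise_for_status()
    return response.json()["result"]["records"]
```

Calling `search("http://demo.ckan.org", "<your-resource-id>", limit=10)` would return up to ten records as a list of dicts.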