A tool for managing a CKAN data catalog

Primary language: PHP. License: GNU General Public License v3.0 (GPL-3.0).

ckan-php-manager

Join the chat at https://gitter.im/GSA/ckan-php-manager

A collection of scripts that perform tasks on a CKAN instance using the CKAN API and https://github.com/GSA/ckan-php-client

Requirements

Installation

Clone repository

$ git clone https://github.com/GSA/ckan-php-manager.git

Composer

Use Composer to install and update dependencies.

If you don't have Composer, install it:

$ curl -sS https://getcomposer.org/installer | php
$ mv composer.phar /usr/local/bin/composer

Install dependencies:

$ composer install

Configuration

Copy inc/config.sample.php to inc/config.php, then update it with your own values if needed.

$ cp inc/config.sample.php inc/config.php

Usage

Export all packages by Agency name, including all Sub Agencies

  • Edit cli/export_packages_by_org.php and set ORGANIZATION_TO_EXPORT to the title of the organization to export
  • Run the exporter using php
    $ php cli/export_packages_by_org.php

The script takes all terms, including sub-agencies, from http://www.data.gov/app/themes/roots-nextdatagov/assets/Json/fed_agency.json and queries CKAN for packages belonging to each organization in that list.

When the script finishes, results are in the /results/{timestamp} directory, including a _{term}.log file with package counts for each agency.
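
For orientation, a per-organization lookup against the CKAN API can be expressed as a package_search request. The sketch below is in Python rather than PHP and only builds the request URL. It is not taken from cli/export_packages_by_org.php. The function name, the default page size, and the use of an fq=organization:"..." filter are assumptions chosen for illustration, and the real script may query CKAN differently.

```python
from urllib.parse import urlencode


def package_search_url(ckan_url, organization, rows=100, start=0):
    """Build a CKAN package_search URL restricted to one organization.

    `organization` is assumed to be the organization's CKAN name (slug).
    `rows` and `start` are the standard package_search paging parameters.
    """
    query = urlencode({
        "fq": f'organization:"{organization}"',  # assumed filter form
        "rows": rows,
        "start": start,
    })
    return f"{ckan_url.rstrip('/')}/api/3/action/package_search?{query}"


if __name__ == "__main__":
    # "gsa-gov" is a placeholder organization name
    print(package_search_url("https://catalog.data.gov", "gsa-gov"))
```

Increasing start by rows on each request pages through a large organization; the JSON response reports the total under result.count.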

DMS legacy tag

To add the tag add_legacy_dms_and_make_private to all datasets of a group:

  • Update ORGANIZATION_TO_TAG in cli/add_legacy_dms_and_make_private.php
  • Double-check that CKAN_URL and CKAN_API_KEY are correct for editing datasets
  • Run the script
    $ php cli/add_legacy_dms_and_make_private.php

Assign groups and category tags to datasets

  • Put CSV files in the /data directory, named assign_<any-title>.csv (the assign_ prefix is required). Each file must have three columns: dataset, group, categories

    The first line is the header row; keep it in every file: dataset,group,categories

    Then put one dataset per line.

    1. Dataset: can be any of the following:
       • Dataset URL, e.g. https://catalog.data.gov/dataset/food-access-research-atlas
       • Dataset name, e.g. download-crossing-inventory-data-highway-rail-crossing
       • Dataset id

    2. Group: exactly one group per row. To add multiple groups, create another row with the same dataset and a different group, because categories are attached to the group on the same row. Make sure the group exists in your CKAN instance (to list all existing groups, go to http://catalog.data.gov/api/3/action/group_list?all_fields=true, replacing catalog.data.gov with your CKAN domain).

    3. Categories: one or more categories for the group on the current row, separated by semicolons (;). Wrap the field in double quotes if any category contains a comma.

    Example csv file:

    dataset, group, categories
    https://catalog.data.gov/dataset/food-access-research-atlas,Agriculture,"Natural Resources and Environment"
    aerial-image-of-alaskas-arctic-coastal-plain-1955,Climate,"Arctic; Arctic Ocean, Sea Ice and Coasts; Permafrost and Arctic Landscapes"
    28d30c1f-75a5-4042-b0fc-de26cc7d70f2,Climate,Arctic; Arctic Development and Transport
    
  • Double check CKAN_URL and CKAN_API_KEY for editing datasets, defined in inc/config.php

  • Run script

    $ php cli/tagging/assign_groups_and_tags.php
  • Detailed logs and results are stored in folder results/[time-stamp]_ASSIGN_GROUPS
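
To check a file before running the script, the CSV format described above can be parsed with a few lines of code. This is a minimal Python sketch, not the parser used by cli/tagging/assign_groups_and_tags.php. The helper names dataset_ref and read_assignments are hypothetical, and reducing a dataset URL to its last path segment is an assumption about how URLs map to dataset names.

```python
import csv
import io
from urllib.parse import urlparse


def dataset_ref(value):
    """Reduce a dataset URL to its last path segment; names and ids pass through."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return urlparse(value).path.rstrip("/").split("/")[-1]
    return value


def read_assignments(text):
    """Yield (dataset, group, [categories]) for each data row, skipping the header."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader, None)  # the first line is the header row
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # ignore blank or incomplete rows
        categories = row[2] if len(row) > 2 else ""
        yield (
            dataset_ref(row[0]),
            row[1].strip(),
            [c.strip() for c in categories.split(";") if c.strip()],
        )


if __name__ == "__main__":
    sample = (
        "dataset,group,categories\n"
        "28d30c1f-75a5-4042-b0fc-de26cc7d70f2,Climate,"
        "Arctic; Arctic Development and Transport\n"
    )
    for dataset, group, categories in read_assignments(sample):
        print(dataset, group, categories)
```

Splitting on semicolons only after the CSV reader has handled quoting keeps a category such as "Arctic Ocean, Sea Ice and Coasts" intact.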

Remove groups and category tags from datasets (revert previous script changes)

  • Prepare CSV files in the same format as for the previous script and put them in the /data directory, named remove_<any-title>.csv
  • Run script
    $ php cli/tagging/remove_groups_and_tags.php
  • This command will remove listed categories from the dataset of the row. If an empty list of categories is provided, this command will remove the group and all categories from the dataset.
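
The removal rule can be illustrated with a small Python sketch. It models a dataset's assignments as a plain mapping from group to a set of categories, which is a hypothetical simplification and not how CKAN or cli/tagging/remove_groups_and_tags.php stores them. The function name apply_removal is also an assumption.

```python
def apply_removal(assigned, group, categories):
    """Return a copy of `assigned` (group -> set of categories) after one CSV row.

    With categories listed, only those categories are removed from the group.
    With an empty list, the group and all of its categories are removed.
    """
    result = {g: set(c) for g, c in assigned.items()}
    if group not in result:
        return result  # nothing to remove for this dataset
    if categories:
        result[group] -= set(categories)
    else:
        del result[group]
    return result


if __name__ == "__main__":
    before = {"Climate": {"Arctic", "Arctic Development and Transport"}}
    print(apply_removal(before, "Climate", ["Arctic"]))
    print(apply_removal(before, "Climate", []))
```

In this sketch, a row that removes the last listed category still leaves the group itself assigned; only an empty categories field drops the group.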

CKAN API DOCs

http://docs.ckan.org/en/latest/api/index.html

Docker setup

To minimize requirements on a system, we've added a minimal setup with docker-compose. This should replace the above usage instructions as the default workflow.

$ docker-compose build
$ docker-compose run --rm app php cli/harvest_stats_csv.php

Run the tests.

$ docker-compose run --rm app phpunit