Scripts and templates for the Guinea IPC (Infection Prevention and Control) cluster HDX/HXL pilot.

GUINEA IPC HEALTH TRAINING DATA MANAGEMENT
last updated 2015-09-10 by David Megginson


All instructions in this document refer to the shared "Guinea IPC
data" Dropbox folder.

Prerequisite: you must have the HXL scripts from
https://github.com/HXLStandard/libhxl-python installed and running on
your system.
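
One quick way to confirm the prerequisite is to check that the HXL
command-line scripts are on your PATH. The sketch below is not part of
the shared folder: the function name check_commands is hypothetical,
and hxlvalidate and hxlmerge are used only as examples of scripts that
libhxl-python installs.

```shell
# check_commands NAME...
# Print each command that is not on the PATH (to stderr) and return
# non-zero if any are missing. The name check_commands is hypothetical.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (command names assumed to be the ones the scripts need):
#   check_commands hxlvalidate hxlmerge || echo "install libhxl-python first"
```

If the check fails, install or repair libhxl-python before running any
of the scripts described below.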


Workflow
--------

1. Guinea IPC cluster members upload new copies of their reporting
   spreadsheets into their dedicated subfolders under Uploads.

2. Validate the incoming files (Scripts/show-changes.sh lists the new
   or updated ones) and fix any issues before merging.

3. From the root of the shared folder, run the Linux command

     sh Scripts/do-merge.sh

   to create a merged dataset in the file Staged/ipc-merged.csv.

4. Examine Staged/ipc-merged.csv for basic sanity: is it at least
   several hundred rows long, and does it have HXL tags at the top?

5. From the root of the shared folder, run the Linux command

     sh Scripts/do-coverage.sh

   to create prefecture and subprefecture coverage datasets under
   Staged/.

6. If all is well, from the root of the shared folder, run the Linux
   command

     sh Scripts/do-publish.sh

   to create copies of the merged datasets in the Public/ folder. The
   new data will be available immediately on HDX.
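
The sanity check in step 4 can be partly automated. The sketch below is
only an illustration, not one of the scripts in Scripts/: the function
name check_merged, the default threshold of 200 lines, and the 25-line
search window for the HXL tag row are all assumptions. The tag test is a
rough heuristic that looks for a CSV cell beginning with "#".

```shell
# check_merged FILE [MIN_LINES]
# Rough sanity check for a merged HXL CSV file (hypothetical helper).
# Fails if the file is missing, has fewer than MIN_LINES lines, or has
# no line near the top that looks like a row of HXL hashtags.
check_merged() {
    file=$1
    min_lines=${2:-200}   # assumed stand-in for "several hundred rows"

    if [ ! -f "$file" ]; then
        echo "$file: not found" >&2
        return 1
    fi

    # Count lines; strip the padding some wc implementations add.
    lines=$(wc -l < "$file" | tr -d '[:space:]')
    if [ "$lines" -lt "$min_lines" ]; then
        echo "$file: only $lines lines (expected at least $min_lines)" >&2
        return 1
    fi

    # Heuristic: some cell in the first 25 lines starts with '#'.
    if ! head -n 25 "$file" | grep -Eq '(^|,) *"?#[A-Za-z]'; then
        echo "$file: no HXL tag row found near the top" >&2
        return 1
    fi

    echo "$file: $lines lines, HXL tag row found"
}

# Example call from the root of the shared folder:
#   check_merged Staged/ipc-merged.csv
```

A passing check does not replace looking at the file yourself; it only
catches an empty, truncated, or untagged merge before it reaches
do-publish.sh.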


Files and folders
-----------------

Inputs/
  Master data and other supporting input datasets.

Inputs/replacements.csv
  Automatic replacement table for cleanup.

Scripts/do-merge.sh
  Shell script for merging the data into a single HXL CSV file.

Scripts/do-coverage.sh
  Shell script for generating prefecture and subprefecture
  coverage datasets.

Scripts/do-publish.sh
  Shell script that copies the merged data into the Public
  folder.

Scripts/show-changes.sh
  Shell script to list all new or updated files in Uploads/
  (anything newer than the equivalent version under Tagged/).

Staged/
  Most-recent output of the merged data, before being published to the
  Public/ folder. After running do-merge.sh, look at this first to
  make sure it's OK before running do-publish.sh.

Public/
  Output files that are publicly visible via HDX, generated by the
  do-publish.sh script.  Do not rename these files, or else the HDX
  links will stop working.

Uploads/
  Root folder for uploading new reports from participating NGOs. There
  is a folder inside for each of the reporting organisations, and only
  those subfolders should be shared with them (so that they can't
  accidentally overwrite each other's data).

Working/
  Folder for temporary working copies of HXL files. Everything in this
  folder can be automatically deleted by the scripts, so don't put
  anything important here.
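
The comparison described for Scripts/show-changes.sh can be sketched as
follows. This is not the actual script: the function name list_changes
is hypothetical, and the sketch assumes that Tagged/ mirrors Uploads/
with the same relative paths and file names.

```shell
# list_changes [UPLOADS_DIR] [TAGGED_DIR]
# Print every file under UPLOADS_DIR that has no counterpart under
# TAGGED_DIR, or whose counterpart is older. Hypothetical helper that
# assumes both trees use identical relative paths.
list_changes() {
    uploads=${1:-Uploads}
    tagged=${2:-Tagged}

    find "$uploads" -type f | while IFS= read -r f; do
        rel=${f#"$uploads"/}      # path relative to the uploads root
        t="$tagged/$rel"
        # New file, or upload modified more recently than the tagged copy.
        if [ ! -e "$t" ] || [ -n "$(find "$f" -newer "$t")" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call from the root of the shared folder:
#   list_changes Uploads Tagged
```

The files it prints are the ones to validate before running
do-merge.sh.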