/paper-osm-2015

Primary LanguageJupyter NotebookApache License 2.0Apache-2.0

This repository provides material used for the paper: <add a link here>

Overview
========

The repository contains scripts and Python Notebooks which can be used to perform a comparison of 

  1) Raster water masks derived from Landsat satellite imagery
  2) Vector datasets representing hydrography (OpenStreetMap or Surface Hydrology for Australia)
  3) Drainage networks derived from 30m SRTM

Some of the scripts make use of Google Earth Engine (GEE).

Some of the scripts require https://github.com/gena/ee-runner to run GEE JavaScript API in a command-line.

Dirctories
==========

  bin/ .......................... programs used during conversion
    gdalcopyprj.cmd ............. copies gographic projection from one raster file to another
    gdalcopyprj.py
    osmconvert.exe .............. converts OpenStreetMap pbf binary files to o5m and osm format
    osmfilter.exe ............... filter OpenStreetMap files

  config/ ....................... config files use by some of the scripts
    osmconf.ini ................. used by osmconvert and osmfilter
    water.osmfilter.params ...... combines tags used to extract hydrography from OpenStreetMap files (polylines and polygons)

  data/ ......................... input files
    OpenStreetMap/
    SurfaceHydrology/ ........... major rivers extracted from http://www.ga.gov.au
    grid.kml .................... regular grid used to aggregate results and to parallelize the analysis
    HydroSHEDS_Catchment_MD.* ... HydroSHEDS cachment boundaries for Murray-Darling River Basin
    HydroSHEDS_rivers_MD.* ...... HydroSHEDS drainage network

  graphics/ ..................... images comparing generated water mask and OSM with WOfS and Surface Hydrology
    <images>

  notebooks/ .................... Python Notebooks used to perform different steps of the analysis
    ConvertWaterHistogramsAndThresholds.ipynb
    ExplodeShpPolylines.ipynb
    ExplodeShpPolylinesSH.ipynb
    HAND.ipynb
    HydroBASINS (fix).ipynb
    NDWI_thresholds.ipynb
    OSM water validation using LANDSAT (Murray & Darling).ipynb
    SHP2KML with fix.ipynb
    TileCatchmentsToGrid.ipynb
    UploadToFusionTable.ipynb    

  results/ ...................... some of the results generated during analysis
    water_and_results_examples/
    water.zip
    authoritative/ .............. screenshots comparing L8 watermask to some of the authoritative datastes

  src/
    javascript/
       download_SRTM.js ........... GEE script used to download SRTM data clipped by catchment boundaries
       download_water.js .......... GEE
       training.js ................ training dataset, generated within GEE

    javascript-ee-playground/ ......... GEE Playground scripts
       OSM_automatic_NDWI.js ..........
       OSM_compute_buffers.js ......... computes distances between rivers using Goodchilds's method
       OSM_compute_buffers_export.js .. 
       OSM_stat_per_catchment.js ...... generates aggregated statistics

    python/
       generate-hydro-datasets.py .......... script used to deleniate drainage network and compute HAND
       generate-hydro-datasets-runner.py ... wraps the above script to avoid out-of-memory probles

    scripts/
       convert.cmd .................... script used to extract hydrography from the OSM files

  web/ .......................... source code of the Google App Engine website (http://osm-water.appspot.com)
    httplib2
    media
    oauth2client
    static
    app.yaml
    config_web.py
    index.html
    README.md
    server.py
    six.py

  AuthoritativeDatasets.docx ...... early comparisons to the authoritative datasets in Australia

Tutorial
========

The overall analysis is performed in a number of steps, starting with the download of all required datasets, such as:

* HydroBASINS
* OpenStreetMap extracts, relevant for the study area

Additional step include preparation of the regular grid file (grid.kml) used to parallelize the analysis.

The actual steps required for the analysis are quite time consuming and include:

1. Extraction of the hydrography from the the OSM files (scripts/convert.cmd)
      Produces two shapefiles (polylines and polygons) representing hydrography

2. Split linear hydrography features into small polylines in order to perform the analysis (notebooks/ExplodeShpPolylines.ipynb)

3. Upload results of 2. to Google Fusion Table (UploadToFusionTable.ipynb)

4. Select required catchment boundaries from HydroBASINS. Catchment size should be smaller than current Google Earth Engine download limit!

5. Upload catchments to the Google Fusion Table

6. Prepare regular grid file (grid.kml) and upload to Google Fusion Table, the resulting asset id needs to be updated in the OSM_automatic_NDWI.js script

7. Download SRTM extracts clipped by catchment boundaries (src/javascript/download_SRTM.js)

8. Deleniate drainage network for every sub-catchment DEM from 6. and compute HAND dataset (and all other related hydrological dataset) using python/generate-hydro-datasets-runner.py

9. Convert results of 7. to regular grid tiles and upload to Google Earth Engine as raster assets.

10. Adjust src/javascript/download_water.js do use the new HAND, grid (a copy of src/javascript-ee-playground/OSM_automatic_NDWI.js).

11. Prepare training set for GEE Playground scripts (10.) by performing script runs for a number of tiles, repeat until the results are good for hilly areas,
    For flat areas and sufficient cloud-free images the script should produce good results without any changes

12. Detect surface water mask for every grid tile using src/javascript/download_water.js. src/javascript/templates provides example on how to parallelize water detection.

13. Upload (re-tile if necessart) the resulting water mask to Google Earth Engine

14. Compute positional and thematic accuracy using src/javascript-ee-playground/OSM_compute_buffers_export.js and src/javascript-ee-playground/OSM_stat_per_catchment.js