This repo contains all the various Jupyter notebooks and Google EarthEngine scripts our group used to create a predictive model to map mangrove populations.
Initially, our plan was to build a worldwide model and produce full-earth maps, but processing-power and time/money constraints forced us to focus on a single region: the Caribbean.
Even so, the algorithms (and models) in this repo should generalize if you want to map mangroves in a region other than the Caribbean.
For more background on this project and a thorough discussion of our results, take a look at the paper (link TK).
The scripts listed below are grouped into a few broad categories. Some are one-offs, while others were executed in order (and were only separated to keep files to a manageable size).
Additionally, some of the constants were tweaked for multiple runs, e.g. to test out different hyperparameters.
Jupyter notebooks were run in Google Colab, while JavaScript files were run in the EarthEngine Code Editor.
This initial step was mostly done from the command line, using shapefile maps of mangroves from the Global Mangrove Watch.
```sh
## Move downloaded files into Google Cloud Storage
gsutil mv ~/Downloads/gmw_*_v2.shp gs://mangrove-models/

## Import Google Cloud Storage assets into EarthEngine
earthengine upload table --asset_id=users/pandringa/gmw_2007 gs://mangrove-models/gmw_2009/GMW_2007_v2.shp
earthengine upload table --asset_id=users/pandringa/gmw_2008 gs://mangrove-models/gmw_2009/GMW_2008_v2.shp
...
```
See the "assorted notebooks" section for some attempts at uploading OCO2 data into EarthEngine, which we ended up aborting when rasterization didn't work.
**BuildSampleRegions.js**

Builds a list of training and testing regions from the Global Mangrove Watch maps, adding a 1 km buffer and then randomly assigning 70% of the polygons to training and 30% to evaluation.
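The split itself is simple. A minimal sketch of the same idea in the EarthEngine Python API (the actual script is JavaScript, and the asset path here is just a placeholder) looks roughly like this:

```python
import ee
ee.Initialize()

# Placeholder asset path for the uploaded Global Mangrove Watch polygons.
gmw = ee.FeatureCollection('users/pandringa/gmw_2007')

# Buffer each mangrove polygon by 1 km so samples also include nearby non-mangrove pixels.
buffered = gmw.map(lambda f: f.buffer(1000))

# Attach a stable random number to each polygon and split ~70/30.
with_random = buffered.randomColumn('random', 42)
training = with_random.filter(ee.Filter.lt('random', 0.7))
testing = with_random.filter(ee.Filter.gte('random', 0.7))
```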
**TakeSamples.ipynb**

A version of an EarthEngine tutorial notebook that downloads Landsat 7 imagery, creates cloud-free composites, then takes samples of the composite from the pre-defined mangrove regions and uploads them to Google Cloud Storage as `.tfrecord` files.
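For reference, the core of that workflow looks roughly like the EarthEngine Python sketch below. The label asset, export region, patch size, and bucket layout are assumptions for illustration, not the exact values used in the notebook:

```python
import ee
ee.Initialize()

# Cloud-free Landsat 7 composite for the target year.
l7 = ee.ImageCollection('LANDSAT/LE07/C01/T1').filterDate('2007-01-01', '2007-12-31')
composite = ee.Algorithms.Landsat.simpleComposite(collection=l7, asFloat=True)

# A binary mangrove label band (placeholder asset) stacked onto the composite.
labels = ee.Image('users/pandringa/mangrove_labels_2007').rename('label')
stack = composite.addBands(labels)

# Export fixed-size patches as TFRecords to the bucket used in the upload step.
task = ee.batch.Export.image.toCloudStorage(
    image=stack,
    description='mangrove_training_patches',
    bucket='mangrove-models',
    fileNamePrefix='samples/train',
    region=ee.Geometry.Rectangle([-80.0, 20.0, -79.0, 21.0]),  # example Caribbean tile
    scale=30,
    fileFormat='TFRecord',
    formatOptions={'patchDimensions': [256, 256], 'compressed': True},
)
task.start()
```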
**CheckSamples.js**

A simple script to display statistics about the train/test samples and points taken for each area, to make sure the 70/30 split is accurate and that the split of mangroves/non-mangroves is approximately 50/50.
**ColabModel.ipynb**

Builds a smaller, 10-epoch UNET model that can be trained in Google Colab, for crude hyperparameter tuning and faster iteration. It then uses this model (or an alternate, pre-trained model stored in Google Cloud) to make predictions over a sample region.
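As a rough illustration of the architecture (the exact patch size, band count, and layer widths in the notebook may differ), a minimal Keras UNET for per-pixel mangrove classification might look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    return layers.Conv2D(filters, 3, padding='same', activation='relu')(x)

def build_unet(patch_size=256, bands=6):
    inputs = tf.keras.Input(shape=(patch_size, patch_size, bands))
    # Encoder
    c1 = conv_block(inputs, 32)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 64)
    p2 = layers.MaxPooling2D()(c2)
    # Bottleneck
    b = conv_block(p2, 128)
    # Decoder with skip connections
    u2 = layers.concatenate([layers.UpSampling2D()(b), c2])
    d2 = conv_block(u2, 64)
    u1 = layers.concatenate([layers.UpSampling2D()(d2), c1])
    d1 = conv_block(u1, 32)
    # Per-pixel mangrove probability
    outputs = layers.Conv2D(1, 1, activation='sigmoid')(d1)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

# model = build_unet()
# model.fit(train_dataset, epochs=10, validation_data=eval_dataset)
```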
**AIPlatformModel.ipynb**

Builds a full-scale UNET model as a Python package and uploads it to Google AI Platform for training. It also includes a section at the bottom about using the trained model in EarthEngine, although we did that using an EarthEngine JS script (see below).
**PredictOverRegions.js**

Generates predictions from an EarthEngine-hosted model, restricting the predictions to regions within 1 km of previous mangrove maps (to reduce the algorithm's runtime).
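One way to express that restriction in the EarthEngine Python API (the actual script is JavaScript; the asset path, project, model name, tile size, and output band below are placeholders) is sketched here:

```python
import ee
ee.Initialize()

# Prior mangrove extent, buffered by 1 km to define the search area.
gmw = ee.FeatureCollection('users/pandringa/gmw_2007')
search_area = gmw.map(lambda f: f.buffer(1000))

# Same cloud-free Landsat 7 composite used for sampling.
l7 = ee.ImageCollection('LANDSAT/LE07/C01/T1').filterDate('2007-01-01', '2007-12-31')
composite = ee.Algorithms.Landsat.simpleComposite(collection=l7, asFloat=True)

# Connect to the trained model hosted on AI Platform (names are placeholders).
model = ee.Model.fromAiPlatformPredictor(
    projectName='my-project',
    modelName='mangrove_unet',
    version='v1',
    inputTileSize=[144, 144],
    proj=ee.Projection('EPSG:4326').atScale(30),
    fixInputProj=True,
    outputBands={'mangrove': {'type': ee.PixelType.float()}},
)

# Predict, then keep only pixels inside the buffered regions to limit runtime.
predictions = model.predictImage(composite.toArray())
masked = predictions.clipToCollection(search_area)
```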
**AnalyzeSamplePrediction.js**

Analyzes a sample prediction (like the one generated by ColabModel.ipynb), comparing it against a known source of truth to find false positive and false negative rates.
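The error-rate bookkeeping itself is simple; here is the same calculation as a small NumPy function, assuming the prediction and ground truth have been exported as arrays (the actual script does this with EarthEngine images):

```python
import numpy as np

def error_rates(pred, truth, threshold=0.5):
    """False positive/negative rates for a predicted probability map
    against a boolean ground-truth mask (both NumPy arrays of the same shape)."""
    pred_mask = pred >= threshold
    truth = truth.astype(bool)
    fp = np.logical_and(pred_mask, ~truth).sum()   # predicted mangrove, actually not
    fn = np.logical_and(~pred_mask, truth).sum()   # missed mangrove
    false_positive_rate = fp / max((~truth).sum(), 1)
    false_negative_rate = fn / max(truth.sum(), 1)
    return false_positive_rate, false_negative_rate
```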
**LandsatTest.ipynb**

Experiments with loading Landsat data into xarray, before we switched to EarthEngine.
**EarthEngineTest.ipynb**

Experiments with rendering EarthEngine maps in Python notebooks, using a different mangrove dataset we eventually abandoned.
**NC4_CSV.ipynb**

A notebook for converting NetCDF files into CSV, for uploading into EarthEngine tables.
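The conversion is a few lines with xarray and pandas; the filename and variable names below are assumptions based on the OCO-2 Lite products, not necessarily the exact ones the notebook used:

```python
import xarray as xr

# Hypothetical OCO-2 granule filename; the real files came from NASA's OCO-2 Lite products.
ds = xr.open_dataset('oco2_LtCO2_example.nc4')

# Keep only the columns EarthEngine needs for a point table: lat, lon, XCO2.
df = ds[['latitude', 'longitude', 'xco2']].to_dataframe().reset_index()
df.to_csv('oco2_example.csv', index=False)
```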
**OCO2.ipynb**

A notebook that attempted to convert the OCO2 files from NetCDF into shapefiles, which was abandoned when CSV exports turned out to be faster and smaller.
**NDVI_Model.ipynb**

An earlier attempt at a model based on NDVI, which was abandoned in favor of the UNET model.
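For context, NDVI is just a normalized band ratio, (NIR - Red) / (NIR + Red); computing it from the same Landsat 7 composites looks roughly like this:

```python
import ee
ee.Initialize()

# Cloud-free Landsat 7 composite; bands B4 (NIR) and B3 (Red) feed the NDVI.
l7 = ee.ImageCollection('LANDSAT/LE07/C01/T1').filterDate('2007-01-01', '2007-12-31')
composite = ee.Algorithms.Landsat.simpleComposite(l7)
ndvi = composite.normalizedDifference(['B4', 'B3']).rename('NDVI')
```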
**RasterizeOco2.js**

A script that takes the point-based table of OCO2 readings and attempts to interpolate the values into a raster using a Kriging interpolator. We were unable to get this to run over a large region in a reasonable amount of time (it timed out after two weeks), so we eventually abandoned the effort. Theoretically, it could work on a smaller area or with more computing power.
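To give a sense of the idea (and why it scales badly), here is a toy stand-in using scikit-learn's Gaussian process regressor, which is closely related to kriging, on fake points; this is not the EarthEngine interpolator the script used. Fitting requires solving a dense linear system that is roughly cubic in the number of points, which is part of why region-scale runs timed out.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Fake scattered XCO2 readings: (lon, lat) pairs and values in ppm.
points = np.random.uniform([-80, 20], [-79, 21], size=(200, 2))
values = 400 + np.random.normal(0, 1, size=200)

# Fit a GP (kriging-like) model to the scattered points.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1), normalize_y=True)
gp.fit(points, values)

# Predict on a small grid to produce an interpolated "raster".
lon, lat = np.meshgrid(np.linspace(-80, -79, 50), np.linspace(20, 21, 50))
grid = np.column_stack([lon.ravel(), lat.ravel()])
raster = gp.predict(grid).reshape(lon.shape)  # 50x50 interpolated surface
```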