This repository is an end-to-end pipeline for the creation, intercomparison and evaluation of machine learning methods in climate science.
The pipeline carries out a number of tasks to create a unified data format for training and testing machine learning methods.
These tasks are split into the different classes defined in the src folder and explained further below:
NOTE: Some basic working knowledge of Python is required to use this pipeline, although the bar is not high.
The major entry points to the repository are the notebooks directory and the scripts.
A blog post describing the goals and design of the pipeline can be found here.
Anaconda running Python 3.7 is used as the package manager. To set up an environment, install Anaconda from the link above and, from this directory, run
# use environment.mac.yml on macOS or environment.ubuntu.cpu.yml on Ubuntu (CPU)
conda env create -f environment.mac.yml
This will create an environment named esowc-drought with all the necessary packages to run the code. To activate this environment, run
conda activate esowc-drought
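To confirm that the activated environment is the one the pipeline expects, a quick check from Python can help. The sketch below is not part of the repository: the helper name check_environment is hypothetical, and it assumes that conda exports the CONDA_DEFAULT_ENV variable when an environment is activated.

```python
import os
import sys
from typing import Dict


def check_environment(expected_env: str = "esowc-drought") -> Dict[str, object]:
    """Report whether the active conda environment and Python version look right."""
    # CONDA_DEFAULT_ENV is assumed to be set by `conda activate`.
    active_env = os.environ.get("CONDA_DEFAULT_ENV")
    return {
        "active_env": active_env,
        "env_matches": active_env == expected_env,
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "is_python_37": sys.version_info[:2] == (3, 7),
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If env_matches is False, the environment was probably not activated in the current shell.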
Docker can also be used to run this code. To do this, first start the Docker app (Docker Desktop) or configure docker-machine:
# on macOS
brew install docker-machine docker
docker-machine create --driver virtualbox default
# `docker-machine env` only prints the settings; eval applies them to the current shell
eval "$(docker-machine env default)"
See here for help on all machines or here for macOS.
Then build the docker image:
docker build -t ml_drought .
Then, use it to run a container, mounting the data folder to the container:
docker run -it \
--mount type=bind,source=<PATH_TO_DATA>,target=/ml_drought/data \
ml_drought /bin/bash
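Docker bind mounts need an absolute source path, so it can be convenient to assemble the command from a script rather than editing <PATH_TO_DATA> by hand. The following is a minimal sketch, not something shipped with the repository: build_docker_run_command is a hypothetical helper, and it only reproduces the docker run invocation shown above.

```python
import shlex
from pathlib import Path
from typing import List


def build_docker_run_command(data_dir: str, image: str = "ml_drought") -> List[str]:
    """Assemble the `docker run` call shown above for a given data folder."""
    # Bind mounts require an absolute path, so resolve relative input first.
    source = Path(data_dir).expanduser().resolve()
    return [
        "docker", "run", "-it",
        "--mount", f"type=bind,source={source},target=/ml_drought/data",
        image, "/bin/bash",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    command = build_docker_run_command("./data")
    print(" ".join(shlex.quote(part) for part in command))
```

The returned list can be passed to subprocess.run once the printed command looks right.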
This pipeline can be tested by running pytest.
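For readers new to pytest: by default it collects functions whose names start with test_ from files named test_*.py or *_test.py, and reports any failing assert. The sketch below shows the general shape of such a test. Both rescale and its tests are hypothetical illustrations, not code from this repository.

```python
from typing import List


def rescale(values: List[float]) -> List[float]:
    """Linearly rescale values to the range [0, 1] (hypothetical example function)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # avoid dividing by zero on constant input
    return [(v - low) / (high - low) for v in values]


def test_rescale_bounds() -> None:
    # pytest would discover this function through its `test_` prefix.
    assert rescale([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_rescale_constant_input() -> None:
    assert rescale([3.0, 3.0]) == [0.0, 0.0]
```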
We use mypy for type checking. To check the src directory, run mypy src.
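As an illustration of the kind of error mypy catches, the sketch below annotates two small functions. The names mean_or_none and mean_anomaly are hypothetical examples rather than code from src. Because the first function may return None, mypy statically requires the caller to handle that case before doing arithmetic on the result.

```python
from typing import Optional, Sequence


def mean_or_none(values: Sequence[float]) -> Optional[float]:
    """Return the mean of values, or None for an empty sequence."""
    if not values:
        return None
    return sum(values) / len(values)


def mean_anomaly(values: Sequence[float], baseline: float) -> float:
    """Return the mean of values minus a baseline."""
    mean = mean_or_none(values)
    if mean is None:
        # Without this check, mypy reports an unsupported operand type
        # for `mean - baseline`, since `mean` could be None.
        raise ValueError("cannot compute an anomaly for an empty sequence")
    return mean - baseline
```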
Team: @tommylees112, @gabrieltseng
For updates, follow @tommylees112 on Twitter or look out for our blog posts!
This was a project completed as part of the ECMWF Summer of Weather Code Challenge #12. The challenge was set up to use ECMWF/Copernicus open datasets to evaluate machine learning techniques for the prediction of droughts.
Huge thanks to @ECMWF for making this project possible!