CLIMATE 405: Machine Learning for Earth and Environmental Sciences (University of Michigan, Fall 2024)

Lecture materials for University of Michigan CLIMATE405: Machine Learning for Earth and Environmental Sciences (Fall 2024)

Dr. Mohammed Ombadi (Email: ombadi@umich.edu)
For any questions regarding this material, please contact Dr. Ombadi. Do not distribute any of the material in this repository before obtaining permission.

Basic Description

This course introduces students (primarily those with a background in Earth and Environmental Sciences) to data-driven methods, ranging in complexity from autoregression to machine learning models. The course covers the basic theory behind machine learning and provides hands-on experience in building machine learning models. Students will learn to apply these models for both prediction and hypothesis formulation. The methods are taught through example applications in environmental sciences, with a specific focus on climate and hydrologic applications. Examples include short-term forecasts of temperature and precipitation, streamflow forecasting in selected hydrologic basins, understanding the relative contributions of temperature and precipitation to snowmelt trends, regional clustering of precipitation patterns and trends, and the role of climatic teleconnections in regulating regional precipitation patterns.

The Canvas coursespace can be accessed here
The course syllabus is available here

Contents of this repository

This repository contains the following content:

Lecture Slides

The slides for each lecture are provided in a separate folder, both as a Jupyter Notebook and as HTML. The HTML version can be downloaded or viewed directly in your web browser without installing any software or downloading additional files. To run the live Jupyter notebooks, download the folder to your local machine and execute the code (this requires Jupyter Notebook to be installed on your machine; see the Notes section for more information).

Lecture 0: Introduction to Class
Lecture 1: Stochastic and Deterministic systems; Random Variables; Review of Basic Statistics
Lecture 2: Probability Distributions; Model Evaluation Metrics
Lecture 3: Statistical Dependence, Hypothesis Testing and Statistical Significance
Lecture 4: Feature Selection, Dimensionality Reduction
Lecture 5: Supervised vs Unsupervised Learning, K-means Clustering
Lecture 6: Hierarchical and Density-based Clustering Algorithms; Examples of Clustering in EES
Lecture 7: Simple Linear Regression; Multiple Linear Regression; Goodness of Fit
Lecture 8: Choice of Loss Function; Regularization (Ridge and Lasso Regression)
Lecture 9: Gradient Descent; The Anatomy of a Neural Network
Lecture 10: Multilayer Perceptron: Applications in Hydrology; Activation Functions
Lecture 11: Hands-on Session
Lecture 12: Decision Trees
Lecture 13: Interpretability of Decision Trees, Feature Importance, Shapley Values
Lecture 14: Recurrent Neural Networks
Guest Lecture: LSTM Applications in Hydrology
Lecture 15: RNNs continued; Techniques used in training ML Models
Lecture 16: Long Short-term Memory (LSTM) Networks
Lecture 17: Convolutional Neural Networks (CNNs)
Lecture 18: Applications of CNNs in Earth and Environmental Sciences

Homework Assignments

Homework assignments will be posted here on Wednesdays. See the lecture notes for more information on the homework policy.

Homework 1: here
Homework 2:
Homework 3:
Homework 4:

Notes

How to install Jupyter Notebook for Mac here
How to install Jupyter Notebook for Windows here
Basics of GitHub here

Requirements

We will be using Python 3, working mainly with the following packages:

numpy
scipy
pandas
statistics
matplotlib
scikit-learn
tensorflow (with keras)

Other packages will be used depending on the topics covered in each class. At the top of each Jupyter Notebook, you will find the packages required to execute the code under the heading "Import Libraries".
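
Before opening a notebook, it can help to confirm that the packages listed above are available in your environment. The snippet below is a minimal sketch and not part of the course materials: the names `COURSE_PACKAGES` and `find_missing` are hypothetical, and the list uses import names rather than install names (scikit-learn is imported as `sklearn`, and `statistics` ships with the Python standard library).

```python
import importlib.util

# Import names for the packages listed in this section (assumed mapping:
# scikit-learn is imported as "sklearn"; keras is accessed through tensorflow).
COURSE_PACKAGES = [
    "numpy",
    "scipy",
    "pandas",
    "statistics",
    "matplotlib",
    "sklearn",
    "tensorflow",
]


def find_missing(packages):
    """Return the names in `packages` that Python cannot locate for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


# Report which packages, if any, still need to be installed.
missing = find_missing(COURSE_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

This only checks that each package can be located; it does not import the packages or verify their versions. Individual notebooks may need additional packages, as listed under their "Import Libraries" heading.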

Attributions

The Recurrent Neural Network application example in Lecture 14 is adapted from Chollet & Allaire: Time Series Forecasting with Recurrent Neural Networks (RStudio AI blog)
The Singular Value Decomposition (SVD) example in Lecture 4 is adapted from Brunton and Kutz, Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control, Cambridge University Press (2017)

mombadi/umich-climate405