CLIMATE 405: Machine Learning for Earth and Environmental Sciences (University of Michigan, Fall 2024)
Lecture materials for University of Michigan CLIMATE405: Machine Learning for Earth and Environmental Sciences (Fall 2024)
- Dr. Mohammed Ombadi (Email: ombadi@umich.edu)
- For any questions regarding this material, please contact Dr. Ombadi. Do not distribute any of the material in this repository before obtaining permission.
This course introduces students (primarily from an Earth and Environmental Sciences background) to data-driven methods, ranging in complexity from autoregression to machine learning models. The course covers the basic theory behind machine learning and provides hands-on experience in building machine learning models. Students will learn to apply these models for both prediction and hypothesis formulation. The methods will be taught through example applications in environmental sciences, with a specific focus on climate and hydrologic applications. Examples include short-term forecasts of temperature and precipitation, streamflow forecasting in selected hydrologic basins, understanding the relative contributions of temperature and precipitation to snowmelt trends, regional clustering of precipitation patterns and trends, and the role of climatic teleconnections in regulating regional precipitation patterns.
This repository contains the following content:
The slides for each lecture are provided in a separate folder, both as a Jupyter Notebook and as HTML. The HTML version can be downloaded or viewed directly in your web browser without installing any software or downloading additional files. To run the live Jupyter notebooks, download the folder to your local machine and execute the code (this requires installing Jupyter Notebook on your machine; see the Notes section for more information).
- Lecture 0: Introduction to Class
- Lecture 1: Stochastic and Deterministic systems; Random Variables; Review of Basic Statistics
- Lecture 2: Probability Distributions; Model Evaluation Metrics
- Lecture 3: Statistical Dependence, Hypothesis Testing and Statistical Significance
- Lecture 4: Feature Selection, Dimensionality Reduction
- Lecture 5: Supervised vs Unsupervised Learning, K-means Clustering
- Lecture 6: Hierarchical and Density-based Clustering Algorithms; Examples of Clustering in EES
- Lecture 7: Simple Linear Regression; Multiple Linear Regression; Goodness of Fit
- Lecture 8: Choice of Loss Function; Regularization (Ridge and Lasso Regression)
- Lecture 9: Gradient Descent; The Anatomy of a Neural Network
- Lecture 10: Multilayer Perceptron: Applications in Hydrology; Activation Functions
- Lecture 11: Hands-on Session
- Lecture 12: Decision Trees
- Lecture 13: Interpretability of Decision Trees, Feature Importance, Shapley Values
- Lecture 14: Recurrent Neural Networks
- Guest Lecture: LSTM Applications in Hydrology
- Lecture 15: RNNs continued; Techniques used in training ML Models
- Lecture 16: Long Short-term Memory (LSTM) Networks
- Lecture 17: Convolutional Neural Networks (CNNs)
- Lecture 18: Applications of CNNs in Earth and Environmental Sciences
Homework assignments will be posted here on Wednesdays. See lecture notes for more information on homework policy.
- Homework 1: here
- Homework 2:
- Homework 3:
- Homework 4:
- How to install Jupyter Notebook for Mac here
- How to install Jupyter Notebook for Windows here
- Basics of GitHub here
We will be using Python 3, mainly working with the following packages:
- numpy
- scipy
- pandas
- statistics
- matplotlib
- scikit-learn
- tensorflow (with keras)
Other packages will be used depending on the topics covered in each class. At the top of each Jupyter Notebook, you will find the packages required to execute the code under the heading "Import Libraries".
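Before opening a notebook, it can help to confirm that the packages listed above are installed. The following is a minimal sketch using only the standard library; `COURSE_PACKAGES` and `missing_packages` are illustrative names introduced here and are not part of the course material. It assumes the usual import names, in particular that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Map each package listed above to the name used in an import statement.
# Assumption: scikit-learn is imported as "sklearn"; "statistics" ships
# with the Python standard library, so it should always be found.
COURSE_PACKAGES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "statistics": "statistics",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "tensorflow": "tensorflow",
}


def missing_packages(packages):
    """Return the display names whose import name cannot be located."""
    return [name for name, module in packages.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages(COURSE_PACKAGES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed course packages are available.")
```

The check only locates each package without importing it, so it runs quickly even when a large library such as tensorflow is present. Individual notebooks may need additional packages beyond this list.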
- The Recurrent Neural Network application example in Lecture 14 is adapted from Chollet & Allaire: Time Series Forecasting with Recurrent Neural Networks (RStudio AI blog)
- The Singular Value Decomposition (SVD) example in Lecture 4 is adapted from Brunton and Kutz, Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control, Cambridge University Press (2017)