18.337J/6.338J: Parallel Computing and Scientific Machine Learning (Spring 2023)
Professor Alan Edelman (and Philip the Corgi)
MW 3:00 to 4:30 @ Room 2-190
TA and Office hours: (To be confirmed)
Piazza Link
Canvas will only be used for homework and project (+proposal) submission + lecture videos
SciMLBook.
Classes are recorded and will be uploaded to Canvas. Another great resource is Chris Rackauckas' videos of the Spring 2021 class.
Announcement:
There will be a small number of homeworks, followed by the final project. Everyone needs to present their work and submit a project report.
1-page Final Project proposal due: March 24
Final Project presentations: April 26 to May 15
Final Project reports due: May 15
Lecture Schedule (tentative) (Warning: links currently out of phase starting with lecture 3)
# | Day | Date | Topic | SciML lecture | Materials |
---|---|---|---|---|---|
1 | M | 2/6 | Intro to Julia. My Two Favorite Notebooks. | [Julia is fast], [AutoDiff], [autodiff video],[The Parallel Dream] | |
2 | W | 2/8 | Matrix Calculus I | See [IAP 2023 Class on Matrix Calculus] | |
3 | M | 2/13 | Matrix Calculus II | [video] | |
4 | W | 2/15 | Automatic differentiation I : Forward mode AD | 8 | [video 1] [video2] |
5 | T | 2/21 | Automatic differentiation II : Reverse mode AD | 10 | [video] |
6 | W | 2/22 | Models of Parallelism | 6 | [video] |
7 | M | 2/27 | Multithreading, Static and Dynamic Scheduling | Slides | |
8 | W | 3/1 | GPU Parallelism I | 7 | [video 1],[video2] |
9 | M | 3/6 | GPU Parallelism II | [video], [Eig&SVD derivatives notebooks], [2022 IAP Class Matrix Calculus] | |
10 | W | 3/8 | MPI | Slides, [video, Lauren Milichen],[Performance Metrics] see p317,15.6 | |
11 | M | 3/13 | Differential Equations I | 9 | |
12 | W | 3/15 | Differential Equations II | 10 | |
13 | M | 3/20 | Neural ODE | 11 | |
14 | W | 3/22 | | 13 | |
Spring Break | |||||
15 | M | 4/3 | GPU Slides Prefix Materials | ||
16 | W | 4/5 | Convolutions and PDEs | 14 | |
17 | M | 4/10 | Chris R on ode adjoints, PRAM Model | 11 | [video] |
18 | W | 4/12 | Linear and Nonlinear System Adjoints | 11 | [video] |
| | M | 4/17 | Patriots' Day | | |
19 | W | 4/19 | Lagrange Multipliers, Spectral Partitioning | Partitioning Slides | |
20 | M | 4/24 | | 15 | [video], notes on adjoint |
21 | W | 4/26 | Project Presentation I | ||
22 | M | 5/1 | Project Presentation II | Materials | |
23 | W | 5/3 | Project Presentation III | 16 | [video] |
24 | M | 5/8 | Project Presentation IV | ||
25 | W | 5/10 | Project Presentation V | ||
26 | M | 5/15 | Project Presentation VI | | |
Lecture Summaries and Handouts
Lecture 1: Introduction and Syllabus
Lecture and Notes
Homeworks
Final Project
For the second half of the class, students will work on the final project. A one-page final project proposal must be submitted by Friday, March 24, through Canvas.
The last three weeks (tentative) will be student presentations.
Possible Project Topics
One possibility is to review an interesting algorithm not covered in the course and develop a high performance implementation. Some examples include:
- High performance PDE solvers for specific PDEs like Navier-Stokes
- Common high performance algorithms (Ex: Jacobian-Free Newton Krylov for PDEs)
- Recreation of a parameter sensitivity study in a field like biology, pharmacology, or climate science
- Augmented Neural Ordinary Differential Equations
- Neural Jump Stochastic Differential Equations
- Parallelized stencil calculations
- Distributed linear algebra kernels
- Parallel implementations of statistical libraries, such as survival statistics or linear models for big data. Here's one example parallel library and a second example.
- Parallelization of data analysis methods
- Type-generic implementations of sparse linear algebra methods
- A fast regex library
- Math library primitives (exp, log, etc.)
Another possibility is to work on state-of-the-art performance engineering. This would be implementing a new auto-parallelization or performance enhancement. For these types of projects, implementing an application for benchmarking is not required, and one can instead benchmark the effects on already existing code to find cases where it is beneficial (or leads to performance regressions). Possible examples are:
- Create a system for automatic multithreaded parallelism of array operations and see what kinds of packages end up more efficient
- Set up BLAS with a PARTR backend and investigate the downstream effects on multithreaded code like an existing PDE solver
- Investigate the effects of work-stealing in multithreaded loops
- Fast parallelized type-generic FFT. Starter code by Steven Johnson (creator of FFTW) and Yingbo Ma can be found here
- Type-generic BLAS. Starter code can be found here
- Implementation of parallelized map-reduce methods. For example, a pmapreduce extension to pmap that adds a parallelized reduction, or a fast GPU-based map-reduce.
- Investigating auto-compilation of full package codes to GPUs using tools like CUDAnative and/or GPUifyLoops.
- Investigating alternative implementations of databases and dataframes. NamedTuple backends of DataFrames, alternative type-stable DataFrames, defaults for CSV reading and other large-table formats like JuliaDB.
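To make the parallelized map-reduce idea from the list above concrete, here is a minimal sketch. It is written in Python only for illustration and is not the Julia `pmapreduce` the project topic has in mind. The helper names `chunked` and `pmapreduce` and the `n_workers` parameter are hypothetical. The sketch assumes the reduction operator is associative, since each chunk is reduced independently before the partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def chunked(items, n_chunks):
    """Split items into at most n_chunks contiguous, non-empty slices."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def pmapreduce(f, op, items, n_workers=4):
    """Hypothetical helper: apply f to every item, then combine with op.

    op must be associative, because each chunk is reduced on its own
    before the per-chunk partial results are reduced together.
    """
    items = list(items)
    if not items:
        raise ValueError("pmapreduce needs at least one item")

    def reduce_chunk(chunk):
        # Fused map + reduce over one chunk.
        return reduce(op, map(f, chunk))

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves chunk order, so op need not be commutative.
        partials = list(pool.map(reduce_chunk, chunked(items, n_workers)))
    return reduce(op, partials)


# Sum of squares of 1..100, computed chunk by chunk.
result = pmapreduce(lambda x: x * x, operator.add, range(1, 101))
print(result)
```

On standard CPython builds, threads do not speed up CPU-bound pure-Python work, so this sketch shows the structure (chunk, reduce locally, combine) rather than a performance win. A project would compare this pattern against a sequential baseline in a setting where the workers truly run in parallel.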
Additionally, Scientific Machine Learning is a wide-open field with lots of low-hanging fruit. Instead of a review, a suitable research project can be chosen for the final project. Possibilities include:
- Acceleration methods for adjoints of differential equations
- Improved methods for Physics-Informed Neural Networks
- New applications of neural differential equations
- Parallelized implicit ODE solvers for large ODE systems
- GPU-parallelized ODE/SDE solvers for small systems