18.337J/6.338J: Parallel Computing and Scientific Machine Learning (Spring 2023)

Professor Alan Edelman (and Philip the Corgi)

MW 3:00 to 4:30 @ Room 2-190

TA and Office hours: (To be confirmed)

Piazza Link

Canvas will be used only for homework and project (including proposal) submission and for lecture videos.

Classes are recorded and will be uploaded to Canvas. Another great resource is Chris Rackauckas' videos of the Spring 2021 class; see SciMLBook.

Announcement:

There will be a small number of homework assignments, followed by the final project. Everyone needs to present their work and submit a project report.

1-page Final Project proposal due: March 24

Final Project presentations: April 26 to May 15

Final Project reports due: May 15

Lecture Schedule (tentative) (Warning: links currently out of phase starting with lecture 3)

#	Day	Date	Topic	SciML lecture	Materials
1	M	2/6	Intro to Julia. My Two Favorite Notebooks.		[Julia is fast], [AutoDiff], [autodiff video],[The Parallel Dream]
2	W	2/8	Matrix Calculus I		See [IAP 2023 Class on Matrix Calculus]
3	M	2/13	Matrix Calculus II		[video]
4	W	2/15	Automatic differentiation I : Forward mode AD	8	[video 1] [video2]
5	T	2/21	Automatic differentiation II : Reverse mode AD	10	[video]
6	W	2/22	Models of Parallelism	6	[video]
7	M	2/27	Multithreading, Static and Dynamic Scheduling		Slides
8	W	3/1	GPU Parallelism I	7	[video 1],[video2]
9	M	3/6	GPU Parallelism II		[video], [Eig&SVD derivatives notebooks], [2022 IAP Class Matrix Calculus]
10	W	3/8	MPI		Slides, [video, Lauren Milichen],[Performance Metrics] see p317,15.6
11	M	3/13	Differential Equations I	9
12	W	3/15	Differential Equations II	10
13	M	3/20	Neural ODE	11
14	W	3/22		13
			Spring Break
15	M	4/3			GPU Slides Prefix Materials
16	W	4/5	Convolutions and PDEs	14
17	M	4/10	Chris R on ODE adjoints, PRAM Model	11	[video]
18	W	4/12	Linear and Nonlinear System Adjoints	11	[video]
	M	4/17	Patriots' Day
19	W	4/19	Lagrange Multipliers, Spectral Partitioning		Partitioning Slides
20	M	4/24		15	[video],notes on adjoint
21	W	4/26	Project Presentation I
22	M	5/1	Project Presentation II	Materials
23	W	5/3	Project Presentation III	16	[video]
24	M	5/8	Project Presentation IV
25	W	5/10	Project Presentation V
26	M	5/15	Project Presentation VI

Lecture Summaries and Handouts

Lecture 1: Introduction and Syllabus

Lecture and Notes

Homeworks

Final Project

For the second half of the class, students will work on the final project. A one-page final project proposal must be submitted by Friday, March 24, through Canvas.

The last three weeks (tentative) will be student presentations.

Possible Project Topics

One possibility is to review an interesting algorithm not covered in the course and develop a high-performance implementation. Some examples include:

High-performance PDE solvers for specific PDEs like Navier-Stokes
Common high-performance algorithms (e.g., Jacobian-Free Newton-Krylov for PDEs)
Recreation of a parameter sensitivity study in a field like biology, pharmacology, or climate science
Augmented Neural Ordinary Differential Equations
Neural Jump Stochastic Differential Equations
Parallelized stencil calculations
Distributed linear algebra kernels
Parallel implementations of statistical libraries, such as survival statistics or linear models for big data. Here's one example parallel library and a second example.
Parallelization of data analysis methods
Type-generic implementations of sparse linear algebra methods
A fast regex library
Math library primitives (exp, log, etc.)

Another possibility is to work on state-of-the-art performance engineering. This would mean implementing a new auto-parallelization or performance enhancement. For these types of projects, implementing an application for benchmarking is not required; one can instead benchmark the effects on existing code to find cases where the change is beneficial (or leads to performance regressions). Possible examples are:

Create a system for automatic multithreaded parallelism of array operations and see what kinds of packages end up more efficient
Set up BLAS with a PARTR backend and investigate the downstream effects on multithreaded code like an existing PDE solver
Investigate the effects of work-stealing in multithreaded loops
Fast parallelized type-generic FFT. Starter code by Steven Johnson (creator of FFTW) and Yingbo Ma can be found here
Type-generic BLAS. Starter code can be found here
Implementation of parallelized map-reduce methods. For example, a pmapreduce extension to pmap that adds a parallelized reduction, or a fast GPU-based map-reduce.
Investigating auto-compilation of full package codes to GPUs using tools like CUDAnative and/or GPUifyLoops.
Investigating alternative implementations of databases and dataframes. NamedTuple backends of DataFrames, alternative type-stable DataFrames, defaults for CSV reading and other large-table formats like JuliaDB.
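
To make the map-reduce topic above concrete, here is a minimal sketch of the chunked pattern such a project might start from. It is written in Python purely for illustration and is not course starter code. The helper name `chunked_mapreduce` and its `nchunks` parameter are hypothetical. The sketch assumes the reduction operator is associative and the input is non-empty. It uses threads only to show the structure; in CPython this gives no speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def chunked_mapreduce(f, op, xs, nchunks=4):
    """Hypothetical helper: apply f to every element of xs and combine with op.

    op must be associative, because each chunk is reduced independently and
    the partial results are then combined in chunk order.
    """
    xs = list(xs)
    if not xs:
        raise ValueError("xs must be non-empty")
    nchunks = max(1, min(nchunks, len(xs)))
    size = -(-len(xs) // nchunks)  # ceiling division
    chunks = [xs[i:i + size] for i in range(0, len(xs), size)]

    def reduce_chunk(chunk):
        # Local map + reduce: only one value per chunk leaves the worker.
        return reduce(op, map(f, chunk))

    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # pool.map returns results in the same order as the input chunks.
        partials = list(pool.map(reduce_chunk, chunks))

    # Final (serial) reduction over the per-chunk partial results.
    return reduce(op, partials)


if __name__ == "__main__":
    # Sum of squares of 1..100, computed chunk by chunk.
    print(chunked_mapreduce(lambda x: x * x, operator.add, range(1, 101)))
```

The point of fusing the map and the reduction inside each worker is that only one value per chunk is returned, rather than a full mapped array that is reduced afterwards. A project could compare this against a tree-shaped final reduction or a GPU version.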

Additionally, Scientific Machine Learning is a wide-open field with lots of low-hanging fruit. Instead of a review, a suitable research project can be chosen for the final project. Possibilities include:

Acceleration methods for adjoints of differential equations
Improved methods for Physics-Informed Neural Networks
New applications of neural differential equations
Parallelized implicit ODE solvers for large ODE systems
GPU-parallelized ODE/SDE solvers for small systems

kennyweichen/18337