MathematicalEngineeringDeepLearning
Material for The Mathematical Engineering of Deep Learning. See the actual book content on deeplearningmath.org or (when it is out) purchase the book from CRC press.
This repository contains general supporting material for the book.
Below is a detailed list of the source code used for creating figures and tables in the book. We use Julia , Python , or R and the code is sometimes in stand alone files, sometimes in Jupyter notebooks, sometimes as R Markdown , and sometimes in Google Colab . Many of our static illustrations were created using TikZ by Ajay Hemanth and Vishnu Prasath with the source of their illustrations also available so you can adapt it for purposes.
Figure
Topic
Source Code
2.1
Supervised Learning
TikZ
2.2
Unsupervised Learning
TikZ
2.3
Simple regression
R
2.4
Breast Cancer ROC curves
R
2.5
Least Squares
TikZ
2.6
Loss functions
Julia
Table 2.1
Linear MNIST classification
Julia
2.7
Gradient Descent Learning Rate
Python
2.8
Loss Landscape
R
2.9
Generalization and Training
TikZ or Julia
2.10
Polynomial fit
R
2.11
K-fold cross validation
TikZ
2.12
K-means clustering
R
2.13
K-means image segmentation
R
2.14
Breast Cancer PCA
R
2.15
SVD Compression
Julia
Figure
Topic
Source Code
3.1 and 3.2
Logistic regression model curves and boundary
R
3.3
Components of an artificial neuron
TikZ
3.4
Loss landscape of MSE vs. CE on logistic regression
Python
3.5
Evolution of gradient descent learning in logistic regression
R(a,b) First file , R(a,b) Second file
3.6
Shallow multi-output neural network with softmax
TikZ
3.7
Multinomial regression for classification
R
Table 3.1
Different approaches for creating an MNIST digit classifier.
Julia
3.8
Feature engineering in simple logistic regression
R
3.9
Non-linear classification decision boundaries with feature engineering in logistic regression
R
3.10
Non-linear classification decision boundaries with feature engineering in multinomial regression
R same as 3.7
3.11
Single hidden layer autoencoder
TikZ
3.12
Autoencoder projections of MNIST including using PCA
R TikZ
3.13
Manifolds and autoencoders
R TikZ
3.14
MNIST using autoencoders
R same as 3.12
3.15
Denoising autoencoder
TikZ
3.16
Interpolations with autoencoders
R same as 3.12, Julia
Figure
Topic
Source Code
4.1
Convexity and local/global extrema
Python
4.2
Gradient descent with fixed or time dependent learning rate
Python
4.3
Stochastic gradient descent
Python
4.4
Early stopping in deep learning
Julia
4.5
Non-convex loss landscapes
Python
4.6
Momentum enhancing gradient descent
Python
4.7
The computational graph for automatic differentiation
TikZ
4.8
Line search concepts
Python
4.9
The zig-zagging property of line search
Python
4.10
Newton's method in one dimension
Python
Figure
Topic
Source Code
5.1
Fully Connected Feedforward Neural Networks
TikZ(a) , TikZ(b)
5.2
Arbitrary function approximation with neural nets
TikZ(a) , Julia(b,c)
5.3
Binary classification with increasing depth
R
5.4
A continuous multiplication gate with 4 hidden units
TikZ
5.5
A deep model with 10 layers
TikZ
5.6
Several common scalar activation functions
Julia(a,b)
5.7
Flow of information in general back propagation
TikZ
5.8
Simple neural network hypothetical example
TikZ
5.9
Flow of information in standard neural network back propagation
TikZ
5.10
Computational graph for batch normalization
TikZ
5.11
The effect of dropout
TikZ
Figure
Topic
Source Code
6.2
VGG19 architecture
TikZ
6.3
Convolutions
TikZ(a) , TikZ(b)
6.6
Convolution padding
TikZ
6.7
Convolution stride
TikZ
6.8
Convolution dilation
TikZ
6.9
Convolution input channels
TikZ
6.10
Convolution output channels
TikZ
6.11
Pooling
TikZ(a) , TikZ(b)
6.13
Inception module
TikZ
6.14
Resnets
TikZ
6.17
Siamese network
TikZ
Figure
Topic
Source Code
7.1
Sequence RNN tasks
TikZ(a) , TikZ(b) , TikZ(c) , TikZ(d)
7.2
Sequence RNN input output paradigms
TikZ(a) , TikZ(b) , TikZ(c) , TikZ(d)
7.3
RNN recursive graph and unfolded graph
TikZ
7.4
RNN unit
TikZ
7.5
RNN language prediction training
TikZ
7.6
Backpropagation through time
TikZ
7.7
Alternative RNN configurations
TikZ(a) , TikZ(b)
7.8
LSTM and GRU
TikZ(a) , TikZ(b)
7.9
Encoder decoder architectures
TikZ(a) , TikZ(b)
7.10
Encoder decoder with attention
TikZ
7.11
Attention weights
TikZ
7.12
Flow of information with self attention
TikZ
7.13
Multi-head self attention
TikZ
7.14
Positional embedding
Julia(a,b)
7.15
Transformer blocks
TikZ(a) , TikZ(b)
7.16
Transformer encoder decoder architecture
TikZ
7.17
Transfomer auto-regressive application
TikZ