- Develop fluency in Python for scientific computing
- Explain how common statistical algorithms work
- Construct models using probabilistic programming
- Implement, test, optimize, and package a statistical algorithm
- Homework 40%
- Midterm 1 15%
- Midterm 2 15%
- Project 30%
- A 94 - 100
- B 85 - 93
- C 70 - 84
- D Below 70
- Introduction to Jupyter
- Using Markdown
- Magic functions
- REPL
- Data types
- Operators
- Collections
- Functions and methods
- Control flow
- Packages and namespace
- Coding style
- Understanding error messages
- Getting help
- Saving and exporting Jupyter notebooks
- The `string` package
- String methods
- Regular expressions
- Loading and saving text files
- Context managers
- Dealing with encoding errors
- Issues with floating point numbers
- The `math` package
- Constructing `numpy` arrays
- Indexing
- Splitting and merging arrays
- Universal functions - transforms and reductions
- Broadcasting rules
- Sparse matrices with `scipy.sparse`
- Series and DataFrames
- Creating, loading and saving DataFrames
- Basic information
- Indexing
- Method chaining
- Selecting rows and columns
- Transformations
- Aggregate functions
- Split-apply-combine
- Window functions
- Hierarchical indexing
- Piping with `dfply`
- Graphics from the ground up with `matplotlib`
- Statistical visualizations with `seaborn`
- Grammar of graphics with `altair`
- Building dashboards with `dash`
- Writing a custom function
- Pure functions
- Anonymous functions
- Lazy evaluation
- Higher-order functions
- Decorators
- Partial application
- Using `operator`
- Using `functools`
- Using `itertools`
- Pipelines with `toolz`
- Sequence and mapping containers
- Using `collections`
- Sorting
- Priority queues
- Working with recursive algorithms
- Tabling and dynamic programming
- Time and space complexity
- Measuring time
- Measuring space
- Solving $Ax = b$
- Gaussian elimination and LR decomposition
- Symmetric matrices and Cholesky decomposition
- Geometry of the normal equations
- Gradient descent to solve linear equations
- Using `scipy.linalg`
- Change of basis
- Spectral decomposition
- Geometry of spectral decomposition
- The four fundamental subspaces of linear algebra
- The SVD
- Geometry of the SVD
- SVD and low rank approximation
- Using `scipy.linalg`
- Root finding
- Univariate optimization
- Geometry and calculus of optimization
- Gradient descent
- Batch, mini-batch and stochastic variants
- Improving gradient descent
- Root finding and univariate optimization with `scipy.optimize`
- Nelder-Mead (Zeroth order method)
- Line search methods
- Trust region methods
- IRLS
- Lagrange multipliers, KKT and constrained optimization
- Multivariate optimization with `scipy.optimize`
- Matrix factorization - PCA, SVD, and NMF
- Optimization methods - MDS and t-SNE
- Using `sklearn.decomposition` and `sklearn.manifold`
- Polynomial
- Spline
- Gaussian process
- Using `scipy.interpolate`
- Partitioning (k-means)
- Hierarchical (agglomerative hierarchical clustering)
- Density based (DBSCAN, mean-shift)
- Model based (GMM)
- Self-organizing maps
- Cluster initialization
- Cluster evaluation
- Cluster alignment (Munkres)
- Using `sklearn.cluster`
- Working with probability distributions
- Using `random`
- Using `np.random`
- Using `scipy.stats`
- Simulations
- Sampling from data
- Bootstrap
- Permutation resampling
- Sampling from distributions
- Rejection sampling
- Importance sampling
- Monte Carlo integration
- Density estimation
- Bayes theorem and integration
- Numerical integration (quadrature)
- MCMC concepts
- Markov chains
- Metropolis-Hastings random walk
- Gibbs sampler
- Hamiltonian systems
- Integration of Hamiltonian system dynamics
- Energy and probability distributions
- HMC
- NUTS
- Domain-specific languages
- Multi-level Bayesian models
- Using `daft` to draw plate diagrams
- Using `pymc`
- Using `pystan`
- TensorFlow basics
- Distributions and transformations
- Building probabilistic models with `Edward2`
- Why test?
- Test-driven development
- Using `doctest` as documentation
- Using `pytest` to run unit tests
- Using `hypothesis` to auto-generate test cases
- Functional and integration testing
- Always add a test when an error is found
- Python modules
- Organization of a module
- Writing the setup script
- The Python Package Index
- Package managers
- Containers
- Data structures and algorithms
- Vectorization
- JIT compilation with `numba`
- AOT compilation with `cython`
- Interpreters and compilers
- Review of C++
- Wrapping C++ functions with `pybind11`
- Parallel, concurrent, asynchronous, distributed
- Threads and processes
- Shared memory programming pitfalls: deadlock and race conditions
- Embarrassingly parallel programs with `concurrent.futures` and `multiprocessing`
- Map-reduce
- Master-worker
- Using `ipyparallel` for interactive parallelization