Copyright 2020, Yung-Yu Chen yyc@solvcon.net. All rights reserved.
- 2020 autumn at NCTU: notebook/20au_nctu/.
- 2020 spring at NCTU: notebook/20sp_nctu/.
- 2019 autumn at NCTU: notebook/19au_nctu/.
Objectives: This course discusses the art of building numerical software, i.e., computer programs that apply numerical methods to solve mathematical or physical problems. We will use a combination of Python and C++ and related tools (e.g., bash, git, and make) to learn modern development processes. By completing this course, students will acquire the fundamental skills for developing modern numerical software.
Prerequisites: This is a graduate or senior level course open to students who have taken computer architecture, engineering mathematics, or equivalents. Working knowledge of Linux or another Unix-like system is required. Prior knowledge of numerical methods is recommended. The instructor uses English in the lectures and discussions.
- You are expected to learn programming languages yourself. Python is never a problem, but you could find it challenging to self-teach C++. Students are encouraged to form study groups to practice C++, and to discuss it with the instructor and/or the teaching assistant.
- Grading: homework 30%, mid-term exam 30%, term project 40%.
- There are 14 lectures on numerical software development using Python and C++.
- There will be 6 homework assignments for you to exercise. Programming in Python and/or C++ is required.
- A mid-term examination will be conducted to assess students' understanding of the analytical materials.
- The term project will be used to assess students' overall coding skills. Presentation is required. Students who do not present receive 0 points for this part. Check the term project page before you start.
- This is a practical course. No textbook is available for this specific interdisciplinary subject.
- To study the subject, students are required to study online documents and source code, and to write programs for practice.
- In-class instruction and course notes are provided for guidance.
- References:
- Computer Systems: A Programmer's Perspective, Randal E. Bryant and David R. O'Hallaron: https://csapp.cs.cmu.edu/
- Python documentation: https://docs.python.org/3/
- Cppreference: https://en.cppreference.com/
- Effective Modern C++, Scott Meyers, O'Reilly, 2014
- Source code: cpython, numpy, xtensor, and pybind11
- W1 (9/14) Lecture 1: Introduction
- W2 (9/21) Lecture 2: Fundamental engineering practices (homework #1)
- W3 (9/28) Lecture 3: Python and numpy (term project proposal start)
- W4 (10/5) Lecture 4: C++ and computer architecture (homework #2)
- W5 (10/12) Lecture 5: Matrix operations
- W6 (10/19) Lecture 6: Cache optimization (homework #3) (term project proposal due)
- W7 (10/26) Lecture 7: SIMD
- W8 (11/2) Mid-term examination
- W9 (11/9) Lecture 8: Memory management (homework #4)
- W10 (11/16) Lecture 9: Ownership and smart pointers
- W11 (11/23) Lecture 10: Modern C++ (homework #5)
- W12 (11/30) Lecture 11: C++ and C for Python
- W13 (12/7) Lecture 12: Array code in C++ (homework #6)
- W14 (12/14) Lecture 13: Array-oriented design
- W15 (12/21) Lecture 14: Advanced Python
- W16 (12/28) Term project presentation
- W17 (1/4) No meeting (optional lecture is not planned)
- W18 (1/11) No meeting (optional lecture is not planned)
Lecture 1 Introduction
- Part 1: What is numerical software
- Why develop numerical software
- Hybrid architecture
- Numerical software = C++ + Python
- Part 2: What to learn in this course
- Term project
- How to write a proposal
- Term project grading guideline
- Online discussion
- Part 3: Runtime and course materials
- Runtime environment: Linux and AWS
- Jupyter notebook
Lecture 2 Fundamental engineering practices
A large share of the effort goes into the infrastructure for coding. The key to the engineering system is automation.
- Automation
- Bash scripting
- Makefile
- CMake (cross-platform, multi-language automation)
- Version control and regression
- Git version control system
- Automatic testing: author and run with google-test and py.test
- Wrap to Python and test there: pybind11
- Continuous integration to avoid regression
- Work that cannot be automated
- Code review (use github for demonstration)
- Timing to debug for performance
- Wall time and CPU cycles
- System time and user time
- Python timing tools
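The difference between wall time and CPU time can be sketched with the Python standard library. This is a minimal illustration, not the course's own material: `busy_sum` and `measure` are hypothetical helpers, `time.perf_counter` is used as the wall-clock timer, and `time.process_time` reports the user plus system CPU time of the current process.

```python
import time


def busy_sum(n):
    """Burn CPU cycles with a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure(func, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call of func."""
    wall0 = time.perf_counter()   # wall-clock timer
    cpu0 = time.process_time()    # user + system CPU time of this process
    result = func(*args)
    cpu1 = time.process_time()
    wall1 = time.perf_counter()
    return result, wall1 - wall0, cpu1 - cpu0


# A compute-bound call: wall time and CPU time are expected to be close.
result, wall, cpu = measure(busy_sum, 200_000)
# A sleeping call: wall time passes while almost no CPU time is consumed.
_, sleep_wall, sleep_cpu = measure(time.sleep, 0.2)

print(f"busy:  wall={wall:.4f}s cpu={cpu:.4f}s")
print(f"sleep: wall={sleep_wall:.4f}s cpu={sleep_cpu:.4f}s")
```

For repeated micro-benchmarks, the standard `timeit` module is usually a better fit than hand-written timers like the one above.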
Lecture 3 Python and numpy
Python is a popular choice for the scripting engine that makes the numerical software work as a platform.
The platform works like a library providing data structures and helpers to solve problems. The users will use Python to build applications.
- Organize Python modules
- Scripts
- Modules
- Package
- Use numpy for array-oriented code
- Data type
- Construction
- Multi-dimensional arrays
- Selection
- Broadcasting
- Use tools for numerical analysis
- Matplotlib
- Linear algebra using numpy and scipy
- Package management with conda and pip
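The numpy topics listed above (data type, construction, multi-dimensional arrays, selection, and broadcasting) can be sketched in a few lines. The array contents and variable names are arbitrary choices made for this illustration.

```python
import numpy as np

# Construction with an explicit data type, then reshaping to 2D.
arr = np.arange(12, dtype='float64').reshape((3, 4))

# Selection by slicing returns a view that shares the buffer with arr.
col = arr[:, 1]

# Selection by a boolean mask returns a new array (a copy).
big = arr[arr > 8]

# Broadcasting: the shape (4,) column means are stretched along the first
# axis to match the shape (3, 4) array.
centered = arr - arr.mean(axis=0)

print(arr.dtype, arr.shape)
print(col)
print(big)
print(centered)
```

Because `col` is a view, writing to it changes `arr` as well. Use `col.copy()` when an independent buffer is needed.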
Lecture 4 C++ and computer architecture
The low-level code of numerical software must be high-performance. Industry chose C++ because it can take advantage of everything that a hardware architecture offers while using any level of abstraction.
- Fundamental data types
- Command-line interface for compiler tools
- Compiler, linker
- Multiple source files, separation of declaration and definition, external libraries
- Build multiple binaries and shared objects (dynamically linked libraries)
- Integer, signedness, pointer, array indexing
- Floating-point, rounding, exception handling
- Numeric limit
- Object-oriented programming
- Class, encapsulation, accessor, reference type
- Constructor and destructor
- Polymorphism and RTTI
- CRTP
- Standard template library (STL)
- std::vector, its internals, and why holding the buffer address is dangerous
- std::array, std::list
- std::map, std::set, std::unordered_map, std::unordered_set
Lecture 5 Matrix operations
Matrices are everywhere in numerical analysis. Arrays are the fundamental data structure and are used for matrix-vector, matrix-matrix, and other linear algebraic operations.
- POD arrays and majoring
- Vector: 1D array
- Matrix: 2D array
- Row- and column-majoring
- A simple class for matrix
- Matrix-vector and matrix-matrix operations
- Matrix-vector multiplication
- Matrix-matrix multiplication
- Linear algebra
- Linear system solution
- Eigenvalue and singular value problems
- Least squares problems
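Row-majoring and matrix-vector multiplication can be sketched with a small class that keeps a 2D matrix in a flat 1D buffer. The lecture outline points to C++ for this topic; the sketch below uses Python only to show the index arithmetic, and `Matrix` and `matvec` are hypothetical names introduced for the illustration.

```python
class Matrix:
    """Minimal matrix that stores its elements in a flat, row-major list."""

    def __init__(self, nrow, ncol):
        self.nrow = nrow
        self.ncol = ncol
        self.buffer = [0.0] * (nrow * ncol)

    def _offset(self, i, j):
        # Row-major: the elements of one row are contiguous in the buffer.
        # A column-major layout would use j * self.nrow + i instead.
        return i * self.ncol + j

    def __getitem__(self, ij):
        i, j = ij
        return self.buffer[self._offset(i, j)]

    def __setitem__(self, ij, value):
        i, j = ij
        self.buffer[self._offset(i, j)] = value


def matvec(mat, vec):
    """Multiply a Matrix by a vector given as a list."""
    if len(vec) != mat.ncol:
        raise ValueError("vector length must equal the number of columns")
    return [
        sum(mat[i, j] * vec[j] for j in range(mat.ncol))
        for i in range(mat.nrow)
    ]


# Fill a 2x3 matrix with 1..6 in row order.
mat = Matrix(2, 3)
for i in range(2):
    for j in range(3):
        mat[i, j] = float(i * 3 + j + 1)

product = matvec(mat, [1.0, 0.0, -1.0])
print(mat.buffer)
print(product)
```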
Lecture 6 Cache optimization
How cache works, its importance to performance, and optimization with cache.
- Memory hierarchy
- How cache works
- Cache block (line) size determines speed
- Locality
- Matrix population in C++
- Array majoring in numpy
- Tiling
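Array majoring in numpy is visible through the `strides` attribute, which gives the number of bytes to step along each axis. The following sketch assumes a 100 by 100 array of 8-byte floats; the array size and names are arbitrary.

```python
import numpy as np

# Row-major (C order): stepping along the last axis moves 8 bytes.
c_arr = np.zeros((100, 100), dtype='float64', order='C')
# Column-major (Fortran order): stepping along the first axis moves 8 bytes.
f_arr = np.asfortranarray(c_arr)

print(c_arr.strides)  # bytes per step along (axis 0, axis 1)
print(f_arr.strides)

# A row of the C-order array is contiguous in memory; a column is not.
row_contiguous = c_arr[0].flags['C_CONTIGUOUS']
col_contiguous = c_arr[:, 0].flags['C_CONTIGUOUS']
print(row_contiguous, col_contiguous)
```

Traversing along the contiguous axis touches neighboring addresses, so each cache line that is loaded gets fully used. This is the locality that the list above refers to.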
Lecture 7 SIMD
Parallelism and x86 assembly for SIMD.
- Types of parallelism
- Shared-memory parallelism
- Distributed-memory parallelism
- Vector processing
- SIMD instructions
- CPU capabilities
- x86 intrinsic functions
- Symbol table
- Inspect assembly: 1, 3, 5 multiplications
Lecture 8 Memory management
Numerical software tends to use as much memory as a workstation has. The memory has two major uses: (i) to hold the required huge amount of data, and (ii) to gain speed.
- Linux memory model: stack, heap, and memory map
- C memory management API
- C++ memory management API
- STL allocator API
- Object counter
Lecture 9 Ownership and smart pointers
Ownership and memory management using C++ smart pointers.
- Pointers and ownership
- Raw pointer
- Reference
- Ownership
- Smart pointers
- unique_ptr
- shared_ptr
- Revisit shared pointer
- Make Data exclusively managed by shared_ptr
- Get shared_ptr from this
- Cyclic reference and weak_ptr
Lecture 10 Modern C++
Copy elision and move semantics. Variadic template and perfect forwarding. Closure.
- Copy elision / return value optimization
- Make a tracker class for copy construction
- Show the copy elision in action
- Inspect the assembly
- Move semantics and copy elision
- Forced move is a bad idea
- Data concatenation
- Style 1: return vector
- Style 2: use output vector
- Style 3: use a class for both return and output argument
- Variadic template
- Perfect forwarding
- Lambda expression
- Keep a lambda in a local variable
- Difference between auto and std::function
- Closure
- Comments on functional style
Lecture 11 C++ and C for Python
Use C++ and C to control the CPython interpreter.
- Pybind11 build system
- Setuptools
- CMake with a sub-directory
- CMake with an installed pybind11
- Additional wrapping layer for customization
- Wrapping API
- Functions and property
- Named and keyword arguments
- What happens in Python stays in Python (or pybind11)
- See how Python plays
- Linear wave
- The inviscid Burgers equation
- Manipulate Python objects in C++
- Python containers
- tuple
- list
- dict
- Use cpython API with pybind11
- PyObject reference counting
- Built-in types
- Cached value
- Attribute access
- Function call
- Tuple
- Dictionary
- List
- Useful operations
- Import
- Exception
- Python memory management
- PyMem interface
- Small memory optimization
- Tracemalloc
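The last item in the list, tracemalloc, can be tried directly from Python. The sketch below is a minimal illustration using the standard `tracemalloc` module; the allocation of 1000 byte strings of 1000 bytes each is an arbitrary workload chosen for the example.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Allocate roughly 1 MB in many small objects.
data = [bytes(1000) for _ in range(1000)]

after = tracemalloc.take_snapshot()
# Differences between the two snapshots, grouped by source line.
stats = after.compare_to(before, 'lineno')
# Currently traced size and the peak size, both in bytes.
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current={current} bytes, peak={peak} bytes")
for stat in stats[:3]:
    print(stat)
```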
Lecture 12 Array code in C++
Dissect array-based code and element-based code, and discuss when to use each.
- Python is slow but easy to write
- Speed up by using numpy (still in Python)
- Xtensor: write iterative code at C++ speed using arrays
- Effect of house-keeping code
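The contrast between element-based and array-based code can be sketched with a three-point averaging stencil. The stencil, the function names, and the input data are assumptions made for this illustration; they are not taken from the course notes.

```python
import numpy as np


def smooth_loop(values):
    """Element-based: average each interior point with its two neighbors."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return out


def smooth_array(values):
    """Array-based: the same stencil written with shifted slices."""
    out = values.copy()
    out[1:-1] = (values[:-2] + values[1:-1] + values[2:]) / 3.0
    return out


data = np.linspace(0.0, 1.0, 11) ** 2
loop_result = smooth_loop(data)
array_result = smooth_array(data)
print(np.allclose(loop_result, array_result))
```

The array version moves the loop out of the Python interpreter and into numpy's compiled code, which is typically where the speed-up comes from. The element-based version is easier to extend when each point needs different logic.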
Lecture 13 Array-oriented design
Software architecture that takes advantage of array-based code.
- Design interface with arrays
- Conversion between dynamic and static semantics
- Insert profiling code
Lecture 14 Advanced Python
Advanced topics in Python programming.
- Iterator
- List comprehension
- Generator
- Generator expression
- Stack frame
- frame object
- Customizing module import with sys.meta_path
- Descriptor
- Keep data on the instance
- Metaclass
- Type introspection and abstract base class (abc)
- Method resolution order (mro)
- Abstract base class (abc)
- Abstract method
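Several of the items above (generator, list comprehension, generator expression, and a descriptor that keeps data on the instance) fit in one short sketch. `fib`, `Positive`, and `Cell` are hypothetical names made up for the illustration.

```python
def fib(limit):
    """Generator: lazily yield Fibonacci numbers smaller than limit."""
    a, b = 0, 1
    while a < limit:
        yield a
        a, b = b, a + b


class Positive:
    """Data descriptor that validates a value and keeps it on the instance."""

    def __set_name__(self, owner, name):
        # Store the data under a private attribute name on each instance.
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not on an instance
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(obj, self.name, value)


class Cell:
    width = Positive()

    def __init__(self, width):
        self.width = width  # goes through Positive.__set__


squares = [n * n for n in fib(20)]   # list comprehension
total = sum(n for n in fib(20))      # generator expression
cell = Cell(2)
print(squares, total, cell.width, cell.__dict__)
```

The descriptor object itself is shared by all instances of `Cell`, so per-instance data has to live in the instance's `__dict__`, as it does here.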