This code accompanies my PhD thesis on Polynomial Multiplication for Post-Quantum Cryptography.

If you find mistakes, parts that could use some more comments, examples that could be added, or anything else that you think should be improved, please drop me an e-mail at <matthias@kannwischer.eu>. I'm also planning to extend this collection to more algorithms that others have used to implement PQC. If you have input about what should be covered, I'd be grateful for any input.

What this code tries to achieve

Demonstrate the core ideas of polynomial multiplication implementations over the last years in a readable and easy to understand fashion.

What this code does not try to achieve

Setting new speed records at anything. I tried to keep the code general and whenever possible independent of the used parameters. In particular, I tried to keep orthogonal optimization strategies separate to make it easier to grasp them. This is of course not yielding an optimal implementation.
Provide any security guarantees at all. A lot of the code is not implemented in constant time (especially those parts that are usually pre-computed anyway). Hence, this code is not production-ready and before using it one needs to eliminate all the timing leaks. Additionally, I did not extensively verify correctness for a wide range of parameters. There are likely parameters that seem to be supported but cause some overflows in some places which in the best case renders them completely non-functional and in the worst-case works most of the time but fails for extreme values. Careful range analysis is required for actual parameters.

Structure

Thesis Chapter	Python Example	C Example
2.2.1 Schoolbook	01schoolbook.py	01schoolbook.c
2.2.2 Karatsuba	02karatsuba.py	02karatsuba.c
2.2.3 Toom--Cook	03toom.py	03toom.c
2.2.4 NTT	04ntt.py	04ntt.c
2.2.5 Radix-2 FFT	05fft.py	05fft.c
2.2.6 Radix-3 FFT	06radix3fft.py	06radix3fft.c
2.2.7 Incomplete NTT	07incomplete.py	07incomplete.c
2.2.8 Good's trick	08goods.py	08goods.c

Note that the NTT implementations do modular reductions using % q. A real implementation would not do that since it is both slow and potentially vulnerable to timing attacks. In a real implementation one would use Montgomery or Barrett reductions. This also affects the twiddle factors as those need to be transformed to Montgomery domain before.

For more details on the general framework, see Python/README.md and [C/README.md](./C/README.md]

mkannwischer/polymul

Structure