GPU-FFT

Welcome to the GPU-FFT repository! It provides algorithms and implementations for optimizing the Fast Fourier Transform (FFT) on Graphics Processing Units (GPUs).

The associated research paper: https://eprint.iacr.org/2023/1410

An NTT variant of GPU-FFT is available: https://github.com/Alisah-Ozcan/GPU-NTT

Development

Requirements

Build & Install

Two floating-point data types are supported. Each is selected by a number:

COPLEX_DATA_TYPE=0 -> FLOAT_64 (64-bit)
COPLEX_DATA_TYPE=1 -> FLOAT_32 (32-bit)
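
To illustrate what the precision choice means for client code, here is a minimal sketch of a compile-time precision switch. It assumes that a definition named FLOAT_64 or FLOAT_32 (the names used in this README) is passed to the compiler. The aliases `Real` and `Complex` and the helper `buffer_bytes` are hypothetical and are not part of the GPU-FFT API.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical precision switch. FLOAT_32 / FLOAT_64 mirror the names in this
// README; the aliases below are illustrative, not the library's own types.
#if defined(FLOAT_32)
using Real = float;   // 32-bit single precision
#else
using Real = double;  // 64-bit double precision (assumed default, as with FLOAT_64)
#endif

using Complex = std::complex<Real>;

// std::complex<T> is laid out as two consecutive T values.
static_assert(sizeof(Complex) == 2 * sizeof(Real), "unexpected complex layout");

// Bytes needed to hold `count` complex samples at the selected precision.
constexpr std::size_t buffer_bytes(std::size_t count) {
    return count * sizeof(Complex);
}
```

With the 64-bit type, each complex sample occupies 16 bytes, so switching to the 32-bit type halves the memory needed for the same transform size.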

To build:

$ cmake -D CMAKE_CUDA_ARCHITECTURES=86 -D COPLEX_DATA_TYPE=0 -B./build
$ cmake --build ./build/ --parallel

To install:

$ cmake -D CMAKE_CUDA_ARCHITECTURES=86 -D COPLEX_DATA_TYPE=0 -B./build
$ cmake --build ./build/ --parallel
$ sudo cmake --install build

Testing & Benchmarking

CPU & GPU FFT Testing & Benchmarking

To run examples:

$ cmake -D CMAKE_CUDA_ARCHITECTURES=86 -D COPLEX_DATA_TYPE=0 -D GPUFFT_BUILD_EXAMPLES=ON -B./build
$ cmake --build ./build/ --parallel

$ ./build/bin/cpu_fft_examples        <RING_SIZE_IN_LOG2> <BATCH_SIZE>
$ ./build/bin/gpu_fft_examples        <RING_SIZE_IN_LOG2> <BATCH_SIZE>
Example: ./build/bin/gpu_fft_examples 12 1
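
The two command-line arguments can be read as follows. This sketch assumes that `<RING_SIZE_IN_LOG2>` is the base-2 logarithm of the transform length and that `<BATCH_SIZE>` is the number of independent transforms. The struct `TransformShape` and the function `make_shape` are hypothetical names used only for illustration.

```cpp
#include <cstddef>

// Hypothetical description of the problem size implied by the two arguments.
struct TransformShape {
    std::size_t points;  // transform length n = 2^log2_size
    std::size_t total;   // complex elements across the whole batch
};

// Assumes log2_size is smaller than the bit width of std::size_t.
constexpr TransformShape make_shape(unsigned log2_size, std::size_t batch) {
    return TransformShape{std::size_t{1} << log2_size,
                          (std::size_t{1} << log2_size) * batch};
}
```

Under these assumptions, the arguments `12 1` describe a single 4096-point transform.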

To run benchmarks:

$ cmake -D CMAKE_CUDA_ARCHITECTURES=86 -D COPLEX_DATA_TYPE=0 -D GPUFFT_BUILD_BENCHMARKS=ON -B./build
$ cmake --build ./build/ --parallel

$ ./build/bin/gpu_fft_mult_benchmark  <RING_SIZE_IN_LOG2> <BATCH_SIZE>
$ ./build/bin/gpu_fft_benchmark       <RING_SIZE_IN_LOG2> <BATCH_SIZE>
Example: ./build/bin/gpu_fft_benchmark 12 1
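
When comparing timings across ring sizes, a common convention is to convert a measured time into a nominal rate using 5 n log2(n) operations per complex transform of length n. The sketch below applies that convention. It is not something the benchmark binaries are known to report, and the function names and the `elapsed_us` input are hypothetical.

```cpp
#include <cstddef>

// Nominal operation count for one complex FFT of length n = 2^log2_size,
// using the conventional estimate 5 * n * log2(n). This is a benchmarking
// convention, not an exact count of the operations any implementation runs.
constexpr double nominal_flops(unsigned log2_size) {
    return 5.0 * static_cast<double>(std::size_t{1} << log2_size) * log2_size;
}

// elapsed_us: measured time in microseconds for the whole batch (hypothetical
// input, taken from whatever timer the measurement used).
// Operations per microsecond equal millions of operations per second (MFLOPS).
constexpr double batch_mflops(unsigned log2_size, std::size_t batch,
                              double elapsed_us) {
    return nominal_flops(log2_size) * static_cast<double>(batch) / elapsed_us;
}
```

For example, a single 2^12-point transform measured at 100 microseconds corresponds to roughly 2458 nominal MFLOPS under this convention.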

Using GPU-FFT in a downstream CMake project

Make sure GPU-FFT is installed before integrating it into your project. The installed GPU-FFT library provides a set of config files that make it easy to integrate GPU-FFT into your own CMake project. In your CMakeLists.txt, simply add:

project(<your-project> LANGUAGES CXX CUDA)
find_package(CUDAToolkit REQUIRED)
# ...
find_package(GPUFFT)
# ...
target_link_libraries(<your-target> (PRIVATE|PUBLIC|INTERFACE) GPUNTT::ntt CUDA::cudart)
# ...
add_compile_definitions(FLOAT_64) # Data type the library was built with
target_compile_definitions(<your-target> PRIVATE FLOAT_64)
set_target_properties(<your-target> PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
# ...

How to Cite GPU-FFT

Please use the BibTeX entry below to cite GPU-FFT in academic papers.

@misc{cryptoeprint:2023/1410,
      author = {Ali Şah Özcan and Erkay Savaş},
      title = {Two Algorithms for Fast GPU Implementation of NTT},
      howpublished = {Cryptology ePrint Archive, Paper 2023/1410},
      year = {2023},
      note = {\url{https://eprint.iacr.org/2023/1410}},
      url = {https://eprint.iacr.org/2023/1410}
}

License

This project is licensed under the Apache License. For more details, please refer to the License file.

Contact

If you have any questions or feedback, feel free to contact me:

Email: alisah@sabanciuniv.edu
LinkedIn: Profile
