Parallel programming: implementation of a CNN kernel using different optimizations on a CUDA GPU
Primary Language: CUDA