Accelerating the convolution operation of CNNs on GPUs by using memory-efficient data access patterns.
Primary Language: CUDA
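
The description does not say which access patterns the project uses. As an illustration only, one widely used pattern is shared-memory tiling: each thread block stages an input patch (plus a halo) in shared memory with coalesced loads, so neighbouring outputs reuse it instead of re-reading global memory. The sketch below assumes a single-channel float image, a 3x3 filter, zero padding, and CNN-style convolution (cross-correlation, no filter flip). The names `conv2d_tiled`, `conv2d_gpu`, `d_filter`, `TILE`, and `K` are hypothetical and not taken from the repository; error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define TILE 16    // output tile edge per block (assumed value)
#define K 3        // filter edge, must be odd (assumed value)
#define R (K / 2)  // filter radius

// The small filter is kept in constant memory so all threads read it through the constant cache.
__constant__ float d_filter[K * K];

// Each block loads a (TILE + 2R) x (TILE + 2R) input patch into shared memory once,
// then every thread computes one output pixel from that patch.
__global__ void conv2d_tiled(const float* in, float* out, int w, int h) {
    __shared__ float tile[TILE + 2 * R][TILE + 2 * R];

    int ox = blockIdx.x * TILE + threadIdx.x;  // output column
    int oy = blockIdx.y * TILE + threadIdx.y;  // output row

    // Cooperative load including the halo. Threads with adjacent threadIdx.x
    // read adjacent addresses, so the global loads coalesce.
    for (int ty = threadIdx.y; ty < TILE + 2 * R; ty += TILE) {
        for (int tx = threadIdx.x; tx < TILE + 2 * R; tx += TILE) {
            int ix = blockIdx.x * TILE + tx - R;
            int iy = blockIdx.y * TILE + ty - R;
            bool inside = ix >= 0 && ix < w && iy >= 0 && iy < h;
            tile[ty][tx] = inside ? in[iy * w + ix] : 0.0f;  // zero padding
        }
    }
    __syncthreads();  // the whole patch must be loaded before anyone reads it

    if (ox < w && oy < h) {
        float acc = 0.0f;
        for (int ky = 0; ky < K; ++ky)
            for (int kx = 0; kx < K; ++kx)
                acc += d_filter[ky * K + kx] * tile[threadIdx.y + ky][threadIdx.x + kx];
        out[oy * w + ox] = acc;
    }
}

// Hypothetical host helper: copies data to the device, launches the kernel,
// and returns the w x h result.
std::vector<float> conv2d_gpu(const std::vector<float>& img, const float* filter, int w, int h) {
    size_t bytes = size_t(w) * h * sizeof(float);
    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, img.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(d_filter, filter, K * K * sizeof(float));

    dim3 block(TILE, TILE);
    dim3 grid((w + TILE - 1) / TILE, (h + TILE - 1) / TILE);
    conv2d_tiled<<<grid, block>>>(d_in, d_out, w, h);

    std::vector<float> result(size_t(w) * h);
    cudaMemcpy(result.data(), d_out, bytes, cudaMemcpyDeviceToHost);  // also waits for the kernel
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

Calling `conv2d_gpu(img, filter, w, h)` with a row-major image and a 9-element filter returns the filtered image. In this sketch each input pixel is read from global memory roughly once per block rather than once per filter tap, which is the kind of saving the description refers to. A full CNN layer would additionally loop over input and output channels.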