The repository covers a wide range of topics, each aimed at improving efficiency and performance in GPU programming. Here’s a detailed look at what I learned:
Description : This section introduces CUDA programming for readers who are new to GPU programming. The post covers the basics of CUDA: setting up a development environment, then writing and compiling a first CUDA program.
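To give a feel for what a first CUDA program might look like, here is a minimal sketch of element-wise vector addition. It is not taken from the repository; the names `vector_add` and `add_on_gpu` are hypothetical, and the block size of 256 threads is an arbitrary but common choice.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Kernel: each thread adds one pair of elements.
__global__ void vector_add(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) out[i] = a[i] + b[i];                // guard the last, partial block
}

// Hypothetical host helper: copies the inputs to the GPU, launches the
// kernel, and copies the result back. Returns false if a CUDA call fails.
bool add_on_gpu(const std::vector<float>& a, const std::vector<float>& b,
                std::vector<float>& out) {
    if (a.size() != b.size()) return false;
    int n = static_cast<int>(a.size());
    out.assign(n, 0.0f);
    if (n == 0) return true;

    size_t bytes = n * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess
           && cudaMalloc(&d_b, bytes) == cudaSuccess
           && cudaMalloc(&d_out, bytes) == cudaSuccess
           && cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;                          // assumed block size
        int blocks = (n + threads - 1) / threads;   // round up to cover all n
        vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_a);   // cudaFree(nullptr) is a no-op, so this is safe on failure
    cudaFree(d_b);
    cudaFree(d_out);
    return ok;
}
```

Saved as, say, `vector_add.cu` together with a `main` that calls `add_on_gpu`, a file like this would typically be compiled with `nvcc vector_add.cu -o vector_add`.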
Description : This section explains the hierarchical structure of CUDA threads: grids, blocks, and threads. The post shows how to calculate a global thread index from the block and thread indices, with example code for image processing.
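As an illustration of global thread indexing in an image-processing setting, the sketch below maps a 2D grid of threads onto the pixels of an 8-bit grayscale image and inverts each pixel. The operation, the 16x16 block shape, and the names `pixel_index`, `invert_image`, and `invert_on_gpu` are assumptions chosen for this example, not code from the repository.

```cuda
#include <cuda_runtime.h>

// Row-major offset of pixel (x, y) in an image that is `width` pixels wide.
__host__ __device__ inline int pixel_index(int x, int y, int width) {
    return y * width + x;
}

// Each thread inverts one pixel of an 8-bit grayscale image.
__global__ void invert_image(unsigned char* img, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // global column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // global row
    if (x < width && y < height) {                  // skip threads past the edge
        int i = pixel_index(x, y, width);
        img[i] = static_cast<unsigned char>(255 - img[i]);
    }
}

// Hypothetical host helper: inverts `host_img` in place on the GPU.
// Returns false on invalid dimensions or if a CUDA call fails.
bool invert_on_gpu(unsigned char* host_img, int width, int height) {
    if (width <= 0 || height <= 0) return false;
    size_t bytes = static_cast<size_t>(width) * height;
    unsigned char* d_img = nullptr;
    if (cudaMalloc(&d_img, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(d_img, host_img, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        dim3 block(16, 16);                          // assumed block shape
        dim3 grid((width + block.x - 1) / block.x,   // round up in each dimension
                  (height + block.y - 1) / block.y);
        invert_image<<<grid, block>>>(d_img, width, height);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(host_img, d_img, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_img);
    return ok;
}
```

The bounds check matters because the grid is rounded up to whole blocks, so some threads usually fall outside the image.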
CUDA thread hierarchy, memory hierarchy, GPU cache structure
Description : This section delves into the more advanced aspects of CUDA and NVIDIA GPU architecture, including the hierarchical organization of threads, the different levels of memory, and the structure of GPU caches.
CUDA memories : registers, shared memory, global memory
Description : This section explores the different types of memory in CUDA, focusing on registers, shared memory, and global memory. The post covers the characteristics of each memory type and provides strategies for using them effectively to make CUDA kernels more efficient.
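To make the three memory types concrete, here is a minimal sketch of a per-block sum reduction. Local scalars such as `tid` normally live in registers, the `tile` array lives in shared memory, and the `in` and `partial` pointers refer to global memory. The kernel, the helper `sum_on_gpu`, and the block size `kBlockSize` are hypothetical and chosen for illustration; the reduction loop assumes the block size is a power of two.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlockSize = 256;  // assumed block size; must be a power of two

// Each block sums up to kBlockSize input elements into one partial result.
__global__ void block_sum(const float* in, float* partial, int n) {
    __shared__ float tile[kBlockSize];       // shared memory: one copy per block
    int tid = threadIdx.x;                   // scalar locals normally sit in registers
    int i = blockIdx.x * blockDim.x + tid;

    tile[tid] = (i < n) ? in[i] : 0.0f;      // one global read per thread
    __syncthreads();                         // wait until the tile is fully loaded

    // Tree reduction in shared memory: halve the active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) tile[tid] += tile[tid + stride];
        __syncthreads();
    }
    if (tid == 0) partial[blockIdx.x] = tile[0];  // one global write per block
}

// Hypothetical host helper: sums `data` on the GPU and finishes the
// reduction over per-block partial sums on the CPU.
bool sum_on_gpu(const std::vector<float>& data, float& result) {
    result = 0.0f;
    int n = static_cast<int>(data.size());
    if (n == 0) return true;
    int blocks = (n + kBlockSize - 1) / kBlockSize;

    float *d_in = nullptr, *d_partial = nullptr;
    std::vector<float> partial(blocks, 0.0f);
    bool ok = cudaMalloc(&d_in, n * sizeof(float)) == cudaSuccess
           && cudaMalloc(&d_partial, blocks * sizeof(float)) == cudaSuccess
           && cudaMemcpy(d_in, data.data(), n * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        block_sum<<<blocks, kBlockSize>>>(d_in, d_partial, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(partial.data(), d_partial, blocks * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_in);
    cudaFree(d_partial);
    if (ok) for (float p : partial) result += p;
    return ok;
}
```

The point of the design is that each input element is read from global memory once, and all repeated accesses during the reduction go to the much faster shared memory.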