/cuda_programming

Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch

Primary language: CUDA

License: GNU General Public License v3.0 (GPL-3.0)

GPGPU Programming with CUDA

This repository contains all code from the YouTube series "GPGPU Programming with CUDA" by CoffeeBeforeArch.

Contact

Suggestions for specific content can be sent to: CoffeeBeforeArch@gmail.com

An up-to-date list of all series is available at: Google Sheets

Environment

Operating System: Windows 10 & Ubuntu 18.04

IDE: Visual Studio 2017

Text Editor: VIM

GPU: NVIDIA GTX 1050 Ti

CUDA version: 10.0, 9.1

Concepts covered in each video

| Video | Concepts | Files |
| --- | --- | --- |
| GPGPU Programming with CUDA: Vector Add | GPU Threads, Memory Allocation, Memory Copy, GPU Kernels, Running Kernels | vector_add.cu |
| GPGPU Programming with CUDA: Vector Add with Unified Memory | Unified Memory, Prefetching | vector_add_um.cu |
| GPGPU Programming with CUDA: Matrix Multiplication | 2-D Threadblocks, Aligned Memory Accesses | matrix_mul.cu |
| GPGPU Programming with CUDA: Tiled Matrix Multiplication | Shared Memory, Cache Tiling, Performance Analysis, Optimization | tiled_matrix_mul.cu |
| CUDA Crash Course: Why Coalescing Matters | Transposing Matrices, Coalescing Techniques | alignment_matrix_mul.cu |
| CUDA Crash Course: cuBLAS for Vector Add | cuBLAS, SAXPY | simple_cublas.cu |
| CUDA Crash Course: cuBLAS for Matrix Multiplication | Column-Major Order, SGEMM, cuRAND | cublas_matrix_mul.cu |
| CUDA Crash Course: Sum Reduction Part 1 | Sum Reduction, Warp Divergence | sum_reduction_diverged.cu |
| CUDA Crash Course: Sum Reduction Part 2 | Expensive Operations, Optimization, Warp Divergence | sum_reduction_bank_conflicts.cu |
| CUDA Crash Course: Sum Reduction Part 3 | Optimization, Shared Memory Bank Conflicts | sum_reduction_no_conflicts.cu |
| CUDA Crash Course: Sum Reduction Part 4 | Optimization, Idle Threads | sum_reduction_reduce_idle_threads.cu |
| CUDA Crash Course: Sum Reduction Part 5 | Optimization, Device Function, Loop Unrolling | sum_reduction_device_function.cu |
| CUDA Crash Course: Visual Studio 2017 Environment Setup | Setup, Linker, Visual Studio, Environment, Build Paths | vs_setup.cu |
| CUDA Crash Course: Programming in Linux | NVCC, NVprof, Vector Addition | vector_add.cu |
| CUDA Crash Course: Video Corrections | TB Calculations, Verification | vector_add.cu, matrix_mul.cu |
| CUDA Crash Course: Video Corrections | Cooperative Groups, Synchronization, Atomic Instructions | sum_reduction_cooperative_groups.cu |
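
Example sketch

For readers who want a feel for the concepts named in the first video (GPU threads, memory allocation, memory copy, GPU kernels, running kernels) before opening the files, here is a minimal vector-add sketch. It is not the repository's `vector_add.cu`; the names `vectorAdd` and `runVectorAdd`, the block size of 256, and the input values are assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: each GPU thread adds one pair of elements.
// (Hypothetical name; not taken from the repository.)
__global__ void vectorAdd(const int *a, const int *b, int *c, int n) {
  // Global thread index across the whole grid
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  // Guard against threads past the end of the arrays
  if (tid < n) c[tid] = a[tid] + b[tid];
}

// Runs the full host-side flow and returns true if the GPU result
// matches a CPU check. (Hypothetical helper for this sketch.)
bool runVectorAdd(int n) {
  const size_t bytes = static_cast<size_t>(n) * sizeof(int);

  // Host-side input and output buffers
  std::vector<int> h_a(n), h_b(n), h_c(n);
  for (int i = 0; i < n; i++) {
    h_a[i] = i;
    h_b[i] = 2 * i;
  }

  // Memory allocation on the device
  int *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
  if (cudaMalloc(&d_a, bytes) != cudaSuccess) return false;
  if (cudaMalloc(&d_b, bytes) != cudaSuccess) return false;
  if (cudaMalloc(&d_c, bytes) != cudaSuccess) return false;

  // Memory copy: host -> device
  cudaMemcpy(d_a, h_a.data(), bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(d_b, h_b.data(), bytes, cudaMemcpyHostToDevice);

  // Threadblock calculation: round up so every element gets a thread
  const int threads = 256;  // assumed block size
  const int blocks = (n + threads - 1) / threads;

  // Running the kernel
  vectorAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);
  bool ok = (cudaGetLastError() == cudaSuccess);

  // Memory copy: device -> host (waits for the kernel to finish)
  ok = ok && (cudaMemcpy(h_c.data(), d_c, bytes,
                         cudaMemcpyDeviceToHost) == cudaSuccess);

  // Verification on the CPU
  for (int i = 0; ok && i < n; i++) {
    if (h_c[i] != h_a[i] + h_b[i]) ok = false;
  }

  cudaFree(d_a);
  cudaFree(d_b);
  cudaFree(d_c);
  return ok;
}

int main() {
  const bool ok = runVectorAdd(1 << 16);
  std::printf(ok ? "vector add OK\n" : "vector add FAILED\n");
  return ok ? 0 : 1;
}
```

Assuming the sketch is saved as `sketch.cu` (a hypothetical file name), it can be built with `nvcc sketch.cu -o sketch` on a machine with the CUDA toolkit and an NVIDIA GPU.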