CUDA Course

GitHub repo for the CUDA course on freeCodeCamp

Note: This course is designed for Ubuntu Linux. Windows users can use Windows Subsystem for Linux (WSL) or Docker containers to get an Ubuntu Linux environment.

Table of Contents

  1. The Deep Learning Ecosystem
  2. Setup/Installation
  3. C/C++ Review
  4. Gentle Intro to GPUs
  5. Writing Your First Kernels
  6. CUDA APIs (cuBLAS, cuDNN, etc.)
  7. Optimizing Matrix Multiplication
  8. Triton
  9. PyTorch Extensions (CUDA)
  10. Final Project
  11. Extras

Course Philosophy

This course aims to:

  • Lower the barrier to entry for HPC jobs
  • Provide a foundation for understanding projects like Karpathy's llm.c
  • Consolidate scattered CUDA programming resources into a comprehensive, organized course

Overview

  • Focuses on GPU kernel optimization for performance improvement
  • Covers CUDA, PyTorch, and Triton
  • Emphasizes the technical details of writing faster kernels
  • Tailored for NVIDIA GPUs
  • Culminates in a simple MLP MNIST project in CUDA

Prerequisites

  • Python programming (required)
  • Basic differentiation and vector calculus for backprop (recommended)
  • Linear algebra fundamentals (recommended)

Key Takeaways

  • Optimizing existing implementations
  • Building CUDA kernels for cutting-edge research
  • Understanding GPU performance bottlenecks, especially memory bandwidth
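The memory bandwidth point can be made concrete with a back-of-the-envelope roofline estimate. The sketch below is illustrative only: the function names are hypothetical, the peak compute and bandwidth figures are made-up placeholders rather than the specs of any real GPU, and the matrix multiplication estimate assumes the ideal case where each matrix is moved through memory exactly once.

```python
def attainable_gflops(intensity, peak_gflops, peak_bandwidth_gbs):
    """Roofline estimate: throughput is capped by compute or by memory traffic.

    intensity is arithmetic intensity in FLOPs per byte moved.
    """
    return min(peak_gflops, intensity * peak_bandwidth_gbs)


def vector_add_intensity(n, bytes_per_element=4):
    """c[i] = a[i] + b[i]: one FLOP per element, three memory accesses."""
    flops = n
    bytes_moved = 3 * n * bytes_per_element  # read a, read b, write c
    return flops / bytes_moved


def matmul_intensity(n, bytes_per_element=4):
    """Square n x n matmul, assuming each matrix is touched once (ideal reuse)."""
    flops = 2 * n ** 3  # one multiply and one add per inner-product term
    bytes_moved = 3 * n * n * bytes_per_element  # read A, read B, write C
    return flops / bytes_moved


if __name__ == "__main__":
    # Placeholder hardware numbers, chosen only for illustration.
    PEAK_GFLOPS = 10_000.0
    PEAK_BANDWIDTH_GBS = 500.0

    for name, intensity in [
        ("vector add", vector_add_intensity(1_000_000)),
        ("matmul n=1024", matmul_intensity(1024)),
    ]:
        bound = attainable_gflops(intensity, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
        kind = "memory-bound" if bound < PEAK_GFLOPS else "compute-bound"
        print(f"{name}: {intensity:.3f} FLOP/byte, <= {bound:.1f} GFLOP/s ({kind})")
```

Under these assumed numbers, vector addition is limited by memory traffic long before the arithmetic units are busy, while a matrix multiplication with good data reuse can approach peak compute. That gap is why kernel optimization tends to focus on how data moves.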

Hardware Requirements

  • Any NVIDIA GTX, RTX, or datacenter-level GPU
  • Cloud GPU options available for those without local hardware

Use Cases for CUDA/GPU Programming

  • Deep Learning (primary focus of this course)
  • Graphics and Ray Tracing
  • Fluid Simulation
  • Video Editing
  • Crypto Mining
  • 3D Modeling
  • Anything that requires parallel processing with large arrays
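What these use cases share is the same operation applied independently to many array elements. As a rough sketch, the Python below emulates how a CUDA launch assigns one array element to each thread using the standard global index formula, block index times block size plus thread index. The function names are hypothetical, and the nested loops run sequentially here, whereas on a GPU the threads execute in parallel.

```python
def blocks_needed(n, block_dim):
    """Ceiling division: enough blocks to cover all n elements."""
    return (n + block_dim - 1) // block_dim


def global_thread_index(block_idx, block_dim, thread_idx):
    """Mirrors blockIdx.x * blockDim.x + threadIdx.x in a 1D CUDA launch."""
    return block_idx * block_dim + thread_idx


def emulate_vector_add(a, b, block_dim=256):
    """Sequentially emulate a one-thread-per-element vector addition."""
    n = len(a)
    out = [0.0] * n
    for block_idx in range(blocks_needed(n, block_dim)):
        for thread_idx in range(block_dim):
            i = global_thread_index(block_idx, block_dim, thread_idx)
            if i < n:  # bounds guard: the last block may have spare threads
                out[i] = a[i] + b[i]
    return out


if __name__ == "__main__":
    print(blocks_needed(1000, 256))
    print(emulate_vector_add([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], block_dim=2))
```

The bounds guard matters because the element count is rarely an exact multiple of the block size, so the final block usually contains threads with no element to process.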

Resources

  • GitHub repo (this repository)
  • Stack Overflow
  • NVIDIA Developer Forums
  • NVIDIA and PyTorch documentation
  • LLMs for navigating the space

Other Learning Material

Fun YouTube Videos:

Find me