cuda
There are 5322 repositories under the cuda topic.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
hashcat/hashcat
World's fastest and most advanced password recovery utility
NVIDIA/nvidia-docker
Build and run Docker containers leveraging NVIDIA GPUs
NVlabs/instant-ngp
Instant neural graphics primitives: lightning fast NeRF and more
kaldi-asr/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
isl-org/Open3D
Open3D: A Modern Library for 3D Data Processing
numba/numba
NumPy aware dynamic Python compiler using LLVM
srush/GPU-Puzzles
Solve puzzles. Learn CUDA.
vosen/ZLUDA
CUDA on non-NVIDIA GPUs
cupy/cupy
NumPy & SciPy for GPU
rapidsai/cudf
cuDF - GPU DataFrame Library
catboost/catboost
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
replicate/cog
Containers for machine learning
kroma-network/tachyon
Modular ZK (Zero Knowledge) backend accelerated by GPU
hybridgroup/gocv
Go package for computer vision using OpenCV 4 and beyond. Includes support for DNN, CUDA, OpenCV Contrib, and OpenVINO.
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Oneflow-Inc/oneflow
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
chainer/chainer
A flexible framework of neural networks for deep learning
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
chrxh/alien
ALIEN is a CUDA-powered artificial life simulation program.
NVIDIA/thrust
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
XuehaiPan/nvitop
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
OAID/Tengine
Tengine is a lightweight, high-performance, modular inference engine for embedded devices
arrayfire/arrayfire
ArrayFire: a general purpose GPU library.
NVIDIAGameWorks/kaolin
A PyTorch Library for Accelerating 3D Deep Learning Research
rapidsai/cuml
cuML - RAPIDS Machine Learning Library
ROCm/HIP
HIP: C++ Heterogeneous-Compute Interface for Portability
NVlabs/tiny-cuda-nn
Lightning fast C++/CUDA neural network framework
OpenNMT/CTranslate2
Fast inference engine for Transformer models
bytedance/lightseq
LightSeq: A High Performance Library for Sequence Processing and Generation
Celtoys/Remotery
Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
uber/aresdb
A GPU-powered real-time analytics storage and query engine.
heavyai/heavydb
HeavyDB (formerly OmniSciDB)
iree-org/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.