qelk123's Stars
NVIDIA/cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
HuaizhengZhang/AI-System-School
🚀 Awesome System for Machine Learning AI System 🚀 Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑‍💻 Video Tutorials.
XiaoSong9905/CUDA-Optimization-Guide
Xiao's CUDA Optimization Guide [Actively Adding New Content]
cmooredev/RepoReader
Explore and ask questions about a GitHub code repository using OpenAI's GPT.
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
NVIDIA/thrust
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
cusplibrary/cusplibrary
CUSP : A C++ Templated Sparse Matrix Library
dmlc/dgl
Python package built to ease deep learning on graph, on top of existing DL frameworks.
dgSPARSE/dgSPARSE-Lib
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
nvixnu/pmpp__programming_massively_parallel_processors
Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (Third Edition)
mlc-ai/web-llm
High-performance In-browser LLM Inference Engine
cslab-ntua/artificial-matrix-generator
An artificial matrix generator in C
openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
uwsampl/SparseTIR
SparseTIR: Sparse Tensor Compiler for Deep Learning
Syencil/Programming_Massively_Parallel_Processors
Code and notes for the 6 major CUDA parallel computing patterns
yangxuntu/vrd
two models for visual relationship detection
GriffinLiang/vrd-dsr
Code for Visual Relationship Detection with Deep Structural Ranking (AAAI2018)
OAID/Tengine
Tengine is a lightweight, high-performance, modular inference engine for embedded devices
merrymercy/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
BBuf/tvm_mlir_learn
A collection of compiler learning resources.
apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
halide/Halide
a language for fast, portable data-parallel computation
rohany/taco
The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
LeiWang1999/tvm_gpu_gemm
play gemm with tvm
xjtuiair-cag/XJTU-Tripler
XJTU-Tripler is based on HiPU100, an FPGA-friendly DNN accelerator, developed by CAG, Institute of AI & Robotics, XJTU.
areusch/microtvm-blogpost-eval