YellowHCH's Stars
adam-maj/tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
modularml/mojo
The Mojo Programming Language
bondhugula/pluto
Pluto: An automatic polyhedral parallelizer and locality optimizer
bytedance/byteps
A high performance and generic framework for distributed DNN training
pytorch-labs/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
apache/brpc
brpc is an industrial-grade RPC framework using the C++ language, often used in high-performance systems such as search, storage, machine learning, advertising, and recommendation. "brpc" means "better RPC".
ml-explore/mlx
MLX: An array framework for Apple silicon
ray-project/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Tiramisu-Compiler/tiramisu
A polyhedral compiler for expressing fast and portable data parallel algorithms
facebookincubator/AITemplate
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
triton-lang/triton
Development repository for the Triton language and compiler
bytedance/matxscript
A high-performance, extensible Python AOT compiler.
plaidml/tpp-mlir
TPP experimentation on MLIR for linear algebra
abseil/abseil-cpp
Abseil Common Libraries (C++)
google/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
alibaba/BladeDISC
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
taichi-dev/taichi
Productive, portable, and performant GPU programming in Python.
henline/streamexecutordoc
Documentation for StreamExecutor open source proposal
llvm/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
tikv/tikv
Distributed transactional key-value database, originally created to complement TiDB
google/leveldb
LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
coder2gwy/coder2gwy
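The internet's first guide to the civil service exam for programmers, presented jointly by three former big-tech programmers who have since joined the public sector.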
goldsborough/lru-cache
:dizzy: A feature complete LRU cache implementation in C++
lucasayres/url-feature-extractor
Extracting features from URLs to build a data set for machine learning. The purpose is to find a machine learning model to predict phishing URLs, which are targeted to the Brazilian population.
duoergun0729/nlp
By Dou Ge: an open-source introductory book on NLP
wavii/darner
simple, lightweight message queue
skywind3000/RenderHelp
:zap: A programmable rendering pipeline implementation to help beginners learn rendering
arvidn/libtorrent
an efficient feature complete C++ bittorrent implementation