jysh1214's Stars
ggerganov/llama.cpp
LLM inference in C/C++
karpathy/llama2.c
Inference Llama 2 in one file of pure C
BBuf/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
nod-ai/SHARK-Studio
SHARK Studio -- Web UI for SHARK+IREE High Performance Machine Learning Distribution
llvm/torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
Ben-McKay/concrete-algebra
A textbook of elementary undergraduate algebra with an emphasis on hand and computer computation, as a precursor to the usual big algebra texts. The files are in LaTeX, and the main source file is algebra.tex.
OpenHEVC/openHEVC
HEVC decoder
DvorakDwarf/Infinite-Storage-Glitch
ISG lets you use YouTube as cloud storage for ANY files, not just video
lemire/SIMDCompressionAndIntersection
A C++ library to compress and intersect sorted lists of integers using SIMD instructions
chenzomi12/AISystem
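AISystem mainly refers to AI systems, covering the full stack of low-level AI technologies, including AI chips, AI compilers, and AI inference and training frameworks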
BBuf/tvm_mlir_learn
A collection of compiler learning resources.
huggingface/safetensors
Simple, safe way to store and distribute tensors
buddy-compiler/buddy-mlir
An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).
Bruce-Lee-LY/cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using tensor cores with the WMMA API and MMA PTX instructions.
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
LitLeo/OpenCUDA
python-lz4/python-lz4
LZ4 bindings for Python
NVIDIA/nvcomp
Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.
google/XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
iree-org/iree-nvgpu
sysprog21/semu
A minimalist RISC-V system emulator capable of running Linux kernel
huggingface/candle
Minimalist ML framework for Rust
sysprog21/shecc
A self-hosting and educational C optimizing compiler
Lorex/FHIR-Universal-Conversion-Kit
FHIR Universal Conversion Kit (F.U.C.K.) is a conversion kit that can convert arbitrary data to HL7 FHIR data.
facebook/zstd
Zstandard - Fast real-time compression algorithm
kenjihiranabe/The-Art-of-Linear-Algebra
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"
MaJerle/c-code-style
Recommended C code style and coding rules for standard C99 or later
bmaltais/kohya_ss
camenduru/stable-diffusion-webui-colab
stable diffusion webui colab