jysh1214

jysh1214's Stars

ggerganov/llama.cpp
LLM inference in C/C++
Language: C++ · 65.9k · 9.5k
karpathy/llama2.c
Inference Llama 2 in one file of pure C
Language: C · 17.3k · 2.1k
BBuf/how-to-optim-algorithm-in-cuda
how to optimize some algorithm in cuda.
Language: Cuda · 1.5k · 122
nod-ai/SHARK-Studio
SHARK Studio -- Web UI for SHARK+IREE High Performance Machine Learning Distribution
Language: Python · 1.4k · 170
llvm/torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
Language: C++ · 1.3k · 491
openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
Language: C++ · 2.6k · 409
Ben-McKay/concrete-algebra
A textbook of elementary undergraduate algebra with an emphasis on hand and computer computation, as a precursor to the usual big algebra texts. The files are in LaTeX, and the main source file is algebra.tex.
Language:Jupyter Notebook8120
OpenHEVC/openHEVC
HEVC decoder
Language:C531192
DvorakDwarf/Infinite-Storage-Glitch
ISG lets you use YouTube as cloud storage for ANY files, not just video
Language: Rust · 11.4k · 905
lemire/SIMDCompressionAndIntersection
A C++ library to compress and intersect sorted lists of integers using SIMD instructions
Language:C++42058
chenzomi12/AISystem
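AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks
Language: Jupyter Notebook · 10.7k · 1.5k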
BBuf/tvm_mlir_learn
A collection of compiler learning resources.
Language: Python · 2.1k · 324
huggingface/safetensors
Simple, safe way to store and distribute tensors
Language: Python · 2.8k · 191
buddy-compiler/buddy-mlir
An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).
Language:C++495160
Bruce-Lee-LY/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
Language:Cuda27264
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Language: C · 6.2k · 1.8k
LitLeo/OpenCUDA
Language:Cuda256112
python-lz4/python-lz4
LZ4 bindings for Python
Language:C27569
NVIDIA/nvcomp
Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.
Language:C++55779
google/XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Language: C · 1.8k · 348
iree-org/iree-nvgpu
Language:MLIR4819
sysprog21/semu
A minimalist RISC-V system emulator capable of running Linux kernel
Language:C25047
huggingface/candle
Minimalist ML framework for Rust
Language: Rust · 15.4k · 911
sysprog21/shecc
A self-hosting and educational C optimizing compiler
Language: C · 1.1k · 118
Lorex/FHIR-Universal-Conversion-Kit
FHIR Universal Conversion Kit (F.U.C.K.) is a conversion kit that can convert arbitrary data to HL7 FHIR data.
Language:JavaScript356
facebook/zstd
Zstandard - Fast real-time compression algorithm
Language: C · 23.5k · 2.1k
kenjihiranabe/The-Art-of-Linear-Algebra
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"
Language: PostScript · 17.8k · 2.2k
MaJerle/c-code-style
Recommended C code style and coding rules for standard C99 or later
Language: Python · 1k · 231
bmaltais/kohya_ss
Language: Python · 9.5k · 1.2k
camenduru/stable-diffusion-webui-colab
stable diffusion webui colab
Language: Jupyter Notebook · 15.6k · 2.6k