yiliu30's Stars
chatanywhere/GPT_API_free
Free ChatGPT API keys: a free ChatGPT API with GPT-4 support, plus a free forwarding API for ChatGPT that works inside China over a direct connection, no proxy required. Can be paired with software/plugins such as ChatBox to greatly reduce API costs; enables unrestricted chat from within China.
HigherOrderCO/Bend
A massively parallel, high-level programming language
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
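The repo's framing, every step as an explicit matrix multiplication, is easiest to see in scaled dot-product attention. A minimal NumPy sketch of that view (shapes and names are illustrative, not taken from the repo):

```python
import numpy as np

def attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention as plain matmuls.
    x: (seq, d_model); wq/wk/wv: (d_model, d_head)."""
    q, k, v = x @ wq, x @ wk, x @ wv          # three projection matmuls
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (seq, seq) similarity matmul
    scores = np.exp(scores - scores.max(-1, keepdims=True))
    weights = scores / scores.sum(-1, keepdims=True)  # row-wise softmax
    return weights @ v                        # weighted sum, one final matmul

x = np.random.randn(4, 8)
w = [np.random.randn(8, 8) for _ in range(3)]
print(attention(x, *w).shape)  # (4, 8)
```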
adityatelange/hugo-PaperMod
A fast, clean, responsive Hugo theme.
srush/GPU-Puzzles
Solve puzzles. Learn CUDA.
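The puzzles are posed as small numba-CUDA kernels. For flavor, a minimal kernel in the style the early puzzles ask for (my own example, not one of the puzzles; requires a CUDA-capable GPU):

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)          # global thread index
    if i < out.size:          # guard threads past the end of the array
        out[i] = x[i] + y[i]

x = np.arange(32, dtype=np.float32)
y = np.ones(32, dtype=np.float32)
out = np.zeros_like(x)
add_kernel[1, 64](x, y, out)  # launch 1 block of 64 threads
print(out[:4])                # [1. 2. 3. 4.]
```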
dottxt-ai/outlines
Structured Text Generation
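Structured generation of the kind outlines implements comes down to masking the logits at each decoding step so that only tokens permitted by the target structure can be sampled. A toy sketch of that mechanism (generic illustration, not the outlines API):

```python
import numpy as np

vocab = ["yes", "no", "maybe", "<eos>"]

def constrained_sample(logits, allowed):
    """Sample a token, giving zero probability to disallowed tokens."""
    mask = np.isin(np.arange(len(logits)), allowed)
    masked = np.where(mask, logits, -np.inf)
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return np.random.choice(len(logits), p=probs)

logits = np.random.randn(len(vocab))
tok = constrained_sample(logits, allowed=[0, 1])  # grammar permits only yes/no here
print(vocab[tok])
```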
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
NVlabs/tiny-cuda-nn
Lightning fast C++/CUDA neural network framework
turboderp/exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
iree-org/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
turboderp/exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
flame/blis
BLAS-like Library Instantiation Software Framework
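BLIS instantiates a full BLAS from a small set of micro-kernels that operate on packed blocks of the operands. A NumPy sketch of the blocking idea only (the real framework packs panels and dispatches hand-tuned, architecture-specific micro-kernels):

```python
import numpy as np

def blocked_matmul(a, b, bm=4, bn=4, bk=4):
    """C = A @ B computed tile by tile, the way a BLIS-style
    macro-kernel loops over packed blocks."""
    m, k = a.shape
    _, n = b.shape
    c = np.zeros((m, n))
    for i in range(0, m, bm):
        for j in range(0, n, bn):
            for p in range(0, k, bk):
                # in BLIS this inner update is the hand-tuned micro-kernel
                c[i:i+bm, j:j+bn] += a[i:i+bm, p:p+bk] @ b[p:p+bk, j:j+bn]
    return c

a, b = np.random.randn(8, 8), np.random.randn(8, 8)
assert np.allclose(blocked_matmul(a, b), a @ b)
```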
neuralmagic/sparseml
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
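The recipes are built from primitives such as magnitude pruning: zero the smallest-magnitude fraction of each weight tensor. A generic sketch of that primitive (not the sparseml API):

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Return w with its smallest-|w| entries zeroed to reach `sparsity`."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

w = np.random.randn(16, 16)
pruned = magnitude_prune(w, sparsity=0.9)
print((pruned == 0).mean())  # ~0.9
```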
numba/llvmlite
A lightweight LLVM python binding for writing JIT compilers
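llvmlite's IR builder plus MCJIT is enough for a complete, tiny JIT. A minimal end-to-end example following the pattern from llvmlite's documentation:

```python
import ctypes
from llvmlite import ir
import llvmlite.binding as llvm

llvm.initialize()
llvm.initialize_native_target()
llvm.initialize_native_asmprinter()

# Build IR for: int add(int a, int b) { return a + b; }
i32 = ir.IntType(32)
module = ir.Module(name="jit_demo")
fn = ir.Function(module, ir.FunctionType(i32, (i32, i32)), name="add")
builder = ir.IRBuilder(fn.append_basic_block("entry"))
builder.ret(builder.add(fn.args[0], fn.args[1]))

# Compile with MCJIT and call the result through ctypes.
target_machine = llvm.Target.from_default_triple().create_target_machine()
engine = llvm.create_mcjit_compiler(llvm.parse_assembly(str(module)), target_machine)
engine.finalize_object()
add = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)(
    engine.get_function_address("add"))
print(add(2, 3))  # 5
```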
HazyResearch/ThunderKittens
Tile primitives for speedy kernels
srush/Triton-Puzzles
Puzzles for learning Triton
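The puzzles build toward kernels like the canonical Triton vector add, worth keeping in mind as the baseline pattern (this is the standard tutorial kernel, not one of the puzzles):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n                       # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

n = 1 << 12
x = torch.randn(n, device="cuda")
y = torch.randn(n, device="cuda")
out = torch.empty_like(x)
add_kernel[(triton.cdiv(n, 1024),)](x, y, out, n, BLOCK=1024)
assert torch.allclose(out, x + y)
```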
pytorch/kineto
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
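Kineto is the backend of torch.profiler, so the usual way to exercise it is through that frontend. A short usage sketch:

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(256, 256)
x = torch.randn(32, 256)

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities) as prof:
    model(x)

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
prof.export_chrome_trace("trace.json")  # timeline viewable in chrome://tracing / Perfetto
```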
microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
FMInference/H2O
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
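The paper's observation is that a small set of "heavy hitter" tokens receives most of the attention mass, so the KV cache can be shrunk by keeping only those plus a recent window. A NumPy sketch of that scoring-and-eviction policy (my paraphrase of the idea, not the paper's code):

```python
import numpy as np

def h2o_keep_indices(attn_weights, budget, recent=4):
    """attn_weights: (num_queries, seq_len) attention rows observed so far.
    Keep the `recent` newest tokens plus the heaviest hitters up to `budget`."""
    seq_len = attn_weights.shape[1]
    scores = attn_weights.sum(axis=0)             # accumulated attention per token
    keep = set(range(seq_len - recent, seq_len))  # always keep the local window
    for idx in np.argsort(-scores):               # then the heaviest hitters
        if len(keep) >= budget:
            break
        keep.add(int(idx))
    return sorted(keep)

attn = np.random.rand(8, 16)
print(h2o_keep_indices(attn, budget=8))  # indices of KV entries to retain
```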
KEKE046/mlir-tutorial
Hands-On Practical MLIR Tutorial
fpgaminer/GPTQ-triton
GPTQ inference Triton kernel
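GPTQ-style kernels keep weights packed eight 4-bit values per 32-bit word and dequantize on the fly inside the matmul. A NumPy sketch of the unpack-and-dequantize step such a kernel fuses (the layout here is illustrative; real kernels also handle groupwise scales and zero points):

```python
import numpy as np

def unpack_int4(packed):
    """packed: uint32 array, 8 nibbles per word -> values in [0, 15]."""
    shifts = np.arange(8, dtype=np.uint32) * 4
    return (packed[..., None] >> shifts) & 0xF   # (..., 8)

def dequantize(packed, scale, zero):
    q = unpack_int4(packed).reshape(packed.shape[0], -1)
    return (q.astype(np.float32) - zero) * scale  # w = (q - z) * s

packed = np.random.randint(0, 2**32, size=(4, 2), dtype=np.uint32)
w = dequantize(packed, scale=0.05, zero=8)
print(w.shape)  # (4, 16) float weights recovered from 4-bit storage
```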
facebookincubator/dynolog
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system, such as the Linux kernel, CPUs, disks, Intel PT, and GPUs. Dynolog also integrates with PyTorch and can trigger traces for distributed training applications.
Deep-Learning-Profiling-Tools/triton-viz
intel/intel-xpu-backend-for-triton
OpenAI Triton backend for Intel® GPUs
gpu-mode/triton-index
Cataloging released Triton kernels.
nod-ai/SHARK-ModelDev
Unified compiler/runtime for interfacing with PyTorch Dynamo.
NVIDIA/online-softmax
Benchmark code for the "Online normalizer calculation for softmax" paper
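The paper's trick is to compute the softmax normalizer in a single pass by carrying a running maximum and rescaling the running sum whenever the maximum changes. A direct NumPy transcription of that recurrence:

```python
import numpy as np

def online_softmax(x):
    """Single-pass softmax normalizer (Milakov & Gimelshein, 2018)."""
    m = -np.inf   # running maximum
    d = 0.0       # running sum of exp(x_i - m)
    for v in x:
        m_new = max(m, v)
        d = d * np.exp(m - m_new) + np.exp(v - m_new)  # rescale the old sum
        m = m_new
    return np.exp(x - m) / d

x = np.random.randn(10)
assert np.allclose(online_softmax(x), np.exp(x) / np.exp(x).sum())
```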
iree-org/iree-turbine
IREE's PyTorch Frontend, based on Torch Dynamo.
pytorch-labs/triton-cpu
An experimental CPU backend for Triton (https://github.com/openai/triton)