yiliu30

Talk is cheap, pick one and do it.

AI Frameworks Engineer @IntelSH

yiliu30's Stars

xai-org/grok-1
Grok open release
Language: Python 49.6k 574 2108.3k
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python 37.5k 377 3186k
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
Language: C++ 31.6k 479 2.5k 3.6k
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python 27.2k 226 2643.1k
karpathy/llm.c
LLM training in simple, raw C/CUDA
Language: Cuda 24.5k 246 1412.8k
karpathy/nn-zero-to-hero
Neural Networks: Zero to Hero
Language: Jupyter Notebook 11.9k 287 331.5k
ai-boost/awesome-prompts
Curated list of ChatGPT prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.
5.3k 63 8487
pytorch/torchtune
PyTorch native finetuning library
Language: Python 4.3k 47 720440
openai/transformer-debugger
Language: Python 4k 25 14238
ROCm/HIP
HIP: C++ Heterogeneous-Compute Interface for Portability
Language: C++ 3.8k 142 877539
pytorch/executorch
On-device AI across mobile, embedded and edge for PyTorch
Language: C++ 2.2k 60 516367
pytorch/ao
PyTorch native quantization and sparsity for training and inference
Language: Python 1.6k 41 299177
pytorch/functorch
functorch is JAX-like composable function transforms for PyTorch.
Language: Jupyter Notebook 1.4k 27 522102
huggingface/optimum-quanto
A pytorch quantization backend for optimum
Language: Python 828 8 13561
OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Language: Python 730 16 8556
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Language: Python 649 18 2743
tspeterkim/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
Language: Cuda 630 4 654
IST-DASLab/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.
Language: Python 626 15 2948
Vahe1994/SpQR
Language: Python 527 19 2443
google-research/sputnik
A library of GPU kernels for sparse matrix operations.
Language: C++ 249 10 850
yxli2123/LoftQ
Language: Python 199 4 3819
Aaronhuang-778/BiLLM
(ICML 2024) BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Language: Python 197 6 1713
puttsk/cuda-tutorial
A set of hands-on tutorials for CUDA programming
Language: Cuda 194 5 333
IST-DASLab/QUIK
Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024
Language: C++ 173 6 712
xijiu9/Train_Transformers_with_INT4
Language: Python 134 5 34
thu-nics/qllm-eval
Code Repository of Evaluating Quantized Large Language Models
Language: Python 104 5 54
sunlex0717/DissectingTensorCores
Language: Cuda 79 3 418
iree-org/iree-torch
Torch Frontend for IREE
Language: Python 25 16 1111
facebookexperimental/protoquant
Prototype routines for GPU quantization written using PyTorch.
Language: Python 19 9 08
Quansight/torch-build
Collection of scripts to build PyTorch and the domain libraries from source.
Language: Shell 8 3 18
