SunMarc's Stars
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
karpathy/LLM101n
LLM101n: Let's build a Storyteller
karpathy/llm.c
LLM training in simple, raw C/CUDA
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
apple/ml-ferret
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
NVIDIA/FasterTransformer
Transformer-related optimization, including BERT and GPT
pytorch/torchtitan
A native PyTorch library for large model training
huggingface/huggingface_hub
The official Python client for the Hugging Face Hub.
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
huggingface/cookbook
Open-source AI cookbook
intel/intel-extension-for-pytorch
A Python package that extends official PyTorch to easily obtain performance gains on Intel platforms
cuda-mode/resource-stream
CUDA-related news and material links
IST-DASLab/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
mit-han-lab/qserve
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
huggingface/local-gemma
Gemma 2 optimized for your local machine.
SqueezeAILab/KVQuant
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
spcl/QuaRot
Code for QuaRot, an end-to-end 4-bit inference of large language models.
jy-yuan/KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
muellerzr/minimal-trainer-zoo
Minimal example scripts for the Hugging Face Trainer, each focused on staying under 150 lines
NetEase-FuXi/EETQ
Easy and Efficient Quantization for Transformers
VITA-Group/Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
neuralmagic/AutoFP8
mit-han-lab/lmquant
exo-explore/mlx-bitnet
1.58 Bit LLM on Apple Silicon using MLX
LiqunMa/FBI-LLM
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
neuralmagic/compressed-tensors
A safetensors extension to efficiently store sparse quantized tensors on disk
aredden/torch-bnb-fp4
Faster PyTorch bitsandbytes 4-bit FP4 nn.Linear ops
muellerzr/import-timer
Pragmatic approach to parsing import profiles for CI