Sakits

Ph.D. student @ MIT; MLSys & Algo.

MIT, EECS
Cambridge, MA

Sakits's Stars

hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
Language: Python
31.9k 204 4.9k3.9k
modularml/mojo
The Mojo Programming Language
Language: Mojo
23k 264 2.1k2.6k
joonspk-research/generative_agents
Generative Agents: Interactive Simulacra of Human Behavior
16.5k 134 1252.1k
facebookresearch/nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
Language: Python
8.8k 63 213559
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Language: Python
7.7k 108 156454
NVIDIA/cuda-samples
Samples for CUDA Developers which demonstrates features in CUDA Toolkit
Language: C
6.2k 121 2381.8k
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Language: Python
4.3k 35 1.4k390
ray-project/llm-numbers
Numbers every LLM developer should know
4.1k 59 17138
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Language: Python
3.8k 34 516302
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Language: Python
2.4k 23 179195
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Language: Jupyter Notebook
2.2k 33 87153
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
Language: Python
1.7k 16 394202
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
Language: Python
1.2k 43 283
microsoft/Llama-2-Onnx
Language: Python
1k 337 2692
punica-ai/punica
Serving multiple LoRA finetuned LLM as one
Language: Python
964 12 3945
zhanglj37/Tutorial-on-PhD-Application
Tutorial on PhD Application
853 9 099
mit-han-lab/TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
Language: C++
717 15 4068
mryab/efficient-dl-systems
Efficient Deep Learning Systems course materials (HSE, YSDA)
Language: Jupyter Notebook
652 14 4105
THUDM/LongBench
[ACL 2024] LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Language: Python
632 6 6945
abacusai/Long-Context
This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model’s information retrieval capabilities with context expansion. We also include key experimental results and instructions for reproducing and building on them.
Language: Python
578 13 635
huggingface/pytorch_block_sparse
Fast Block Sparse Matrices for Pytorch
Language: C++
545 61 1535
openppl-public/ppl.cv
ppl.cv is a high-performance image processing library of openPPL supporting various platforms.
Language: C++
485 16 41108
bojone/rerope
Rectified Rotary Position Embeddings
Language: Python
332 11 2029
FMInference/DejaVu
Language: Python
273 6 3333
mlc-ai/binary-mlc-llm-libs
200 11 3345
mit-han-lab/parallel-computing-tutorial
Language: C++
134 10 015
kyegomez/FlashAttention20
Get down and dirty with FlashAttention2.0 in pytorch, plug in and play no complex CUDA kernels
Language: Python
91 2 36
mit-han-lab/tinychat-tutorial
Language: C++
37 7 313
DS3Lab/Decentralized_FM_alpha
Language: Python
19 8 07
mlc-ai/dlight-bench
Language: Python
3 7 14
