brucechin's Stars
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
hiyouga/LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Bin-Huang/chatbox
User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)
wilsonfreitas/awesome-quant
A curated list of insanely awesome libraries, packages and resources for Quants (Quantitative Finance)
triton-lang/triton
Development repository for the Triton language and compiler
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
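The "memory-efficient exact attention" tagline refers to computing softmax attention in key/value blocks with an online (streaming) softmax, so the full n×n score matrix is never materialized. A minimal numpy sketch of that idea (not the repo's CUDA implementation; function names are illustrative):

```python
import numpy as np

def attention_naive(Q, K, V):
    # Reference attention: materializes the full (n x n) score matrix.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def attention_streaming(Q, K, V, block=4):
    # Online softmax over key/value blocks: only block-sized score tiles
    # are ever held, and running max/denominator keep the result exact.
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    m = np.full((n, 1), -np.inf)          # running row-wise max
    l = np.zeros((n, 1))                  # running softmax denominator
    acc = np.zeros((n, V.shape[1]))       # running weighted sum of values
    for j in range(0, K.shape[0], block):
        S = (Q @ K[j:j + block].T) * scale
        m_new = np.maximum(m, S.max(axis=-1, keepdims=True))
        p = np.exp(S - m_new)
        correction = np.exp(m - m_new)    # rescale old partial sums
        l = l * correction + p.sum(axis=-1, keepdims=True)
        acc = acc * correction + p @ V[j:j + block]
        m = m_new
    return acc / l
```

Both functions return the same result; the streaming version is the memory-saving trick, while the actual library fuses these steps into GPU kernels.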
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
FMInference/FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
huggingface/text-generation-inference
Large Language Model Text Generation Inference
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
cloneofsimo/lora
Using Low-rank adaptation to quickly fine-tune diffusion models.
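Low-rank adaptation as used here keeps the pretrained weight W frozen and trains only a rank-r factorized update B·A, so very few parameters change. A hedged numpy sketch of the core idea (dimensions and names are illustrative, not the repo's API; the repo applies this to diffusion-model layers):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2              # rank r << min(d_out, d_in)

W = rng.standard_normal((d_out, d_in))       # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                     # trainable up-projection, zero init

def lora_forward(x, scale=1.0):
    # y = W x + scale * B (A x); with B = 0 at init, this exactly
    # matches the base model, and only A and B are trained.
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Trainable parameter count: r*(d_in + d_out) instead of d_in*d_out.
```

Because B starts at zero, fine-tuning begins from the base model's behavior, and the learned update can later be merged as W + B·A for inference at no extra cost.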
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
FlagAI-Open/FlagAI
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.
facebookarchive/beringei
Beringei is a high performance, in-memory storage engine for time series data.
neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
risc0/risc0
RISC Zero is a zero-knowledge verifiable general computing platform based on zk-STARKs and the RISC-V microarchitecture.
deepseek-ai/DeepSeek-LLM
DeepSeek LLM: Let there be answers
km1994/LLMs_interview_notes
This repository collects interview questions for large language model (LLM) algorithm engineers.
facebookresearch/MetaCLIP
ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
NUS-HPC-AI-Lab/OpenDiT
OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference
HillZhang1999/llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
scaleapi/llm-engine
Scale LLM Engine public repository
jguamie/system-design
volcengine/veScale
A PyTorch Native LLM Training Framework
AmadeusChan/Awesome-LLM-System-Papers
sallenkey-wei/cuda-handbook
pdf
RomanArzumanyan/VALI
Video processing in Python