brucechin's Stars
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
hiyouga/LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Bin-Huang/chatbox
User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)
wilsonfreitas/awesome-quant
A curated list of insanely awesome libraries, packages and resources for Quants (Quantitative Finance)
triton-lang/triton
Development repository for the Triton language and compiler
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
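The "memory-efficient exact attention" tagline refers to computing softmax attention in key/value blocks with an online (streaming) softmax, so the full n×n score matrix is never materialized. A minimal numpy sketch of that idea (not the repo's CUDA implementation; function names are illustrative):

```python
import numpy as np

def attention_naive(Q, K, V):
    # Reference attention: materializes the full (n x n) score matrix.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def attention_streaming(Q, K, V, block=4):
    # Online softmax over key/value blocks: only block-sized score tiles
    # are ever held, and running max/denominator keep the result exact.
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    m = np.full((n, 1), -np.inf)          # running row-wise max
    l = np.zeros((n, 1))                  # running softmax denominator
    acc = np.zeros((n, V.shape[1]))       # running weighted sum of values
    for j in range(0, K.shape[0], block):
        S = (Q @ K[j:j + block].T) * scale
        m_new = np.maximum(m, S.max(axis=-1, keepdims=True))
        p = np.exp(S - m_new)
        correction = np.exp(m - m_new)    # rescale old partial sums
        l = l * correction + p.sum(axis=-1, keepdims=True)
        acc = acc * correction + p @ V[j:j + block]
        m = m_new
    return acc / l
```

Both functions return the same result; the streaming version is the memory-saving trick, while the actual library fuses these steps into GPU kernels.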
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
FMInference/FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
huggingface/text-generation-inference
Large Language Model Text Generation Inference
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
cloneofsimo/lora
Using Low-rank adaptation to quickly fine-tune diffusion models.
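Low-rank adaptation as used here keeps the pretrained weight W frozen and trains only a rank-r factorized update B·A, so very few parameters change. A hedged numpy sketch of the core idea (dimensions and names are illustrative, not the repo's API; the repo applies this to diffusion-model layers):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2              # rank r << min(d_out, d_in)

W = rng.standard_normal((d_out, d_in))       # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                     # trainable up-projection, zero init

def lora_forward(x, scale=1.0):
    # y = W x + scale * B (A x); with B = 0 at init, this exactly
    # matches the base model, and only A and B are trained.
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Trainable parameter count: r*(d_in + d_out) instead of d_in*d_out.
```

Because B starts at zero, fine-tuning begins from the base model's behavior, and the learned update can later be merged as W + B·A for inference at no extra cost.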
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
FlagAI-Open/FlagAI
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.
facebookarchive/beringei
Beringei is a high performance, in-memory storage engine for time series data.
neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
risc0/risc0
RISC Zero is a zero-knowledge verifiable general computing platform based on zk-STARKs and the RISC-V microarchitecture.
deepseek-ai/DeepSeek-LLM
DeepSeek LLM: Let there be answers
km1994/LLMs_interview_notes
This repository collects interview questions for large language model (LLM) algorithm engineers.
facebookresearch/MetaCLIP
ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
NUS-HPC-AI-Lab/OpenDiT
OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference
HillZhang1999/llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
scaleapi/llm-engine
Scale LLM Engine public repository
jguamie/system-design
volcengine/veScale
A PyTorch Native LLM Training Framework
AmadeusChan/Awesome-LLM-System-Papers
sallenkey-wei/cuda-handbook
pdf
RomanArzumanyan/VALI
Video processing in Python