Pinned Repositories
awesome-lm-system
A summary of system papers, frameworks, code, and tools for training or serving large models
Dipoorlet
Offline quantization tools for deployment.
EasyLLM
Built on Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the code logic with a focus on usability while preserving training efficiency.
LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
llmc
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
MQBench
Model Quantization Benchmark
OmniBal
Outlier_Suppression_Plus
Official implementation of the EMNLP 2023 paper "Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling".
TFMQ-DM
[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
United-Perception
United Perception
ModelTC's Repositories
ModelTC/LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
ModelTC/MQBench
Model Quantization Benchmark
ModelTC/LightX2V
Light Video Generation Inference Framework
ModelTC/llmc
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
ModelTC/Dipoorlet
Offline quantization tools for deployment.
ModelTC/TFMQ-DM
[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
ModelTC/awesome-lm-system
A summary of system papers, frameworks, code, and tools for training or serving large models
ModelTC/Outlier_Suppression_Plus
Official implementation of the EMNLP 2023 paper "Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling".
ModelTC/EasyLLM
Built on Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the code logic with a focus on usability while preserving training efficiency.
ModelTC/QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"
ModelTC/OmniBal
ModelTC/LPCV_2023_solution
ModelTC/L2_Compression
ModelTC/quant_horizon
ModelTC/msbench
A model sparsification tool based on torch.fx
ModelTC/general-sam
A general suffix automaton implementation in Rust with Python bindings
ModelTC/FCPTS
ModelTC/mtc-token-healing
Token healing implementation in Rust
ModelTC/general-sam-py
Python bindings for general-sam and some utilities
ModelTC/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets.
ModelTC/statecs
ModelTC/verl
verl: Volcano Engine Reinforcement Learning for LLMs
ModelTC/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
ModelTC/greedy-tokenizer
Tokenizes strings iteratively by greedily matching the longest token at each step.
ModelTC/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model with performance approaching GPT-4o.
ModelTC/lightllm-blog
ModelTC/LLM_QAT
ModelTC/modeltc.github.io
ModelTC/UP_LPCV2023_Plugin
ModelTC/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)