Pinned Repositories
awesome-lm-system
Summary of systems papers, frameworks, code, and tools for training or serving large models
Dipoorlet
Offline quantization tools for deployment.
LightCompress
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLM, VLM, and video generation models.
LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
LightX2V
Light Video Generation Inference Framework
MQBench
Model Quantization Benchmark
Qwen-Image-Lightning
Qwen-Image-Lightning: Speed up the Qwen-Image model with distillation
TFMQ-DM
[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
United-Perception
United Perception
Wan2.2-Lightning
Wan2.2-Lightning: Speed up the Wan2.2 model with distillation
ModelTC's Repositories
ModelTC/United-Perception
United Perception
ModelTC/Dipoorlet
Offline quantization tools for deployment.
ModelTC/awesome-lm-system
Summary of systems papers, frameworks, code, and tools for training or serving large models
ModelTC/Outlier_Suppression_Plus
Official implementation of the EMNLP 2023 paper "Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling"
ModelTC/mqbench-paper
ModelTC/rank_dataset
PyTorch Dataset Rank Dataset
ModelTC/QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"
ModelTC/NART
NART = NART is not A RunTime, a deep learning inference framework.
ModelTC/NNLQP
ModelTC/LPCV2021_Winner_Solution
ModelTC/pyvlova
Yet another polyhedral compiler for deep learning
ModelTC/LPCV_2023_solution
ModelTC/Prototype
ModelTC/AAAI2023_EAMPD
AAAI2023 Efficient and Accurate Models towards Practical Deep Learning Baseline
ModelTC/L2_Compression
ModelTC/msbench
A tool for model sparsification based on torch.fx
ModelTC/FCPTS
ModelTC/Imagenet-S
Robustness for real-world system noise
ModelTC/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
ModelTC/pyrotom
Python Code Hotfix and Refactor on the fly
ModelTC/statecs
ModelTC/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance.
ModelTC/systemnoise_web
ModelTC/tvm-vit
ModelTC/UNRT
UNiversal RunTime
ModelTC/UP_LPCV2023_Plugin