taoyao1221's Stars
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
pytorch/captum
Model interpretability and understanding for PyTorch
FMInference/FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
microsoft/LLMLingua
Speeds up LLM inference and enhances the model's perception of key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss.
SpursGoZmy/Tabular-LLM
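This project aims to collect open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), convert the raw data into instruction-tuning format, and fine-tune LLMs on it, thereby improving LLMs' understanding of tabular data and ultimately building large language models specialized for table intelligence tasks.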
jacobgil/pytorch-grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
raoyongming/DynamicViT
[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
THUDM/LongBench
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
tech-srl/layer_norm_expressivity_role
Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL'2023)
luo3300612/Visualizer
Helper tools for attention visualization in deep learning
dvlab-research/LongLoRA
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
huggingface/transformers-bloom-inference
Fast Inference Solutions for BLOOM
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
bigscience-workshop/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
allenai/longformer
Longformer: The Long-Document Transformer
microsoft/UDOP
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
doccano/doccano
Open source annotation tool for machine learning practitioners.
chrismattmann/tika-python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
daodao97/chatdoc
Chat with your documents using OpenAI
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
wangwen-whu/WTW-Dataset
This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild" on table detection and table structure recognition.
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
fundamentalvision/Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Academic-Hammer/SciTSR
Table structure recognition dataset of the paper: Complicated Table Structure Recognition
MaxKinny222/TabRecSet
A large scale camera-taken table detection and recognition dataset.