xiangqi1997's Stars
Thinklab-SJTU/Awesome-LLM4AD
A curated list of awesome LLM for Autonomous Driving resources (continually updated)
OpenGVLab/GUI-Odyssey
GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 201 apps, and 1.4K app combos.
mit-han-lab/qserve
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
DefTruth/Awesome-LLM-Inference
📖A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
FoundationVision/GenerateU
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
gligen/GLIGEN
Open-Set Grounded Text-to-Image Generation
AILab-CVC/YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
shikras/shikra
openai/CLIP
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
jianzongwu/Awesome-Open-Vocabulary
(TPAMI 2024) A Survey on Open Vocabulary Learning
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Atten4Vis/MS-DETR
[CVPR 2024] The official implementation for "MS-DETR: Efficient DETR Training with Mixed Supervision"
JHYu-Researcher/Eyemovement-Decision
taodeng/CDNN-traffic-saliency
IDEA-Research/detrex
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
exiawsh/StreamPETR
[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
zli2014/python--
Summary, improvements, and compilation
buyizhiyou/NRVQA
No-reference image/video quality assessment (BRISQUE/NIQE/PIQE/DIQA/deepBIQ/VSFA)
chaofengc/IQA-PyTorch
👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including LPIPS, FID, NIQE, NRQM(Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...
chaofengc/Awesome-Image-Quality-Assessment
A comprehensive collection of IQA papers
fengyang0317/unsupervised_captioning
Code for Unsupervised Image Captioning
tangbinh/image-captioning