xinyuliu-jeffrey's Stars
megvii-research/FullMatch
Official implementation of FullMatch (CVPR 2023)
lllyasviel/Omost
Your image is almost there!
LLaVA-VL/LLaVA-NeXT
SLDGroup/EMCAD
Deaddawn/MovieLLM-code
franciszzj/TP-SIS
[NeurIPS 2023] Text Promptable Surgical Instrument Segmentation with Vision-Language Models
HJYao00/DenseConnector
Dense Connector for MLLMs
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
sunanhe/MedDr
A generalist foundation model for healthcare capable of handling diverse medical data modalities.
RPIDIAL/Disease-informed-VLM-Adaptation
MICCAI 2024 - Disease-informed Adaptation of Vision-Language Models
whwu95/FreeVA
FreeVA: Offline MLLM as Training-Free Video Assistant
HITsz-TMG/UMOE-Scaling-Unified-Multimodal-LLMs
Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"
THU-MIG/yolov10
YOLOv10: Real-Time End-to-End Object Detection
yuanli2333/Teacher-free-Knowledge-Distillation
Knowledge Distillation: CVPR2020 Oral, Revisiting Knowledge Distillation via Label Smoothing Regularization
boheumd/MA-LMM
(CVPR 2024) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Victorwz/LLaVA-Llama-3
Reproduction of LLaVA-v1.5 based on Llama-3-8b LLM backbone.
OpenBMB/MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
CAMMA-public/SSG-VQA
SSG-VQA is a Visual Question Answering (VQA) dataset on laparoscopic videos providing diverse, geometrically grounded, unbiased and surgical action-oriented queries generated using scene graphs.
THUDM/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
JieShibo/MemVP
[ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
synlp/R2-LLM
The official GitHub repository of the AAAI-2024 paper "Bootstrapping Large Language Models for Radiology Report Generation".
IDEA-Research/Grounding-DINO-1.5-API
API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
yan-hao-tian/VW
ICLR 2024 poster: Varying Window Attention
WeixiongLin/PMC-CLIP
The official code for "PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents"
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
himashi92/Co-BioNet
[Nature Machine Intelligence] Official PyTorch implementation of Uncertainty-Guided Dual-Views for Semi-Supervised Volumetric Medical Image Segmentation
sotiraslab/AgileFormer
This is the repo for the paper titled "AgileFormer: Spatially Agile Transformer UNet for Medical Image Segmentation"
iamhyc/Overleaf-Workshop
Open Overleaf/ShareLaTeX projects in VS Code, with full collaboration support.