Pinned Repositories
fluid
Fluid, elastic data abstraction and acceleration for BigData/AI applications in cloud. (Project under CNCF)
AVSegFormer
[AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer
Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
detr
End-to-End Object Detection with Transformers
mae
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
MYSCU_01
Test for "Sichuan University on the Cloud" (云上川大)
MMFuser
The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". MMFuser addresses the limitations of current MLLMs in capturing complex image details by simply yet efficiently integrating multi-layer features from ViTs.
MMInstruct
The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity". The MMInstruct dataset includes 973K instructions from 24 domains and four instruction types.
lll2343's Repositories
lll2343/AVSegFormer
[AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer
lll2343/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
lll2343/detr
End-to-End Object Detection with Transformers
lll2343/mae
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377