vhzy's Stars
MLNLP-World/Paper-Writing-Tips
A repository curated by the MLNLP community to help authors avoid small mistakes when submitting papers. Paper Writing Tips
AkariAsai/self-rag
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
ChenHsing/Awesome-Video-Diffusion-Models
[CSUR] A Survey on Video Diffusion Models
awesome-stable-diffusion/awesome-stable-diffusion
Curated list of awesome resources for the Stable Diffusion AI Model.
AlibabaResearch/DAMO-ConvAI
DAMO-ConvAI: the official repository containing the codebase for Alibaba DAMO Conversational AI.
facebookresearch/atlas
Code repository supporting the paper "Atlas: Few-shot Learning with Retrieval Augmented Language Models" (https://arxiv.org/abs/2208.03299)
AlonzoLeeeooo/awesome-text-to-image-studies
A collection of awesome text-to-image generation studies.
BradyFU/Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
EvolvingLMMs-Lab/LongVA
Long Context Transfer from Language to Vision
WengLean/hands-on-research-tutorial
"Hands-On Research": a step-by-step guide for research beginners on how to get started with AI research
boheumd/MA-LMM
(CVPR 2024) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
mbzuai-oryx/Video-LLaVA
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
ttengwang/Awesome_Long_Form_Video_Understanding
Awesome papers & datasets specifically focused on long-term videos.
darcula1993/diffusion-models-class-CN
Materials for the Hugging Face Diffusion Models Course
YueFan1014/VideoAgent
This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)
imagegridworth/IG-VLM
LDLINGLINGLING/MiniCPM_Series_Tutorial
Projects and tutorials for MiniCPM and MiniCPM-V, covering six topics: inference, quantization, edge deployment, fine-tuning, technical reports, and applications
Ziyang412/VideoTree
Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
LDLINGLINGLING/AutoPlan
This project hosts the code for AutoPlan (published in Acta Automatica Sinica), which uses large language models to perform task planning and task execution for complex tasks
ziplab/LongVLM
IVG-SZ/Flash-VStream
Please refer to our official repo at https://github.com/IVGSZ/Flash-VStream.
Liuziyu77/Soda
Search, organize, discover anything!
orrzohar/Video-STaR
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
Stanford-ILIAD/explore-eqa
Public release for "Explore until Confident: Efficient Exploration for Embodied Question Answering"
rxtan2/Koala-video-llm
kkahatapitiya/LangRepo
Language Repository for Long Video Understanding
kahnchana/mvu
Multimodal Video Understanding Framework (MVU)
declare-lab/Sealing
[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
Espere-1119-Song/Paper-Writing-Tips
This repository is the MLNLP community's curated collection for helping authors avoid small mistakes when submitting papers. Paper Writing Tips
lntzm/CVPR24Track-LongVideo