shallowdream66

I'm running hard to catch up with the person I once had high hopes for.

shallowdream66's Stars

xuanso/uav-206
Language:Python2
baaivision/Emu3
Next-Token Prediction is All You Need
Language: Python · 1.8k stars · 72 forks
zipper112/CDeFuse
Language:Python3
GXYM/STGT
Video-Language Alignment via Spatio–Temporal Graph Transformer; ArXiv: https://arxiv.org/abs/2407.11677
Language:Python10
PKU-YuanGroup/Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Language: Python · 3k stars · 220 forks
uark-cviu/Micron-BERT
[CVPR 2023] Micron-BERT: BERT-based Facial Micro-Expression Recognition
Language:Python12810
ssyze/EVE
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Language:Python91
Paranioar/Awesome_Matching_Pretraining_Transfering
A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.
40147
radarFudan/Awesome-state-space-models
Collection of papers on state-space models
55620
amusi/CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
18.3k stars · 2.6k forks
xai-org/grok-1
Grok open release
Language: Python · 49.6k stars · 8.3k forks
jpthu17/HBI
[CVPR 2023 Highlight] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
Language:Python1095
whwu95/Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Language:Python24020
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Language: Jupyter Notebook · 11.9k stars · 1.5k forks
