zhouyuan888888

zhouyuan888888's Stars

ml-research/ledits_pp
Language:Python72
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language:Python11.9k836
AILab-CVC/UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Language:Python89553
Ucas-HaoranWei/Vary-toy
Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)
Language:Python58742
DLYuanGod/TinyGPT-V
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Language:Python1.2k75
Yuliang-Liu/Monkey
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Language:Python1.8k122
facebookresearch/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Language:Python1.8k107
fangwei123456/spikingjelly
SpikingJelly is an open-source deep learning framework for Spiking Neural Network (SNN) based on PyTorch.
Language:Python1.3k238
sustcsonglin/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Language:Python1.2k62
l0o0/jasminum
A Zotero add-on to retrieve CNKI metadata. A simple Zotero plugin for identifying Chinese metadata
Language:TypeScript5.2k280
windingwind/zotero-plugin-template
A plugin template for Zotero.
Language:TypeScript436114
windingwind/zotero-better-notes
Everything about note management. All in Zotero.
Language:TypeScript5.2k184
mit-han-lab/efficientvit
EfficientViT is a new family of vision models for efficient high-resolution vision.
Language:Python1.8k161
SHI-Labs/NATTEN
Neighborhood Attention Extension. Bringing attention to a neighborhood near you!
Language:Cuda34125
XuezheMax/fairseq-apollo
FairSeq repo with Apollo optimizer
Language:Python10815
cvpr-org/author-kit
Language:TeX20337
KwaiVGI/LivePortrait
Bring portraits to life!
Language:Python11.7k1.2k
LeapLabTHU/MLLA
Official repository of MLLA
Language:Python1676
hustvl/ViG
Language:Python811
Event-AHU/Mamba_State_Space_Model_Paper_List
[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications
57733
NVlabs/MambaVision
Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Language:Python70240
YoYo000/MVSNet
MVSNet (ECCV2018) & R-MVSNet (CVPR2019)
Language:Python1.4k303
ewrfcas/MVSFormer
Codes of MVSFormer: Multi-View Stereo by Learning Robust Image Features and Temperature-based Depth (TMLR2023)
Language:Python18010
doubleZ0108/GeoMVSNet
[CVPR 23'] GeoMVSNet: Learning Multi-View Stereo with Geometry Perception
Language:Python1542
cvlab-columbia/zero123
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
Language:Python2.6k191
HengLan/CGSTVG
[CVPR 2024] Context-Guided Spatio-Temporal Video Grounding
Language:Python373
Audio-WestlakeU/ATST-SED
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
Language:Jupyter Notebook8011
frednam93/FDY-SED
Language:Python7610
THU-LYJ-Lab/T3Bench
T3Bench: Benchmarking Current Progress in Text-to-3D Generation
Language:Python1.1k10
aimagelab/multimodal-garment-designer
This is the official repository for the paper "Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing". ICCV 2023
Language:Python40347