chen-si-jia
PhD, Huazhong University of Science and Technology
chen-si-jia's Stars
HZAI-ZJNU/Mamba-YOLO
The official PyTorch implementation of “Mamba-YOLO: SSMs-based for Object Detection”
Boyiliee/LLaDA-AV
Driving Everywhere with Large Language Model Policy Adaptation
FeipengMa6/VLoRA
[NeurIPS 2024] Visual Perception by Large Language Model’s Weights
Fei-Long121/DeepBDC
The PyTorch code of "Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification", CVPR 2022 (Oral).
UARK-AICV/TrackGUI
Mark12Ding/SAM2Long
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree
chengche6230/ReST
[ICCV 2023] ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking
hou-yz/MVDet
[ECCV 2020] Codes and MultiviewX dataset for "Multiview Detection with Feature Perspective Transformation".
sungonce/SENet
Official PyTorch Implementation of Revisiting Self-Similarity: Structural Embedding for Image Retrieval, CVPR 2023
YehLi/xmodaler
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
PyRetri/PyRetri
Open source deep learning based unsupervised image retrieval toolbox built on PyTorch🔥
willard-yuan/awesome-cbir-papers
📝Awesome and classical image retrieval papers
abewley/sort
Simple, online, and realtime tracking of multiple objects in a video sequence.
WangzcBruce/DHD
PaddlePaddle/PaddleDetection
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Ruzim/NSFC-application-template-latex
Unofficial LaTeX template for the main body of a National Natural Science Foundation of China (NSFC) application (General Program)
Zplusdragon/CION_ReIDZoo
[NeurIPS 2024] Cross-video Identity Correlating for Person Re-identification Pre-training
Nightmare-n/DepthAnyVideo
Depth Any Video with Scalable Synthetic Data
Datacastle-Algorithm-Department/images
timesler/facenet-pytorch
Pretrained PyTorch face detection (MTCNN) and facial recognition (InceptionResnet) models
NExT-ChatV/NExT-Chat
The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".
ml-research/deictic-segment-anything
Segment Anything with Deictic Prompting
vision4robotics/PRL-Track
ayesha-ishaq/Open3DTrack
Code for Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking
jyrao/MatchTime
[EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation
leiurayer/downkyi
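downkyi (哔哩下载姬), a video download tool for the Bilibili website. It supports batch downloads and 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.).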
academicpages/academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
mbzuai-oryx/GeoChat
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.