Midkey

Midkey's Stars

jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
2k stars, 182 forks
ZHU-Zhiyu/NVS_Solver
Source code of paper "NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer"
Language:Python2072
NVlabs/SegFormer
Official PyTorch implementation of SegFormer
Language: Python, 2.4k stars, 334 forks
yoxu515/aot-benchmark
An efficient modular implementation of Associating Objects with Transformers for Video Object Segmentation in PyTorch
Language:Python592107
z-x-yang/Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
Language: Jupyter Notebook, 2.7k stars, 323 forks
z-x-yang/AOT
Associating Objects with Transformers for Video Object Segmentation
12710
983632847/Awesome-Multimodal-Object-Tracking
A personal investigative project to track the latest progress in the field of multi-modal object tracking.
374
laybebe/TATrans_SVOT
This is an implementation of “Target-aware Transformer for Satellite Video Object Tracking”
3
KindXiaoming/pykan
Kolmogorov Arnold Networks
Language: Jupyter Notebook, 13.9k stars, 1.2k forks
wangxiao5791509/VisEvent_SOT_Benchmark
[IEEE TCYB 2023] The first large-scale tracking dataset by fusing RGB and Event cameras.
Language:Python1099
RS-Devotee/OODT
OODT: Oriented Object Detection and Tracking in SVD
71
YZCU/OOTB
[ISPRS 2024] Satellite Video Single Object Tracking: A Systematic Review and An Oriented Object Tracking Benchmark
Language:C20
YZCU/LMOD
[Datasets] LMOD: a large-scale and multiclass object detection dataset for satellite videos
3
zcablii/SARDet_100K
Official implementation of MSFA and release of SARDet_100K dataset for Large-Scale Synthetic Aperture Radar (SAR) Object Detection
Language:Python28321
Wprofessor/SV248S_toolkit
Language:Python5
OpenSpaceAI/UVLTrack
The official PyTorch implementation of our AAAI 2024 paper "Unifying Visual and Vision-Language Tracking via Contrastive Learning"
Language:Python162
JacobYuan7/RLIPv2
[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
Language:Python1023
microsoft/RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
Language:Python67849
labring/FastGPT
FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you develop and deploy complex question-answering systems without extensive setup or configuration.
Language: TypeScript, 15.7k stars, 4.1k forks
hustvl/4DGaussians
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
Language: Jupyter Notebook, 1.9k stars, 154 forks
facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
Language: Python, 2.6k stars, 246 forks
embodied-generalist/embodied-generalist
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
Language:Python29527
Wprofessor/SVLPNet
2
state-spaces/mamba
Mamba SSM architecture
Language: Python, 11.9k stars, 985 forks
facebookresearch/habitat-sim
A flexible, high-performance 3D simulator for Embodied AI research.
Language: C++, 2.5k stars, 405 forks
FengZicai/Cluster3DSeg
This is the official implementation of "Clustering based Point Cloud Representation Learning for 3D Analysis" (Accepted at ICCV 2023).
Language:Python27
983632847/WebUAV-3M
WebUAV-3M: A million-scale multi-modal UAV tracking benchmark
483
EmbodiedGPT/EmbodiedGPT_Pytorch
Language:Python31529
google-deepmind/open_x_embodiment
Language:Jupyter Notebook66944
FengZicai/Interpretable3D
This is the official implementation of "Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds" (Accepted at AAAI 2024).
Language:Python4