vmmm123's Stars
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
alibaba/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
ashawkey/stable-dreamfusion
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
open-mmlab/mmpose
OpenMMLab Pose Estimation Toolbox and Benchmark.
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
ChaoningZhang/MobileSAM
This is the official code for the MobileSAM project, which makes SAM lightweight for mobile applications and beyond!
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
cvlab-columbia/zero123
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
isl-org/ZoeDepth
Metric depth estimation from a single image
Ucas-HaoranWei/Vary
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
Jamie-Stirling/RetNet
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
kemaloksuz/ObjectDetectionImbalance
Lists the papers related to imbalance problems in object detection [TPAMI]
lucidrains/meshgpt-pytorch
Implementation of MeshGPT, SOTA mesh generation using attention, in PyTorch
shikras/shikra
SunzeY/AlphaCLIP
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
facebookresearch/ov-seg
This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.
hzxie/CityDreamer
The official implementation of "CityDreamer: Compositional Generative Model of Unbounded 3D Cities". (CVPR 2024)
TinyZeaMays/CircleLoss
PyTorch implementation of the paper "Circle Loss: A Unified Perspective of Pair Similarity Optimization"
mlfoundations/model-soups
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Jiawei-Yang/Denoising-ViT
This is the official code release for our work, Denoising Vision Transformers.
seasonSH/Probabilistic-Face-Embeddings
(ICCV 2019) Uncertainty-aware Face Representation and Recognition
fuenwang/Equirec2Perspec
A tool to project equirectangular panorama into perspective images
chenjun2hao/SRN.pytorch
Unofficial PyTorch implementation of Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
IDEA-Research/ED-Pose
[ICLR 2023] Official implementation of the paper "Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation"
ByungKwanLee/Full-Segment-Anything
PyTorch implementation that adds new features to the Segment-Anything code. The features support batched input on the full-grid prompt (automatic mask generation) with post-processing (removing duplicated or small regions and holes) under flexible input image sizes.
TongkunGuan/SIGA
[CVPR2023] Self-supervised Implicit Glyph Attention for Text Recognition
yuliangguo/OmniFusion
[CVPR 2022 Oral] Official PyTorch implementation of "OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion"