vmmm123's Stars

facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook 48.4k 313 680 5.7k
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python 23k 190 524 2.3k
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Language: Python 20.5k 307 1.4k 2.6k
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language: Python 13k 107 615 910
alibaba/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
Language: C++ 8.9k 198 2.6k 1.7k
ashawkey/stable-dreamfusion
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Language: Python 8.4k 129 301 740
open-mmlab/mmpose
OpenMMLab Pose Estimation Toolbox and Benchmark.
Language: Python 6k 56 1.5k 1.3k
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Language: Python 5.9k 51 308 479
ChaoningZhang/MobileSAM
This is the official code for the MobileSAM project, which makes SAM lightweight for mobile applications and beyond.
Language: Jupyter Notebook 4.9k 43 126 511
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
Language: Python 3.5k 30 786 1.1k
cvlab-columbia/zero123
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
Language: Python 2.8k 40 129 200
isl-org/ZoeDepth
Metric depth estimation from a single image
Language: Jupyter Notebook 2.4k 35 119 222
Ucas-HaoranWei/Vary
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
Language: Python 1.8k 54 136 161
Jamie-Stirling/RetNet
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
Language: Python 1.2k 13 26 101
kemaloksuz/ObjectDetectionImbalance
Lists the papers related to imbalance problems in object detection [TPAMI]
1.1k 71 4175
lucidrains/meshgpt-pytorch
Implementation of MeshGPT, SOTA Mesh generation using Attention, in PyTorch
Language: Python 787 17 7464
shikras/shikra
Language: Python 754 8 6645
SunzeY/AlphaCLIP
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Language: Jupyter Notebook 751 13 5447
facebookresearch/ov-seg
This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.
Language: Jupyter Notebook 703 13 3261
hzxie/CityDreamer
The official implementation of "CityDreamer: Compositional Generative Model of Unbounded 3D Cities". (CVPR 2024)
Language: Python 625 25 2943
TinyZeaMays/CircleLoss
PyTorch implementation of the paper "Circle Loss: A Unified Perspective of Pair Similarity Optimization"
Language: Python 456 4 3593
mlfoundations/model-soups
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Language: Python 436 10 1840
Jiawei-Yang/Denoising-ViT
This is the official code release for our work, Denoising Vision Transformers.
Language: Python 343 18 139
seasonSH/Probabilistic-Face-Embeddings
(ICCV 2019) Uncertainty-aware Face Representation and Recognition
Language: Python 343 9 1758
fuenwang/Equirec2Perspec
A tool to project equirectangular panorama into perspective images
Language: Python 286 6 1257
chenjun2hao/SRN.pytorch
Unofficial PyTorch implementation of Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
Language: Python 193 4 2338
IDEA-Research/ED-Pose
[ICLR 2023] Official implementation of the paper "Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation"
Language: Python 164 3 318
ByungKwanLee/Full-Segment-Anything
PyTorch implementation that adds new features to the Segment-Anything code: batched input for the full-grid prompt (automatic mask generation), with post-processing that removes duplicated or small regions and holes, under flexible input image sizes.
Language: Python 147 2 109
TongkunGuan/SIGA
[CVPR2023] Self-supervised Implicit Glyph Attention for Text Recognition
Language: Python 106 4 83
yuliangguo/OmniFusion
[CVPR 2022 Oral] Official PyTorch Implementation of "OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion"
Language: Jupyter Notebook 95 4 1214