mscoco
There are 56 repositories under the mscoco topic.
microsoft/Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
SwinTransformer/Swin-Transformer-Object-Detection
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.
apple/ml-cvnets
CVNets: A library for training computer vision networks
peteanderson80/bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
HRNet/HRNet-Object-Detection
Object detection with multi-level representations generated from deep high-resolution representation learning (HRNetV2h). This is an official implementation for our TPAMI paper "Deep High-Resolution Representation Learning for Visual Recognition". https://arxiv.org/abs/1908.07919
JDAI-CV/CoTNet
This is an official implementation for "Contextual Transformer Networks for Visual Recognition".
sacmehta/EdgeNets
This repository contains the source code of our work on designing efficient CNNs for computer vision
hyz-xmaster/VarifocalNet
VarifocalNet: An IoU-aware Dense Object Detector
hyz-xmaster/swa_object_detection
SWA Object Detection
ViTAE-Transformer/ViTAE-Transformer
The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias" and [IJCV'22] "ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond"
MichiganCOG/ViP
Video Platform for Action Recognition and Object Detection in PyTorch
YehLi/ImageNetModel
Official ImageNet Model repository
hustvl/BMaskR-CNN
[ECCV 2020] Boundary-preserving Mask R-CNN
peteanderson80/SPICE
Semantic Propositional Image Caption Evaluation
HRNet/HRNet-FCOS
High-resolution Networks for the Fully Convolutional One-Stage Object Detection (FCOS) algorithm
ntrang086/image_captioning
Generate captions for images using a CNN-RNN model that is trained on the Microsoft Common Objects in COntext (MS COCO) dataset
610265158/mobilenetv3_centernet
A TensorFlow implementation of MobileNetV3 CenterNet, which can be easily deployed on Android (MNN) and iOS (CoreML).
Weed-AI/Weed-AI
A repository to support the development of a repository and interchange format for weed identification annotation
peteanderson80/coco-caption
Adds the SPICE metric to the coco-caption evaluation server code
lightly-ai/labelformat
A tool for converting computer vision label formats.
oswaldoludwig/visually-informed-embedding-of-word-VIEW-
Visually informed embedding of word (VIEW) is a tool for transferring multimodal background knowledge to NLP algorithms.
utahnlp/consistency
Implementation of models in our EMNLP 2019 paper: A Logic-Driven Framework for Consistency of Neural Models
ayansengupta17/GAN
We aim to generate realistic images from text descriptions using GAN architecture. The network that we have designed is used for image generation for two datasets: MSCOCO and CUBS.
gautamchitnis/cocoapi
Clone of COCO API - Dataset @ http://cocodataset.org/ - with changes to support Windows build and python3
deepplants/ViT-PCM
Official implementation of "Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation"
leftthomas/DeepMask
A Keras implementation of DeepMask based on the NIPS 2015 paper "Learning to Segment Object Candidates"
howardyclo/ImageNet2COCO
A demo for mapping class labels from ImageNet to COCO.
CLT29/semantic_neighborhoods
Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [ECCV 2020]
jakarto3d/jakarnotator
The Jakarnotator is an annotation tool for creating your own database for instance segmentation problems.
nayeem8527/Chitra-VarNan
Hindi Image Captioning
canesee-project/Arabic-COCO
MS COCO captions in Arabic
VladimirSinitsin/labelme_converter
LabelMe to MsCOCO, PascalVOC, Yolo
Lukeasargen/Show-Attend-and-Tell-Pytorch-Lightning
Encoder-Decoder CNN-LSTM Model with an attention mechanism for image captioning. Trained using the Microsoft COCO Dataset.
biyoml/PyTorch-SSD
PyTorch implementation of SSD: Single Shot MultiBox Detector.
shunk031/huggingface-datasets_COCOA
COCOA: Semantic Amodal Segmentation for huggingface datasets