ha-ov's Stars
WendellGul/DCMH
PyTorch implementation for paper "Deep Cross-Modal Hashing"
BruceW91/CVSE
The official source code for the paper Consensus-Aware Visual-Semantic Embedding for Image-Text Matching (ECCV 2020)
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
jina-ai/clip-as-service
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
kuanghuei/SCAN
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
google-research/vision_transformer
dk-liang/Awesome-Visual-Transformer
A collection of papers on transformers for vision. Awesome Transformer with Computer Vision (CV)
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
amzn/image-to-recipe-transformers
Code for CVPR 2021 paper: Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning
dandelin/ViLT
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
XMUNLP/Tagger
Deep Semantic Role Labeling with Self-Attention
Atmegal/DGCPN
Deep Graph-neighbor Coherence Preserving Network for Unsupervised Cross-modal Hashing
zhouyu1996/DAQN
An implementation of our paper “Deep Adversarial Quantization Network for Cross-Modal Retrieval”
shivram1987/VisionTransformerHashing
lukemelas/PyTorch-Pretrained-ViT
Vision Transformer (ViT) in PyTorch
facebookresearch/mae
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
sksq96/pytorch-summary
Model summary in PyTorch similar to `model.summary()` in Keras
cocodataset/cocoapi
COCO API - Dataset @ http://cocodataset.org/
WZMIAOMIAO/deep-learning-for-image-processing
Deep learning for image processing, including classification, object detection, etc.
akshitac8/BiAM
[ICCV 2021] Official PyTorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning. SOTA results on NUS-WIDE and OpenImages
KaiserLew/JDSH
Joint-modal Distribution-based Similarity Hashing for Large-scale Unsupervised Deep Cross-modal Retrieval
zs-zhong/DJSRH
The code for Deep Joint-Semantics Reconstructing Hashing for Large-Scale Unsupervised Cross-Modal Retrieval (ICCV 2019)
Huyp777/CMHN
Cross-Modal Hashing for Efficiently Retrieving Moments in Videos
uta-smile/TCL
Code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022
MIT-LCP/wfdb-python
Native Python WFDB package
TorchSSL/TorchSSL
A PyTorch-based library for semi-supervised learning (NeurIPS'21)
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
yitu-opensource/T2T-ViT
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet