tomoyukun's Stars
python/cpython
The Python programming language
deepmind/deepmind-research
This repository contains implementations and illustrative code to accompany DeepMind publications
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
microsoft/muzic
Muzic: Music Understanding and Generation with Artificial Intelligence
paperswithcode/releasing-research-code
Tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations)
facebookresearch/TimeSformer
The official PyTorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
SHI-Labs/Neighborhood-Attention-Transformer
Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022
YehLi/xmodaler
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
activitynet/ActivityNet
This repository is intended to host tools and demos for ActivityNet
jonbarron/robust_loss_pytorch
A PyTorch port of google-research/google-research/robust_loss/
ykasten/layered-neural-atlases
fxia22/stn.pytorch
PyTorch version of spatial transformer networks
chaoyuaw/pytorch-coviar
Compressed Video Action Recognition
vimeo/vimeo.py
Official Python library for the Vimeo API.
rishigami/Swin-Transformer-TF
TensorFlow implementation of the Swin Transformer model.
WenxueCui/Deep-Image-Compression-Video-Coding
Recent papers and codes related to deep learning/deep neural network based image compression and video coding framework.
gsssrao/youtube-8m-videos-frames
YouTube-8M Videos, Frames and IDs Generator. Extract videos from YouTube-8M. Extract frames from YouTube-8M.
esceptico/perceiver-io
Unofficial implementation of Perceiver IO
daniel-code/TubeViT
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
Rishit-dagli/Perceiver
Implementation of Perceiver, General Perception with Iterative Attention
ariG23498/mae-scalable-vision-learners
A TensorFlow 2.x implementation of Masked Autoencoders Are Scalable Vision Learners
UMBCvision/CompRess
Compressing Representations for Self-Supervised Learning
yuzhms/Streaming-Video-Model
[CVPR2023] Code for "Streaming Video Model"
StanLei52/GEBD
[ICCV2021] Generic Event Boundary Detection: A Benchmark for Event Segmentation
KimManjin/RSA
Official PyTorch Implementation of Relational Self-Attention, NeurIPS 2021
eyanq/sdr
Simple Digit Recognition OCR in OpenCV
danielgordon10/youtube8m-data
Extracted YouTube 8M URLs and Labels without all the TF Record parsing/features
oncescuandreea/QuerYD_downloader
RobMulla/helmet-assignment
Helper code for the 2021 Kaggle NFL Helmet Assignment Task
YongyiTang92/pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers