serkansulun's Stars
google-research/google-research
Google Research
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
timesler/facenet-pytorch
Pretrained PyTorch face detection (MTCNN) and facial recognition (InceptionResnet) models
clovaai/deep-text-recognition-benchmark
Text recognition (optical character recognition) with deep learning methods, ICCV 2019
CSAILVision/places365
The Places365-CNNs for Scene Classification
WuJie1010/Facial-Expression-Recognition.Pytorch
A CNN-based PyTorch implementation of facial expression recognition (FER2013 and CK+), achieving 73.112% (state-of-the-art) on FER2013 and 94.64% on the CK+ dataset
idiap/fast-transformers
PyTorch library for fast transformer implementations
qiuqiangkong/audioset_tagging_cnn
rmokady/CLIP_prefix_caption
Simple image captioning model
bearpelican/musicautobot
Using deep learning to generate music in MIDI format.
minzwon/sota-music-tagging-models
wzk1015/video-bgm-generation
[ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer
YatingMusic/compound-word-transformer
Official implementation of compound word transformer (AAAI'21)
saahiluppal/catr
Image Captioning Using Transformer
juansgomez87/datasets_emotion
This repository collects information about different data sets for Music Emotion Recognition.
yaoing/DAN
Official implementation of DAN
sergiooramas/tartarus
Deep Learning for audio and text
Dsqvival/hierarchical-structure-analysis
Algorithm and Data for paper "Automatic Detection of Hierarchical Structure and Influence of Structure on Melody, Harmony and Rhythm in Popular Music"
vlgiitr/Group-Level-Emotion-Recognition
Model submitted for the ICMI 2018 EmotiW Group-Level Emotion Recognition Challenge
AmirSh15/FECNet
Facial Expression Feature Extractor
ZZWaang/polyphonic-chord-texture-disentanglement
The repository of the paper: Wang et al., Learning interpretable representation for controllable polyphonic music generation, ISMIR 2020.
Irurnnen/Songsterr-saver
m-bain/CondensedMovies-chall
Condensed Movies Challenge 2021
jwehrmann/lmtd
Labeled Movie Trailer Dataset
sudongtan/synesthesia
yagyapandeya/Music_Video_Emotion_Dataset
Music video emotion dataset
Ashima-19/Deep-learning-based-Movie-Trailer-Genre-Classification-
A novel deep affect-based movie trailer genre classification framework
mmathys/songsterr-crawler
Crawler for Songsterr