serkansulun

Researcher at INESC TEC, Porto working on deep learning and music processing.

serkansulun's Stars

google-research/google-research
Google Research
Language:Jupyter Notebook34.6k 755 1.3k8k
openai/CLIP
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Language:Jupyter Notebook26.7k 325 4053.4k
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Language:Python20.5k 306 1.4k2.6k
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
Language:Jupyter Notebook10.1k 97 676980
cmhungsteve/Awesome-Transformer-Attention
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
4.7k 130 30490
open-mmlab/mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
Language:Python4.4k 60 901755
karan/Projects-Solutions
Links to others' solutions to Projects (https://github.com/karan/Projects/)
4.2k 239 121.3k
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Language:Python3.8k 41 160342
CSAILVision/places365
The Places365-CNNs for Scene Classification
Language:Python1.9k 58 94537
rmokady/CLIP_prefix_caption
Simple image captioning model
Language:Jupyter Notebook1.3k 7 79220
piergiaj/pytorch-i3d
Language:Python989 12 81252
jeffreyyihuang/two-stream-action-recognition
Using two stream architecture to implement a classic action recognition method on UCF101 dataset
Language:Python863 22 81249
AndreyGuzhov/AudioCLIP
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
Language:Python781 17 096
hassony2/kinetics_i3d_pytorch
Inflated i3d network with inception backbone, weights transferred from tensorflow
Language:Python532 14 27116
minzwon/sota-music-tagging-models
Language:Python405 8 1666
YuanGongND/ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
Language:Python368 7 3760
OpenGVLab/unmasked_teacher
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Language:Python306 12 4817
tbmoon/facenet
FaceNet for face recognition using pytorch
Language:Jupyter Notebook247 9 967
juansgomez87/datasets_emotion
This repository collects information about different data sets for Music Emotion Recognition.
227 2 223
lucidrains/bidirectional-cross-attention
A simple cross attention that updates both the source and target in one step
Language:Python159 4 212
sergiooramas/tartarus
Deep Learning for audio and text
Language:Python101 9 326
Dsqvival/hierarchical-structure-analysis
Algorithm and Data for paper "Automatic Detection of Hierarchical Structure and Influence of Structure on Melody, Harmony and Rhythm in Popular Music"
Language:Python89 2 310
Xeaver/EmotionCLIP
[CVPR 2023] Code for "Learning Emotion Representations from Verbal and Nonverbal Communication"
Language:Python42 2 57
nku-zhichengzhang/CTEN
[CVPR 2023] This is the official implementation of "Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network"
Language:Python36 2 81
Irurnnen/Songsterr-saver
Language:Python23 1 37
xiaobai1217/DomainAdaptation
CVPR2022
Language:Python20 3 11
ekazakos/MTCN
Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch
Language:Python18 3 14
m-bain/CondensedMovies-chall
Condensed Movies Challenge 2021
Language:Python17 3 42
jalexander1/Python_Course_Slideware
Intro to Python Course
16 4 017
dkrst/Multi_Label_Confusion_Matrix
Language:Jupyter Notebook1
