david-gimeno
I am a PhD student in Computer Science at Universitat Politècnica de València. My interests are Speech Technologies, Computer Vision, and Affective Computing.
Pattern Recognition and Human Languages Technology, Research Center, Valencia, Spain
Pinned Repositories
multimodal-depression-from-video
Official source code for the paper "Reading Between the Frames: Multi-Modal Non-Verbal Depression Detection in Videos"
Evaluation_of_End-to-End_Continuous_Spanish_Lipreading_in_Different_Data_Conditions
Visual Speech Recognition for Spanish
LIP-RTVE
An Audiovisual Database for Continuous Spanish in the Wild
lipreading-thesis
Official source code developed for my Ph.D Thesis Dissertation: "Contributions to Automatic Lipreading for Spanish"
tailored-avsr
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
espnet
End-to-End Speech Processing Toolkit
av_hubert
A self-supervised learning framework for audio-visual speech
AnnoTheia
AnnoTheia is a data annotation toolkit that identifies when a person speaks in a scene and transcribes their speech, also offering flexibility to replace modules for different languages.
captum
Model interpretability and understanding for PyTorch
Fotoapparat
Making Camera for Android more friendly. 📸
david-gimeno's Repositories
david-gimeno/LIP-RTVE
An Audiovisual Database for Continuous Spanish in the Wild
david-gimeno/lipreading-thesis
Official source code developed for my Ph.D Thesis Dissertation: "Contributions to Automatic Lipreading for Spanish"
david-gimeno/tailored-avsr
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
david-gimeno/Evaluation_of_End-to-End_Continuous_Spanish_Lipreading_in_Different_Data_Conditions
Visual Speech Recognition for Spanish