ferugit
Ph.D. student in AI, Speech and Audio Technologies at Universidad Autónoma de Madrid | AI Research and Prototyping @Telefonica's Discovery Innovation Team
Telefónica
Madrid, Spain
Pinned Repositories
Audiomer-PyTorch
A Convolutional Transformer for Keyword Spotting (AAAI 2022 DSTC Workshop)
crnn-audio-classification
UrbanSound classification using Convolutional Recurrent Networks in PyTorch
ctc-loss
A PyTorch implementation of CTCLoss (for learning purposes)
ctc-segmentation
Segment an audio file and obtain utterance alignments. (Python package)
iterative-pseudo-forced-alignment-ctc
Code for the paper at https://arxiv.org/pdf/2210.15226.pdf
quark
Efficient Keyword Spotting
sonopytorch
Torch implementation of Sonopy
speechbrain
A PyTorch-based Speech Toolkit
transformer-corrector
Transformer-based Spanish corrector
lullaby-generation-spanish
Notebooks used by the Menara, Gonimix y Andino team in the AI Song Contest 2021 to generate lullabies in Spanish.
ferugit's Repositories
ferugit/iterative-pseudo-forced-alignment-ctc
Code for the paper at https://arxiv.org/pdf/2210.15226.pdf
ferugit/ctc-loss
A PyTorch implementation of CTCLoss (for learning purposes)
ferugit/Audiomer-PyTorch
A Convolutional Transformer for Keyword Spotting (AAAI 2022 DSTC Workshop)
ferugit/crnn-audio-classification
UrbanSound classification using Convolutional Recurrent Networks in PyTorch
ferugit/ctc-segmentation
Segment an audio file and obtain utterance alignments. (Python package)
ferugit/degan
Deep Effect Generation using GANs
ferugit/denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). A PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, which presents a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, both stationary and non-stationary, as well as room reverb. The paper also suggests a set of data augmentation techniques, applied directly to the raw waveform, that further improve model performance and generalization.
ferugit/quark
Efficient Keyword Spotting
ferugit/sonopytorch
Torch implementation of Sonopy
ferugit/speechbrain
A PyTorch-based Speech Toolkit
ferugit/transformer-corrector
Transformer-based Spanish corrector
ferugit/DESED_task
Domestic environment sound event detection task
ferugit/diart
Lightweight Python library for real-time streaming speaker diarization, implemented in PyTorch
ferugit/EfficientAT
Efficient CNNs for audio tagging, with AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
ferugit/examples
TensorFlow examples
ferugit/ferugit.github.io
W. Fernando López Gavilánez's public page
ferugit/performer-pytorch
An implementation of Performer, a linear attention-based transformer, in PyTorch
ferugit/pytorch_introduction
Several PyTorch projects
ferugit/RugPullDetection
ferugit/speaker-recognition-exploration
Speaker Recognition Exploration
ferugit/ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
ferugit/transducer-tutorial
Example code for a neural transducer model.
ferugit/udacity-deep-learning
Udacity Deep Learning Course
ferugit/wuw-challenge-2024
Baseline of the Wake-up Word Challenge of the 2024 Albayzin Evaluations