Pinned Repositories
asr-evaluation
Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).
conformer
PyTorch implementation of Conformer with a training script for end-to-end speech recognition on the LibriSpeech dataset.
conformerLucidrains
Implementation of the convolutional module from the Conformer paper, for use in Transformers
conformerModel
PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
id-nlp-resource
A list of Indonesian NLP resources.
ivector-xvector
Extract x-vectors and i-vectors under Kaldi.
maps_reproducible
Reproducible Research documentation for MaPS-f0
mcd
Mel cepstral distortion (MCD) computations in Python.
mellotron
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
MOSNet
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
TitiAffandi's Repositories
TitiAffandi/asr-evaluation
Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).
TitiAffandi/conformer
PyTorch implementation of Conformer with a training script for end-to-end speech recognition on the LibriSpeech dataset.
TitiAffandi/conformerLucidrains
Implementation of the convolutional module from the Conformer paper, for use in Transformers
TitiAffandi/conformerModel
PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
TitiAffandi/id-nlp-resource
A list of Indonesian NLP resources.
TitiAffandi/ivector-xvector
Extract x-vectors and i-vectors under Kaldi.
TitiAffandi/maps_reproducible
Reproducible Research documentation for MaPS-f0
TitiAffandi/mcd
Mel cepstral distortion (MCD) computations in Python.
TitiAffandi/mellotron
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
TitiAffandi/MOSNet
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
TitiAffandi/multi-speaker-tacotron
VCTK multi-speaker Tacotron for ICASSP 2020
TitiAffandi/multi-speaker-tacotron-tensorflow
Multi-speaker Tacotron in TensorFlow.
TitiAffandi/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
TitiAffandi/probing-TTS-models
Link to paper: https://arxiv.org/abs/1912.10915
TitiAffandi/pytorch-kaldi-neural-speaker-embeddings
Lightweight neural speaker embedding extraction based on Kaldi and PyTorch.
TitiAffandi/PyTorch_Speaker_Verification
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
TitiAffandi/sas-python-work
TitiAffandi/tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
TitiAffandi/tacotron2-ZS
TitiAffandi/TTS
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
TitiAffandi/waveglow
A Flow-based Generative Network for Speech Synthesis
TitiAffandi/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
TitiAffandi/whisper-finetune
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face.