OleguerCanal's Stars
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
huggingface/diarizers
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)
clovaai/voxceleb_trainer
In defence of metric learning for speaker recognition
Jungjee/RawNet
Official repository for RawNet, RawNet2, and RawNet3
csukuangfj/transducer-loss-benchmarking
k2-fsa/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
k2-fsa/fast_rnnt
A torch implementation of a recursion which turns out to be useful for RNN-T.
krstopro/quantum-computing-cheat-sheet
Quantum Computing Cheat Sheet
HarunoriKawano/BEST-RQ
Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in PyTorch.
xai-org/grok-1
Grok open release
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
gavi/mlx-whatsapp
An MLX project to train a base model on your WhatsApp chats using (Q)LoRA finetuning
state-spaces/mamba
Mamba SSM architecture
johnma2006/mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
Vaibhavs10/insanely-fast-whisper
weghornlab/SigNet
Mutational signature fitting with an ANN
OpenNLPLab/Tnn
[ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling
SawyerHood/draw-a-ui
Draw a mockup and generate HTML for it
wq2012/SpectralCluster
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
yanghaha0908/FastHuBERT
Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
apptek/SubER
SubER - Subtitle Edit Rate
Edresson/YourTTS
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
mhagiwara/github-typo-corpus
GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors
kensho-technologies/pyctcdecode
A fast and lightweight Python-based CTC beam search decoder for speech recognition.
zh217/torch-asg
Auto Segmentation Criterion (ASG) implemented in PyTorch
gtn-org/gtn
Automatic differentiation with weighted finite-state transducers.
py-pdf/pypdf
A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
idiap/fast-transformers
PyTorch library for fast transformer implementations