Pinned Repositories
3D-convolutional-speaker-recognition-pytorch
Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
attention-module
Official PyTorch code for "BAM: Bottleneck Attention Module (BMVC2018)" and "CBAM: Convolutional Block Attention Module (ECCV2018)"
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Bag_of_Tricks_for_Image_Classification_with_Convolutional_Neural_Networks
Experiments on the paper "Bag of Tricks for Image Classification with Convolutional Neural Networks" and other useful tricks to improve CNN accuracy
Factorized-TDNN
PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi
pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
SpeakerRecognition_tutorial
Simple d-vector-based Speaker Recognition (verification and identification) using PyTorch
VAD-python
Voice Activity Detector in Python
Y-vector
Y-vector: Multiscale Waveform Encoder for Speaker Embedding
swhan9873's Repositories
swhan9873/Factorized-TDNN
PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi
swhan9873/pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
swhan9873/Y-vector
Y-vector: Multiscale Waveform Encoder for Speaker Embedding
swhan9873/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
swhan9873/byol-pytorch
Usable implementation of "Bootstrap Your Own Latent" self-supervised learning, from DeepMind, in PyTorch
swhan9873/Conference-Acceptance-Rate
Statistics of acceptance rates for the main AI conferences
swhan9873/Conv-TasNet-1
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
swhan9873/Conv-TasNet-3
PyTorch implementation of "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"
swhan9873/data-driven-harmonic-filters
swhan9873/Dual-Path-RNN-Pytorch
Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch
swhan9873/ECANet
Code for ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
swhan9873/ECAPA-TDNN
swhan9873/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
swhan9873/keras-attention-mechanism
Attention mechanism Implementation for Keras.
swhan9873/keras-tcn
Keras Temporal Convolutional Network.
swhan9873/meta-SR
PyTorch implementation of "Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs" (Interspeech, 2020)
swhan9873/meta-tasnet
A PyTorch implementation of Meta-TasNet from "Meta-learning Extractors for Music Source Separation".
swhan9873/pase
Problem Agnostic Speech Encoder
swhan9873/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding
swhan9873/pytorch-gradual-warmup-lr
Gradually-Warmup Learning Rate Scheduler for PyTorch
swhan9873/pytorch-kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
swhan9873/pytorch-kaldi-neural-speaker-embeddings
Lightweight neural speaker embedding extraction based on Kaldi and PyTorch.
swhan9873/pytorch-loss
label-smooth, amsoftmax, focal-loss, triplet-loss. Maybe useful
swhan9873/RawNet
Reproducing RawNet paper with Keras and additional experiments with PyTorch.
swhan9873/Res2Net-PretrainedModels
(ImageNet pretrained models) The official PyTorch implementation of the TPAMI paper "Res2Net: A New Multi-scale Backbone Architecture"
swhan9873/Speech-Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
swhan9873/TDNN-1
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
swhan9873/the-incredible-pytorch
The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.
swhan9873/torch-plda
PyTorch implementation of PLDA as described in https://ravisoji.com/assets/papers/ioffe2006probabilistic.pdf
swhan9873/voxceleb_trainer
In defence of metric learning for speaker recognition