Pinned Repositories
3D-convolutional-speaker-recognition
Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
AID
algo-class-assignments
Programming assignments for Stanford online Design and Analysis of Algorithms course
articles
thoughts on programming
asv-subtools
Open Source Tools for Speaker Recognition
athena
An open-source implementation of a sequence-to-sequence-based speech processing engine
audfprint
Landmark-based audio fingerprinting
Mistral-Speaker-Recognition-Tutorial
Experimenting with Speaker Verification and Recognition using Mistral (a.k.a. Alize)
Podcastmix
PodcastMix: a dataset for separating music and speech in podcasts.
TFGAN-PLC
A Temporal-Spectral Generative Adversarial Network based End-to-end Packet Loss Concealment for Wideband Speech Transmission
AIDman's Repositories
AIDman/TFGAN-PLC
A Temporal-Spectral Generative Adversarial Network based End-to-end Packet Loss Concealment for Wideband Speech Transmission
AIDman/athena
An open-source implementation of a sequence-to-sequence-based speech processing engine
AIDman/Podcastmix
PodcastMix: a dataset for separating music and speech in podcasts.
AIDman/Crazy-shooter-alarm-video-sentiment-analysis
AIDman/D-TDNN
PyTorch implementation of Densely Connected Time Delay Neural Network
AIDman/deep-speaker
Deep Speaker: an End-to-End Neural Speaker Embedding System.
AIDman/DeepLip
Deep-learning-based audio-visual lip biometrics
AIDman/distribution_augmentation
Code for the paper, "Distribution Augmentation for Generative Modeling", ICML 2020.
AIDman/e2e_dialect
End-to-end dialect classification
AIDman/espnet
End-to-End Speech Processing Toolkit
AIDman/Forward
A library for high performance deep learning inference on NVIDIA GPUs.
AIDman/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
AIDman/lingvo
Lingvo
AIDman/MaskSpec
PyTorch implementation of the paper "Masked Spectrogram Prediction for Self-Supervised Audio Pre-Training"
AIDman/MLP_Scratch_Python
Implement MLP from Scratch using Python
AIDman/MoneyPrinterTurbo
Generate high-definition short videos with one click using large AI models (LLMs).
AIDman/mslearn-use-git-from-vs-code
Sample code for MS Learn module
AIDman/MTL-Speaker-Embeddings
Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" accepted at Interspeech 2021
AIDman/NeMo
NeMo: a toolkit for conversational AI
AIDman/PaSST
Efficient Training of Audio Transformers with Patchout
AIDman/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
AIDman/pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
AIDman/SASVC2022_Baseline
Baseline for the Spoofing-aware Speaker Verification Challenge 2022
AIDman/scholarly
Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
AIDman/speaker-id
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
AIDman/SpeakerEmbeddingLossComparison
Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP 2020
AIDman/SpeakerProfiling
Estimating the age, height, and gender of a speaker from their speech signal.
AIDman/StarGANv2-VC
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
AIDman/w2v2-speaker-few-samples
Research code for the paper "Training speaker recognition systems with limited data" at https://arxiv.org/abs/2203.14688
AIDman/wespeaker
Production-First and Production-Ready Speaker Recognition Toolkit