Single or Multiple Speaker Detection

Slide

https://docs.google.com/presentation/d/10Jpm2rLMkhuI4_mkF0qBUQUKfVoF-u6_oSzNxOy_LfY/edit?usp=sharing

Survey

sgle_mtp_spk_diarization

Dataset

Original source:

Mixture Dataset

Algorithm

mixing algorithm overlap

mixing algorithm non overlap
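The two mixing modes named above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `mix_non_overlap`, `mix_overlap`, and `peak_normalize`, and the `gap` and `offset` parameters, are assumptions made for this example, and waveforms are plain lists of float samples at a shared sample rate.

```python
def mix_non_overlap(a, b, gap=0):
    """Place utterance b after utterance a, separated by `gap` silent samples.

    Hypothetical helper: the speakers never talk at the same time.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    return list(a) + [0.0] * gap + list(b)


def mix_overlap(a, b, offset):
    """Sum utterance b onto utterance a, starting `offset` samples into a.

    Hypothetical helper: samples in the shared region are added together,
    so both speakers are audible at once there.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    out = [0.0] * max(len(a), offset + len(b))
    for i, x in enumerate(a):
        out[i] += x
    for j, y in enumerate(b):
        out[offset + j] += y
    return out


def peak_normalize(x, peak=1.0):
    """Scale the mixture down only if its largest magnitude exceeds `peak`."""
    m = max((abs(v) for v in x), default=0.0)
    if m <= peak:
        return list(x)
    return [v * peak / m for v in x]
```

Summing overlapped samples can push the mixture outside the valid amplitude range, which is why the sketch includes a peak normalization step. Whether the real dataset applies one is not stated here.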

Source

Code to create mixture dataset

Image Recognition task

Code to train ConvNeXt

Code to train Conformer

Speaker Diarization

Speech Activity Detection:
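As a rough illustration of what this stage does, the sketch below marks each fixed-length frame as speech or non-speech by thresholding its mean energy. This is an assumed baseline for explanation only. The project may well use a trained model instead, and the names `frame_energies` and `detect_speech` and the default `frame_len` and `threshold` values are hypothetical.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame of `frame_len` samples."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech(samples, frame_len=400, threshold=1e-3):
    """Return one boolean per frame: True where the energy exceeds the threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]
```

Frames flagged as speech would then be passed on to the speaker embedding stage. Trailing samples that do not fill a whole frame are ignored in this sketch.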

Speaker Embedding

Clustering:
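One way to turn segment-level speaker embeddings into a single-versus-multiple decision is to cluster them and count the clusters. The sketch below uses average-linkage agglomerative clustering on cosine distance with a stopping threshold. It is an assumption about how this step might look, not the project's confirmed method: `cosine_distance`, `cluster_embeddings`, and the `threshold` default are hypothetical names and values.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 1.0
    return 1.0 - dot / (nu * nv)


def cluster_embeddings(embeddings, threshold=0.5):
    """Merge the closest pair of clusters until none is closer than `threshold`.

    Returns a list of clusters, each a list of indices into `embeddings`.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(
            cosine_distance(embeddings[i], embeddings[j]) for i in c1 for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Under this sketch, a recording would be labeled multi-speaker when `len(cluster_embeddings(embs)) > 1`. The threshold would need tuning on held-out data.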

Wandb experiment logging (train, test)