Deep Speaker: an End-to-End Neural Speaker Embedding System https://arxiv.org/pdf/1705.02304.pdf
This project is still a WORK IN PROGRESS!
Work accomplished so far:
- Triplet loss
- Model implementation
- Define the inputs to the models
- Determine the sample rate
- Train the models
- We're going to use the LibriSpeech dataset, which has 2,300+ different speakers
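
For readers new to the triplet loss item above, here is a minimal sketch of a cosine-similarity triplet loss for a single (anchor, positive, negative) triplet of speaker embeddings. It is an illustration only and not this repository's code: the function names and the `margin=0.1` default are assumptions chosen for the example, and it uses plain Python lists instead of tensors.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge loss on similarities: the anchor should be closer to the
    positive (same speaker) than to the negative (different speaker)
    by at least `margin`. The margin value is an assumption."""
    s_ap = cosine_similarity(anchor, positive)  # anchor vs. same speaker
    s_an = cosine_similarity(anchor, negative)  # anchor vs. other speaker
    return max(0.0, s_an - s_ap + margin)


# An easy triplet contributes no loss; a badly ordered one is penalised.
easy = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In training, this quantity would be summed or averaged over many triplets in a batch; how triplets are selected is a separate design choice that this sketch does not cover.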
Please message me if you want to contribute; I'll be happy to hear your ideas. The paper leaves a lot of details undisclosed, such as:
- Input size to the network?
- How many filter banks do we use?
- Sample rate?
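
The three open questions above are linked: once a sample rate, a framing scheme, and a number of filter banks are chosen, the input size follows by arithmetic. The sketch below shows that relationship under stated assumptions. LibriSpeech audio is sampled at 16 kHz, but the 25 ms window, 10 ms hop, and 64 filters are placeholder values picked for illustration, not values confirmed by this project, and the helper names are hypothetical.

```python
def num_frames(duration_s, sample_rate=16000, win_ms=25, hop_ms=10):
    """Number of full analysis frames in a clip, with no padding."""
    n_samples = int(round(duration_s * sample_rate))
    win = sample_rate * win_ms // 1000  # window length in samples
    hop = sample_rate * hop_ms // 1000  # hop length in samples
    if n_samples < win:
        return 0
    return 1 + (n_samples - win) // hop


def input_shape(duration_s, n_filters=64, **framing):
    """(time frames, filter banks) for one clip; n_filters is assumed."""
    return (num_frames(duration_s, **framing), n_filters)


shape = input_shape(1.0)
print(shape)  # (98, 64) for a 1-second clip under the assumptions above
```

Changing any of the assumed values changes the shape, which is why these details need to be pinned down before the model inputs can be defined.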