Can we train custom keywords?
Closed this issue · 2 comments
Yes, you should be able to, if you have the data.
If you go to the sample config, the very first section is like this:
data_root: ./data/
train_list_file: ./data/training_list.txt
val_list_file: ./data/validation_list.txt
test_list_file: ./data/testing_list.txt
label_map: ./data/label_map.json
First, you need to arrange your own data in a class-folder structure under some root folder.
./data/
├── keyword_a/
│   ├── a.wav
│   ├── b.wav
│   ├── ...
│   └── ...
├── keyword_b/
├── ...
├── ...
└── keyword_n/
You then need three .txt files to define your training, validation, and test sets. Each file is simply a list of audio file paths, one per line.
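If it helps, the three list files can be generated from the class-folder structure with a small script like this. It's just a sketch: the function name, the 80/10/10 split, and the output file names (matching the sample config) are my assumptions, not part of the repo.

```python
import os
import random


def make_split_lists(data_root="./data", seed=0):
    """Scan class folders under data_root and write train/val/test list files.

    Uses a hypothetical 80/10/10 split; adjust the ratios to taste.
    """
    random.seed(seed)
    wavs = []
    for cls in sorted(os.listdir(data_root)):
        cls_dir = os.path.join(data_root, cls)
        if not os.path.isdir(cls_dir):
            continue  # skip the list files themselves, etc.
        for f in sorted(os.listdir(cls_dir)):
            if f.endswith(".wav"):
                wavs.append(os.path.join(cls_dir, f))

    random.shuffle(wavs)
    n = len(wavs)
    n_val, n_test = int(0.1 * n), int(0.1 * n)
    splits = {
        "validation_list.txt": wavs[:n_val],
        "testing_list.txt": wavs[n_val:n_val + n_test],
        "training_list.txt": wavs[n_val + n_test:],
    }
    for name, paths in splits.items():
        with open(os.path.join(data_root, name), "w") as fh:
            fh.write("\n".join(paths))
    return splits
```

Shuffling before splitting keeps each split roughly class-balanced without any extra bookkeeping.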
Then you need to provide a label_map.json file, which maps each class_id to its keyword class. Something like:
{"0": "hello", "1": "world", "2": "ready", "3": "something"}
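Since the class names are already encoded in the folder names, the label map can also be generated rather than written by hand. A minimal sketch (the function name and sorted-alphabetical id assignment are my assumptions):

```python
import json
import os


def write_label_map(data_root="./data", out_path="./data/label_map.json"):
    """Map string class ids ("0", "1", ...) to keyword names taken from
    the class folder names under data_root, and write them as JSON."""
    classes = sorted(
        d for d in os.listdir(data_root)
        if os.path.isdir(os.path.join(data_root, d))
    )
    label_map = {str(i): name for i, name in enumerate(classes)}
    with open(out_path, "w") as fh:
        json.dump(label_map, fh, indent=4)
    return label_map
```

A handy side effect: `len(label_map)` gives you the num_classes value the config needs.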
Inside the config you will use, remember to also update the number of classes. If you have, for example, 15 keywords, you have to set num_classes to 15.
Tricky part:
This repo was made with Speech Commands dataset. So it kind of expects audio keywords to be <= 1 second by default, and either pads or trims everything to 1 s.
If your audio clips are similar in length to those of Speech Commands, then there is no problem.
However, if you have keyword clips which are longer (e.g. ~2 s), or if you want to train with some specific audio clip length and spectrogram size, then I will have to make some small changes in the repository. (I can possibly get it done on the weekend if I have time, shouldn't be too time consuming.)
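For reference, the fixed-length behaviour described above is essentially a pad-or-trim: at a 16 kHz sample rate, 1 second is 16000 samples. This is an illustrative stand-in, not the repo's exact code (which operates on arrays/tensors rather than plain lists):

```python
def pad_or_trim(samples, target_len=16000):
    """Trim a waveform to target_len samples, or zero-pad it on the right.

    At 16 kHz, target_len=16000 corresponds to 1 second of audio.
    """
    if len(samples) >= target_len:
        return samples[:target_len]  # trim longer clips
    return samples + [0.0] * (target_len - len(samples))  # pad shorter clips
```

So a ~2 s clip would currently lose its second half, which is why longer keywords need the repository changes mentioned above.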
@ID56 Thank you so much for the detailed answer, got it!
Yes, please update it if you find the time.
Once again, thank you for your reply.