Tony607/Keras-Trigger-Word

How to generate X,Y dataset?

madan0511 opened this issue · 2 comments

How do you create the X, Y dataset for training? Should we collect the arrays for the voice clips and save those, or should we create an array file from train.wav?

That depends on your goal. If you just want to use the model for the "activate" word, follow the tutorial exactly and download the Data.zip linked in the Readme.
If you want to develop a custom model with your own trigger word, creating the training data involves these steps:

  • In raw_data/activates, put the new wav files that contain your trigger word. The more files you add, the better. Leave the other folders as they are.
  • Modify the In [19] cell: where the number_of_activates variable is initialized as a random integer between 0 and 5, replace the 5 with the number of custom wav files you added to the activates folder.
  • In the notebook, instead of loading the data as in cell In [23], you could try something similar to this code:

X, Y = [], []
for background in backgrounds:
    # generate 100 synthetic examples per background clip
    for i in range(0, 100):
        x, y = create_training_example(background, activates, negatives)
        X.append(x)
        Y.append(y)

  • This fills the X and Y lists with all your examples. Convert both with the np.array function, then bring them to the shapes the model needs: (nb_observations, 5511, 101) for X and (nb_observations, 1375, 1) for Y, where nb_observations is the length of the X and Y lists. np.reshape works for Y if each y comes back as (1, 1375). If each x comes back from create_training_example as (101, 5511), swap the last two axes of X with np.swapaxes (or np.transpose) rather than np.reshape, because np.reshape does not transpose the data.

  • Now divide X and Y into training and test datasets. Keep the first 80% of the observations for training, and put the remaining 20% into new variables, X_dev and Y_dev, for testing.

  • Now you're set to continue with the notebook with no problem.
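
The last three steps can be sketched roughly as follows. This is only an illustration under assumptions: fake_training_example is a hypothetical stand-in for the notebook's create_training_example, and it assumes each x has shape (101, 5511) and each y has shape (1, 1375). If your shapes differ, adjust the axis swap accordingly.

```python
import numpy as np

# Shapes quoted in the steps above
Tx, n_freq, Ty = 5511, 101, 1375


def fake_training_example(rng):
    """Hypothetical stand-in for create_training_example(...).

    Assumption: x is returned as (n_freq, Tx) and y as (1, Ty).
    """
    x = rng.random((n_freq, Tx), dtype=np.float32)
    y = (rng.random((1, Ty)) > 0.9).astype(np.float32)
    return x, y


rng = np.random.default_rng(0)

X, Y = [], []
for _ in range(10):  # 10 examples keeps the sketch light on memory
    x, y = fake_training_example(rng)
    X.append(x)
    Y.append(y)

# Stack the lists into arrays: (10, 101, 5511) and (10, 1, 1375)
X = np.array(X)
Y = np.array(Y)

# Swap the last two axes so time comes first: (10, 5511, 101) and (10, 1375, 1).
# A plain reshape of X would give the same shape but scramble the values.
X = np.swapaxes(X, 1, 2)
Y = np.swapaxes(Y, 1, 2)

# First 80% for training, remaining 20% for the dev/test set
split = int(0.8 * len(X))
X_train, Y_train = X[:split], Y[:split]
X_dev, Y_dev = X[split:], Y[split:]

print(X_train.shape, Y_train.shape, X_dev.shape, Y_dev.shape)
```

With real data, replace the stand-in call with create_training_example(background, activates, negatives) inside the loop over backgrounds shown earlier.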

How do I reshape X to (5511, 101)?