Input: a speech signal. Output: the spoken digit.
It contains:
- Reading and preprocessing the dataset.
- Training the LSTM on the raw data.
- Converting the data to spectrograms and training the LSTM network on them.
- Creating augmented data and repeating steps 2 and 3 on it.
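
The spectrogram and augmentation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code used here: the function names, the frame length of 64 samples, the hop of 32, the Hann window, and the two augmentations (additive Gaussian noise and a zero-padded time shift) are all assumptions chosen for the example. It uses only the standard library so it runs anywhere; a real pipeline would more likely rely on an FFT from a numerical library.

```python
import cmath
import math
import random


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return one list per frame holding frame_len // 2 + 1 DFT magnitudes."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    spec = []
    for frame in frame_signal(signal, frame_len, hop):
        windowed = [s * w for s, w in zip(frame, window)]
        bins = []
        # Naive DFT over the non-negative frequency bins only.
        for k in range(frame_len // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(windowed))
            bins.append(abs(acc))
        spec.append(bins)
    return spec


def add_noise(signal, noise_std, rng):
    """Augmentation (assumed): add zero-mean Gaussian noise to every sample."""
    return [s + rng.gauss(0.0, noise_std) for s in signal]


def time_shift(signal, shift):
    """Augmentation (assumed): delay by `shift` samples, zero-padded, same length."""
    if shift <= 0:
        return list(signal)
    shift = min(shift, len(signal))
    return [0.0] * shift + list(signal[:len(signal) - shift])


# Synthetic stand-in for a recording: a sine that falls on DFT bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spectrogram = magnitude_spectrogram(tone)
augmented = time_shift(add_noise(tone, 0.01, random.Random(0)), 10)
augmented_spectrogram = magnitude_spectrogram(augmented)
```

Each spectrogram is a sequence of frames, so it can be fed to an LSTM with one frame per time step, in the same way the raw waveform is fed sample by sample (or chunk by chunk). Augmented copies keep the label of the recording they were derived from.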