ServerSideHannes/las

The token vector should be one-hot encoded.

Closed this issue · 5 comments

Is it necessary to use one-hot encoding, or can we use tf.keras.preprocessing.text.Tokenizer for encoding?

It might work, but you would need to specify how you intend to use it for me to make a qualified guess on that :)
In the paper they let the decoder emit a single character per time step, that's why I implemented it that way.
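For anyone reading this later, here is a rough sketch of what that means in practice. It is not the repo's actual preprocessing; the transcript, alphabet, and special tokens below are made up for illustration. It shows character-level tokens being one-hot encoded, and how Tokenizer with char_level=True produces comparable integer ids that you would still one-hot (or embed) before feeding the decoder:

```python
import tensorflow as tf

# Hypothetical transcript and character vocabulary, just for illustration.
transcript = "hello world"
vocab = sorted(set(transcript)) + ["<sos>", "<eos>"]  # assumed special tokens
char_to_id = {c: i for i, c in enumerate(vocab)}

# Integer-encode each character, then one-hot encode the ids.
ids = [char_to_id[c] for c in transcript]
one_hot = tf.keras.utils.to_categorical(ids, num_classes=len(vocab))
print(one_hot.shape)  # (number_of_characters, vocab_size)

# Tokenizer with char_level=True yields similar integer ids
# (note: its ids start at 1); the one-hot step is the same afterwards.
tok = tf.keras.preprocessing.text.Tokenizer(char_level=True, lower=False)
tok.fit_on_texts([transcript])
ids_from_tokenizer = tok.texts_to_sequences([transcript])[0]
```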

@hgstudent Can you please share the preprocessing code too?

I don't have any general preprocessing code at the moment. However, depending on what you mean by preprocessing, I could point you towards a repository if you would like :)

Yes, please.
I am new to this field and it's very hard for me to learn all of this.
If you know of any repository with complete speech recognition code, from preprocessing to prediction, please share the link.
I have been waiting for your reply for many days.

I don't know of any complete speech recognition repository, but https://github.com/DemisEom/SpecAugment is good for preprocessing along with augmentation :)
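For reference, the feature extraction that typically comes before that kind of augmentation looks roughly like this. This is not the linked repo's API, just a sketch using librosa (assumed installed); the file path, sample rate, and number of mel bands are placeholder values:

```python
import librosa

def log_mel_features(wav_path, sr=16000, n_mels=80):
    """Load an audio file and return a (time, n_mels) log-mel spectrogram."""
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)
    return log_mel.T  # transpose so time is the first axis

# features = log_mel_features("example.wav")  # "example.wav" is a placeholder path
```

SpecAugment-style augmentation (time/frequency masking) would then be applied to a spectrogram like this one before it is fed to the model.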