Audio-Keyword-Spotting-in-Chinese

This project combines a convolutional layer with a recurrent neural network (RNN) to spot a keyword in Chinese audio. The target keyword is "雅婷姊".

Baseline Model

I. One convolutional layer

II. Two GRU layers

III. Several dropout and batch normalization layers

IV. Fully connected layer with sigmoid activation
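
The baseline stack above can be read as a shape pipeline. The sketch below traces (time steps, features) through it, assuming an unpadded 1-D convolution over a spectrogram, GRU layers that return their full output sequence, and a sigmoid layer applied at every time step. None of these details are stated in this README, and all layer sizes are hypothetical values chosen only for illustration. Dropout and batch normalization are left out because they do not change shapes.

```python
def conv1d_output_length(n_frames, kernel_size, stride):
    """Time steps produced by an unpadded ("valid") 1-D convolution."""
    if n_frames < kernel_size:
        raise ValueError("input is shorter than the convolution kernel")
    return (n_frames - kernel_size) // stride + 1


def baseline_shapes(n_frames, n_freq_bins, conv_filters, kernel_size, stride, gru_units):
    """Return (layer name, (time_steps, features)) pairs for the baseline stack."""
    steps = conv1d_output_length(n_frames, kernel_size, stride)
    return [
        ("input spectrogram", (n_frames, n_freq_bins)),
        ("I. conv1d", (steps, conv_filters)),
        # Assumption: both GRU layers emit one vector per time step.
        ("II. GRU 1", (steps, gru_units)),
        ("II. GRU 2", (steps, gru_units)),
        # Assumption: one keyword probability per time step.
        ("IV. dense + sigmoid", (steps, 1)),
    ]


# Hypothetical sizes, not taken from this project.
shapes = baseline_shapes(
    n_frames=1000, n_freq_bins=64, conv_filters=32,
    kernel_size=15, stride=4, gru_units=128,
)
for name, shape in shapes:
    print(f"{name:22s} {shape}")
```

With these example sizes, 1000 input frames become 247 time steps after the convolution, so each sigmoid output covers roughly four input frames.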

Modified Model

I. One convolutional layer

II. Two GRU layers, with the first modified to be bidirectional and to use a residual connection

III. Several dropout and batch normalization layers

IV. Fully connected layer with sigmoid activation
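
To illustrate what "bidirectional and residual" could mean for the first GRU layer, here is a minimal pure-Python sketch. It is not the project's code. It assumes the forward and backward outputs are concatenated at each time step and that the layer input is added back to that concatenation, which requires the input width to equal twice the per-direction hidden size. The `GRUCell` class, the `bidirectional_residual_gru` helper, and all sizes are hypothetical, and biases are omitted for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a nested-list matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class GRUCell:
    """Hypothetical bias-free GRU cell with weights stored as nested lists."""

    def __init__(self, input_size, hidden_size, scale=0.1, seed=0):
        rng = random.Random(seed)

        def mat(cols):
            return [[rng.uniform(-scale, scale) for _ in range(cols)]
                    for _ in range(hidden_size)]

        self.hidden_size = hidden_size
        # Input (W) and recurrent (U) weights for the update (z),
        # reset (r), and candidate (n) computations.
        self.W = {g: mat(input_size) for g in "zrn"}
        self.U = {g: mat(hidden_size) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        # Blend the previous state with the candidate state.
        return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

    def run(self, sequence):
        """Return the hidden state after each time step."""
        h = [0.0] * self.hidden_size
        outputs = []
        for x in sequence:
            h = self.step(x, h)
            outputs.append(h)
        return outputs


def bidirectional_residual_gru(sequence, forward_cell, backward_cell):
    """Concatenate both directions per time step, then add the input back."""
    fwd = forward_cell.run(sequence)
    # Run on the reversed sequence, then restore the original time order.
    bwd = backward_cell.run(sequence[::-1])[::-1]
    outputs = []
    for x, f, b in zip(sequence, fwd, bwd):
        both = f + b  # list concatenation: width is 2 * hidden_size
        if len(both) != len(x):
            raise ValueError("residual add needs input width == 2 * hidden_size")
        outputs.append([xi + yi for xi, yi in zip(x, both)])
    return outputs


# Hypothetical sizes: 5 time steps, 4 features, 2 hidden units per direction.
rng = random.Random(1)
sequence = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
out = bidirectional_residual_gru(
    sequence, GRUCell(4, 2, seed=0), GRUCell(4, 2, seed=1)
)
print(len(out), len(out[0]))
```

The width check makes the design constraint explicit: under this wiring, the convolutional layer's filter count would have to match twice the GRU's hidden size, or a projection would be needed before the addition.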