TensorFlow Speech Recognition Challenge

Purpose

Explore the steps of a data science pipeline, including data wrangling, data cleaning, and predictive modeling with AI. We use the TensorFlow Speech Recognition dataset from Kaggle to build a model that recognizes simple spoken commands.

Objective

Learn to recognize ten command words: Yes, No, Up, Down, Left, Right, On, Off, Stop, Go

Dataset Generation

Downloads the TensorFlow Speech Recognition dataset, extracts MFCCs as features, and saves the feature vectors to an .h5 file.

Usage

python3 SRData.py -o

Requirements
- h5py
- pickle
- numpy
- soundfile
- tqdm
- librosa
- matplotlib
- scipy

Feature Extraction

Raw Samples
Frequency Domain
Power Spectrum
Mel Filters
Mel Filter Banks
Mel-frequency cepstral coefficients (MFCCs)
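
The stages listed above can be sketched in plain NumPy. This is a minimal illustration, not the code in SRData.py (which lists librosa among its requirements). The function names, the 16 kHz sample rate, the 25 ms frame, the 10 ms hop, the 26 filters, and the 13 coefficients are all assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style mel scale formula
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def frame_signal(samples, frame_len, hop_len):
    # Raw samples -> overlapping frames, each tapered by a Hamming window
    n_frames = 1 + (len(samples) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return samples[idx] * np.hamming(frame_len)

def power_spectrum(frames, n_fft):
    # Frequency domain (real FFT) -> power spectrum
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return (np.abs(spectrum) ** 2) / n_fft

def mel_filters(n_filters, n_fft, sample_rate):
    # Triangular filters whose centers are evenly spaced on the mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    filters = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising slope
            filters[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            filters[m - 1, k] = (right - k) / (right - center)
    return filters

def dct_ii(x, n_coeffs):
    # Unnormalized DCT-II along the last axis, keeping the first n_coeffs terms
    n = x.shape[1]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T

def mfcc(samples, sample_rate=16000, frame_len=400, hop_len=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    frames = frame_signal(np.asarray(samples, dtype=float), frame_len, hop_len)
    power = power_spectrum(frames, n_fft)
    banks = power @ mel_filters(n_filters, n_fft, sample_rate).T  # mel filter banks
    log_banks = np.log(banks + 1e-10)                             # avoid log(0)
    return dct_ii(log_banks, n_coeffs)                            # MFCCs

# Example: one second of a 440 Hz tone at the assumed 16 kHz sample rate
tone = np.sin(2 * np.pi * 440.0 * np.arange(16000) / 16000.0)
features = mfcc(tone)
print(features.shape)
```

Each row of the result holds the coefficients for one frame, so a one-second clip becomes a small 2-D array that a convolutional or recurrent model can consume.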

Training

Trains the model on the extracted features.

Usage

python3 SRnn.py

Requirements
- h5py
- pickle
- numpy
- sklearn
- keras
- matplotlib

Convolutional Model
Recurrent Model (LSTM)

Evaluation

Plots a trained model's training history (accuracy and loss).

Usage

python3 SRPlots.py <model_history_file>

Requirements
- pickle
- matplotlib

Convolutional Model
Recurrent Model (LSTM)
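
As a rough sketch of what plotting a history file might involve, the example below assumes the file is a pickled dictionary shaped like Keras's `History.history`, with keys such as `accuracy`, `loss`, `val_accuracy`, and `val_loss`. That layout, the key names (older Keras versions use `acc`), and the helper names are assumptions, not taken from SRPlots.py.

```python
import pickle

def load_history(path):
    # Assumption: the file holds a pickled dict of metric name -> list of per-epoch values
    with open(path, "rb") as f:
        return pickle.load(f)

def best_epoch(history, metric="val_loss"):
    # Zero-based index of the epoch with the lowest value of the chosen metric
    values = history[metric]
    return min(range(len(values)), key=values.__getitem__)

def plot_history(history, out_path="history.png"):
    # Imported lazily so the helpers above work without matplotlib installed
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, key in zip(axes, ("accuracy", "loss")):
        ax.plot(history[key], label="train")
        if "val_" + key in history:
            ax.plot(history["val_" + key], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(key)
        ax.legend()
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)

# Hypothetical history values, for illustration only
example = {"loss": [1.2, 0.8, 0.6], "val_loss": [1.1, 0.7, 0.9]}
print(best_epoch(example))
```

Comparing the training and validation curves side by side makes it easier to spot the epoch where validation loss stops improving.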

Demo

Runs a test demo that lets a user record a word, extracts its features (MFCCs), uses the model to predict the word, and plays back the prediction using Google Text-to-Speech.

Usage

python3 SRTest.py <model_file>

Requirements
- sounddevice
- pickle
- numpy
- playsound
- keras
- gtts
- tqdm
- scipy
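
The prediction step of the demo can be illustrated with a small sketch that turns a model's output probabilities into a command word. The label order below simply follows the list in the Objective section and is an assumption; the real order depends on how the dataset script encodes labels. `decode_prediction` is a hypothetical helper, not a function from SRTest.py.

```python
import numpy as np

# Assumed label order (hypothetical); it must match the encoding used during training
COMMANDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]

def decode_prediction(probabilities, labels=COMMANDS):
    # Accepts a flat vector or a (1, n_classes) batch, as a Keras predict call would return
    probs = np.asarray(probabilities, dtype=float).ravel()
    if probs.shape[0] != len(labels):
        raise ValueError("expected %d probabilities, got %d" % (len(labels), probs.shape[0]))
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])

# Made-up model output for a single recording
fake_output = np.array([[0.01, 0.02, 0.01, 0.01, 0.02, 0.03, 0.05, 0.05, 0.75, 0.05]])
word, confidence = decode_prediction(fake_output)
print(word, confidence)
```

The returned word is what would then be passed to the text-to-speech step for playback.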

Presentation

Resources

* This repo is under the MIT License; use it as you please.

MerlinPCarson/SpeechRec