- Purpose
Explore the steps of the data science pipeline, including data wrangling, data cleaning, and predictive modeling with AI. We use the Speech Recognition data set from Kaggle.com to build an algorithm that recognizes simple speech commands.
- Objective
Learn Command Words: Yes, No, Up, Down, Left, Right, On, Off, Stop, Go
SRData.py downloads the TensorFlow Speech Recognition dataset, extracts MFCCs as features, and saves the feature vectors to an .h5 file.
- Usage
python3 SRData.py -o
- Requirements
- h5py
- pickle
- numpy
- soundfile
- tqdm
- librosa
- matplotlib
- scipy
- Raw Samples
- Frequency Domain
- Power Spectrum
- Mel Filters
- Mel Filter Banks
- Mel-frequency cepstral coefficients (MFCCs)
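The stages listed above (raw samples, frequency domain, power spectrum, Mel filters, Mel filter banks, MFCCs) can be sketched end to end with numpy alone. This is an illustrative sketch, not the code in SRData.py, which may well rely on librosa instead. The function names and the parameters (16 kHz audio, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients) are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # 1. Raw samples: cut into overlapping, Hamming-windowed frames
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Frequency domain: FFT of each frame
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    # 3. Power spectrum
    power = (np.abs(spectrum) ** 2) / n_fft
    # 4-5. Mel filters applied to the power spectrum, then log energies
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    energies = np.log(power @ fbank.T + 1e-10)
    # 6. MFCCs: DCT-II of the log filter-bank energies (unnormalized)
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return energies @ basis.T

# One second of a 440 Hz tone as stand-in audio
t = np.arange(16000) / 16000
features = mfcc(np.sin(2 * np.pi * 440 * t))
print(features.shape)
```

With these settings a one-second clip yields 98 frames of 13 coefficients each. The values will not match librosa's output exactly, since librosa uses different defaults for framing, filter normalization, and DCT scaling.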
SRnn.py trains the model.
- Usage
python3 SRnn.py
- Requirements
- h5py
- pickle
- numpy
- sklearn
- keras
- matplotlib
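Before a classifier can be trained on the ten command words, the word labels have to become numeric targets and some samples have to be held out for evaluation. The sketch below shows one way to do that with numpy. It is an assumption about the preprocessing, not a description of what SRnn.py actually does, and the names `COMMANDS`, `one_hot`, and `shuffle_split` are hypothetical.

```python
import numpy as np

# The ten command words from the Objective section
COMMANDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]
WORD_TO_INDEX = {word: i for i, word in enumerate(COMMANDS)}

def one_hot(labels):
    """Turn a list of command words into a (n_samples, 10) one-hot matrix."""
    indices = np.array([WORD_TO_INDEX[w.lower()] for w in labels])
    return np.eye(len(COMMANDS), dtype=np.float32)[indices]

def shuffle_split(features, targets, test_fraction=0.2, seed=0):
    """Shuffle the samples once and hold out `test_fraction` of them."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(round(len(features) * test_fraction))
    train, test = order[n_test:], order[:n_test]
    return features[train], features[test], targets[train], targets[test]

# Toy data: 10 samples with 2 features each
X = np.arange(20, dtype=np.float32).reshape(10, 2)
y = one_hot(["yes", "no"] * 5)
X_train, X_test, y_train, y_test = shuffle_split(X, y)
print(X_train.shape, X_test.shape)
```

Since sklearn is already listed as a requirement, `sklearn.model_selection.train_test_split` is a ready-made alternative to the hand-written split.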
SRPlots.py plots a trained model's history (accuracy and loss).
- Usage
python3 SRPlots.py <model_history_file>
- Requirements
- pickle
- matplotlib
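As a rough illustration of plotting a training history from a pickle file, the sketch below assumes the file holds a pickled dict shaped like Keras's `History.history`, that is, metric names mapped to lists of per-epoch values, with validation metrics prefixed by `val_`. The actual file layout used by SRPlots.py is not documented here, so treat the format and the helper names `load_history` and `plot_history` as assumptions.

```python
import pickle

def load_history(path):
    """Load a pickled dict mapping metric names to per-epoch value lists."""
    with open(path, "rb") as fh:
        history = pickle.load(fh)
    if not isinstance(history, dict):
        raise TypeError("expected a dict of metric name -> list of values")
    return history

def plot_history(history, out_path="history.png"):
    """Draw one panel per training metric and save the figure to out_path."""
    # Imported here so load_history() also works on machines without matplotlib
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    # Training metrics only; matching validation curves are overlaid below
    metrics = [name for name in history if not name.startswith("val_")]
    fig, axes = plt.subplots(1, len(metrics), figsize=(5 * len(metrics), 4), squeeze=False)
    for ax, name in zip(axes[0], metrics):
        epochs = range(1, len(history[name]) + 1)
        ax.plot(epochs, history[name], label="train")
        if "val_" + name in history:
            ax.plot(epochs, history["val_" + name], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(name)
        ax.legend()
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
    return out_path
```

A call such as `plot_history(load_history("model_history.pkl"))` would write `history.png`; the file name `model_history.pkl` is a placeholder. Metric key names vary with the Keras version (for example `acc` versus `accuracy`), which is why the sketch plots whatever keys it finds rather than hard-coding them.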
SRTest.py runs a test demo: the user records a word, the script extracts its features (MFCCs), the model predicts the word, and Google Text-to-Speech plays back the prediction.
- Usage
python3 SRTest.py <model_file>
- Requirements
- sounddevice
- pickle
- numpy
- playsound
- keras
- gtts
- tqdm
- scipy
- Tensorflow Speech Recognition Challenge
- Tensorflow Command Word Dataset
- Speech Processing for Machine Learning: Filter banks, Mel-Frequency Cepstral Coefficients (MFCCs) and What's In-Between
- Keras Conv1D: Working with 1D Convolutional Neural Networks in Keras
- Time Series Classification with CNNs
- A Beginner's Guide to LSTMs and Recurrent Neural Networks
* This repo is released under the MIT License; use it as you please.