SpeechRec

TensorFlow Speech Recognition Challenge

  • Purpose

Explore the steps of the data science pipeline, including data wrangling, data cleaning, and predictive modeling. We use the TensorFlow Speech Recognition Challenge data set from Kaggle to build a model that recognizes simple speech commands.

  • Objective

Learn to recognize ten command words: Yes, No, Up, Down, Left, Right, On, Off, Stop, Go

Dataset Generation

Downloads the TensorFlow Speech Recognition data set, extracts MFCCs as features, and saves the feature vectors to an .h5 file

  • Usage

    python3 SRData.py -o

  • Requirements

    • h5py
    • pickle
    • numpy
    • soundfile
    • tqdm
    • librosa
    • matplotlib
    • scipy
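
The last step of dataset generation, saving feature vectors to an .h5 file, can be sketched with h5py as below. This is only an illustration, not the code in SRData.py: the function names and the dataset names "features" and "labels" are assumptions, as is the use of gzip compression.

```python
import h5py
import numpy as np


def save_features(path, features, labels):
    """Write feature vectors and integer labels to an HDF5 file.

    The dataset names "features" and "labels" are hypothetical; the
    file written by SRData.py may be laid out differently.
    """
    with h5py.File(path, "w") as f:
        f.create_dataset("features", data=features, compression="gzip")
        f.create_dataset("labels", data=labels)


def load_features(path):
    """Read both arrays back into memory as NumPy arrays."""
    with h5py.File(path, "r") as f:
        return f["features"][:], f["labels"][:]
```

Keeping features and labels in one file means the training step only needs a single path, and gzip compression keeps the file small at the cost of slightly slower reads.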

Feature Extraction

  • Raw Samples
  • Frequency Domain
  • Power Spectrum
  • Mel Filters
  • Mel Filter Banks
  • Mel-frequency cepstral coefficients (MFCCs)
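
The stages listed above can be followed end to end in a few lines of NumPy. The repo lists librosa among its requirements, so this is not the repo's implementation; it is a minimal sketch of the same pipeline, and every parameter (16 kHz audio, 25 ms frames, 10 ms hop, 512-point FFT, 26 mel filters, 13 coefficients) is an assumption chosen for illustration.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank


def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    i = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    return x @ basis.T


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # Raw samples: cut into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Frequency domain: magnitude of the real FFT of each frame.
    spectrum = np.abs(np.fft.rfft(frames, n=n_fft))
    # Power spectrum.
    power = spectrum ** 2 / n_fft
    # Mel filters applied to the power spectrum give the filter bank energies.
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energies = np.log(power @ fbank.T + 1e-10)
    # MFCCs: a DCT decorrelates the log filter bank energies.
    return dct_ii(log_energies, n_coeffs)
```

Calling `mfcc` on a one-second, 16 kHz clip returns one row of 13 coefficients per frame, which is the kind of fixed-size matrix a convolutional or recurrent model can take as input.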

Training

Trains the model (convolutional or LSTM) on the extracted features

  • Usage

    python3 SRnn.py

  • Requirements

    • h5py
    • pickle
    • numpy
    • sklearn
    • keras
    • matplotlib

  • Convolutional Model

  • Recurrent Model (LSTM)
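
Before either model can be trained, the command labels have to be encoded and some data held out for validation. The sketch below shows one way to do that in plain NumPy. It is not taken from SRnn.py: the function names, the class order in `COMMANDS`, and the 80/20 split are assumptions, and scikit-learn's `train_test_split` would do the same job as the split helper here.

```python
import numpy as np

# Hypothetical class order; the order used by the repo's model may differ.
COMMANDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]


def one_hot(labels, n_classes):
    """Turn integer class labels into one-hot rows of length n_classes."""
    encoded = np.zeros((len(labels), n_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


def train_val_split(features, labels, val_fraction=0.2, seed=0):
    """Shuffle the examples and hold out a fraction for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_val = int(len(features) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return features[train_idx], labels[train_idx], features[val_idx], labels[val_idx]
```

A fixed seed keeps the split reproducible, so accuracy curves from different runs are compared on the same validation examples.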

Evaluation

Plots a trained model's history (accuracy/loss)

  • Usage

    python3 SRPlots.py <model_history_file>

  • Requirements

    • pickle
    • matplotlib

  • Convolutional Model

  • Recurrent Model (LSTM)
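
Plotting a saved history can be sketched as below. This is an illustration rather than the contents of SRPlots.py. It assumes the history file is a pickled dict shaped like Keras's `History.history`, with keys "accuracy", "loss" and optionally "val_accuracy" and "val_loss" (older Keras versions use "acc" instead of "accuracy"). The function name and the output path argument are hypothetical.

```python
import pickle

import matplotlib
matplotlib.use("Agg")  # render to a file without needing a display
import matplotlib.pyplot as plt


def plot_history(history_path, out_path):
    """Plot training and validation accuracy/loss curves side by side."""
    with open(history_path, "rb") as f:
        history = pickle.load(f)  # assumed: dict of metric name -> list per epoch

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, metric in zip(axes, ("accuracy", "loss")):
        ax.plot(history[metric], label="train")
        if "val_" + metric in history:
            ax.plot(history["val_" + metric], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(metric)
        ax.legend()
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
    return history
```

A widening gap between the training and validation curves is the usual sign of overfitting, which is the main thing these plots are used to spot.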

Demo

Runs a test demo that lets a user record a word, extracts its features (MFCCs), uses the model to predict the word, and plays back the prediction using Google Text-to-Speech

  • Usage

    python3 SRTest.py <model_file>

  • Requirements

    • sounddevice
    • pickle
    • numpy
    • playsound
    • keras
    • gtts
    • tqdm
    • scipy
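
The step between the model's output and the spoken playback is turning a vector of class probabilities into a word. A minimal sketch of that step follows. The function name is hypothetical and the class order in `COMMANDS` is an assumption; in practice it must match the label order used when the model was trained.

```python
import numpy as np

# Hypothetical class order; it must match the order used during training.
COMMANDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]


def decode_prediction(probabilities, commands=COMMANDS):
    """Return the most likely command word and its probability."""
    probabilities = np.asarray(probabilities)
    best = int(np.argmax(probabilities))
    return commands[best], float(probabilities[best])
```

The returned probability can also be used to reject low-confidence guesses instead of reading back a wrong word.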

Presentation

Resources

  • This repo is released under the MIT License; use it as you please.