Accent Trainer is a Flask webapp/endpoint that compares the user's speech with different accents and assigns similarity scores based on speed, voice (DTW/MFCC), and accuracy. The accents are generated with Amazon Polly, and the accuracy analysis uses Bing Speech API speech-to-text.
The distance results were checked against three cases: a model (same Polly voice and same text), different accents (different Polly voice and same text), and negative examples (different Polly voice and different text). The scores behaved as expected: the model scored 100%, similar accents scored higher than dissimilar accents (e.g. US English rated higher against US English than against a Portuguese accent), and negative examples scored the worst.
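To make the voice comparison more concrete, here is a minimal pure-Python sketch of dynamic time warping (DTW) over two sequences of feature vectors, such as per-frame MFCCs. It is an illustration only and not the project's code: the project installs `cydtw` and `python_speech_features` for this step, and the names `euclidean` and `dtw_distance` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Return the accumulated DTW cost between two sequences of frames.

    Each sequence is a list of feature vectors (for example one MFCC
    vector per audio frame). A lower cost means the sequences are more
    alike once differences in timing are warped away.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # prev[j] holds the best cost of aligning the frames seen so far
    # in seq_a with the first j frames of seq_b.
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]
```

With this sketch, a recording compared with itself, or with a slower copy of itself, yields a cost of zero, while unrelated frames accumulate a larger cost. Turning that cost into a percentage is a separate, arbitrary choice.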
- Install Anaconda for Python 3.6. If space is a problem, you can also use pip or Miniconda to install the dependencies.
- `git clone` the repository and `cd` into it.
- You may then choose to install the rest of the dependencies in a virtual environment or not (using either the pip or conda method).
conda install -c conda-forge librosa
pip install pysoundfile
pip install SpeechRecognition
pip install python_speech_features
pip install cydtw
pip install Flask-WTF
- Register for Amazon Web Services, install the CLI and configure it. You might also need to `pip install boto3`.
- Register for Microsoft Azure and get a Bing Speech API key. Insert it into `BING_KEY` in `functions.py`.
- Write your own secret key in `app.secret_key` in `app.py`.
- Modify the grade calculations under compare() and compare_json(). These are arbitrary so you might want your own formula.
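As a starting point for your own formula, the sketch below shows one way to turn speed, voice, and accuracy into a single grade. It is not the calculation used in `compare()` or `compare_json()`: every function name, the `worst_distance` cutoff, and the default weights are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def speed_score(user_seconds, model_seconds):
    """Ratio of the shorter to the longer duration, in the range (0, 1]."""
    if user_seconds <= 0 or model_seconds <= 0:
        raise ValueError("durations must be positive")
    return min(user_seconds, model_seconds) / max(user_seconds, model_seconds)


def voice_score(dtw_distance, worst_distance):
    """Map a DTW distance to [0, 1]; 0 maps to 1.0, worst_distance or more to 0.0."""
    if worst_distance <= 0:
        raise ValueError("worst_distance must be positive")
    return max(0.0, 1.0 - dtw_distance / worst_distance)


def accuracy_score(transcript, target_text):
    """Word-level similarity between the recognized text and the target text."""
    spoken = transcript.lower().split()
    expected = target_text.lower().split()
    return SequenceMatcher(None, spoken, expected).ratio()


def overall_grade(speed, voice, accuracy, weights=(0.2, 0.5, 0.3)):
    """Weighted average of the three component scores, as a percentage."""
    w_speed, w_voice, w_accuracy = weights
    total = w_speed + w_voice + w_accuracy
    weighted = w_speed * speed + w_voice * voice + w_accuracy * accuracy
    return 100.0 * weighted / total
```

A weighted average keeps each component easy to reason about on its own. If one signal turns out to be noisier than the others for your recordings, lowering its weight is a smaller change than redesigning the formula.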
python app.py
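If the app fails to start with an import error, a quick check like the one below can report which dependencies are missing. This is a sketch and not part of the repository. The import names in `REQUIRED_MODULES` are assumptions based on how these packages are usually imported (for example `pysoundfile` as `soundfile` and `Flask-WTF` as `flask_wtf`), so adjust them if your environment differs.

```python
from importlib.util import find_spec

# pip/conda package name -> assumed top-level import name
REQUIRED_MODULES = {
    "librosa": "librosa",
    "pysoundfile": "soundfile",
    "SpeechRecognition": "speech_recognition",
    "python_speech_features": "python_speech_features",
    "cydtw": "cydtw",
    "Flask-WTF": "flask_wtf",
    "boto3": "boto3",
}


def missing_modules(module_names):
    """Return the sorted names from module_names that cannot be found."""
    return sorted(name for name in module_names if find_spec(name) is None)


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES.values())
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies found.")
```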
Feel free to post issues and make pull requests.
This is a skeleton for a more fully developed server-side solution. Feel free to contact me via my website.