African language Speech Recognition - Speech-to-Text

Introduction

Speech recognition technology allows hands-free control of smartphones, speakers, and even vehicles in a wide variety of languages. Companies are working towards machines that understand and respond to more and more of our spoken commands. Many mature speech recognition systems are available, such as Google Assistant, Amazon Alexa, and Apple’s Siri. However, all of these voice assistants support only a limited set of languages.

The World Food Programme wants to deploy an intelligent form that collects nutritional information about food bought and sold at markets in two African countries, Ethiopia and Kenya. The design requires selected people to install an app on their mobile phones; whenever they buy food, they use their voice to activate the app and register the list of items they just bought, in their own language. The intelligent systems in the app are expected to transcribe the speech to text live and organize the information in a database in an easy-to-process way.

Our responsibility was to build a deep learning model capable of transcribing speech to text in the Amharic language. The model we produce should be accurate and robust against background noise.
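Accuracy for a speech-to-text model is commonly reported as word error rate (WER): the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal, dependency-free illustration of that metric. The function name `word_error_rate` and the sample transcripts are assumptions made for this example and are not taken from the project code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, for illustration only:
# one substituted word out of four reference words.
print(word_error_rate("two kilos of teff", "two kilo of teff"))  # 0.25
```

Splitting on whitespace also works for Amharic text, since Amharic words are separated by spaces in modern writing. A lower WER is better, and 0.0 means the transcript matches the reference exactly.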

Installation guide

Conda Environment

conda create --name mlenv python==3.7.5
conda activate mlenv

Installation of dependencies

git clone https://github.com/week4-SpeechRecognition/Speech-to-Text.git
cd Speech-to-Text
python setup.py install

Architecture

Project Structure

images:

images/: the folder where all snapshots for the project are stored.

data:

*.dvc: the folder where the versioned dataset files are stored.

.dvc:

.dvc/: the folder where DVC is configured for data version control.

.github:

.github/: the folder where GitHub Actions and CML workflows are integrated.

.vscode:

.vscode/: the folder where local path fixes are stored.

models:

models/: the folder where model pickle files are stored.

notebooks:

notebooks/: includes all notebooks for deep learning and metadata.

scripts:

*.py: Scripts for modularization, logging, and packaging.

root folder:

requirements.txt: a text file listing the project's dependencies.
README.md: Markdown text with a brief explanation of the project and the repository structure.
Dockerfile: lets users create an automated build that executes several command-line instructions in a container.
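The models folder described above holds pickled model files. As a rough sketch of how such a file could be written and read back, the example below round-trips a stand-in object through Python's standard `pickle` module. The helper names `save_model` and `load_model`, the file name `demo_model.pkl`, and the dictionary standing in for a trained model are all hypothetical; the project's actual file names and model classes are not shown here.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize `model` to `path`, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled object. Only unpickle files from a trusted source."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


# A plain dict stands in for a trained model (hypothetical placeholder).
stand_in_model = {"name": "demo", "weights": [0.1, 0.2, 0.3]}

# Use a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "models" / "demo_model.pkl"
    save_model(stand_in_model, model_path)
    restored = load_model(model_path)

print(restored == stand_in_model)  # True
```

Unpickling runs code embedded in the file, so pickled models should only be loaded when they come from a trusted source such as this repository's own tracked data.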

Contributors

Made with contrib.rocks

Hen0k/Speech-to-Text