ChitongaASR

A natural language processing and machine learning project for a low-resource language in Zambia.

Overview

This repository contains resources and tools for an NLP and ML project that aims to develop an automatic speech recognition (ASR) system from 5 hours of speech data in ChiTonga, a low-resource language in Zambia. Some of the tasks that I worked on include:

  1. Text preprocessing: This involved cleaning and preparing text data for analysis and model training.
  2. Audio preprocessing: This involved cleaning and preparing audio data for model training.
  3. Model training and evaluation: This involved training an ML model on a dataset and evaluating its performance using metrics such as accuracy, word error rate (WER), and character error rate (CER). This included tasks such as selecting an appropriate model type, tuning hyperparameters, and using cross-validation to assess model generalization.
  4. Inference and prediction: Once a model had been trained and evaluated, it could be used to make predictions on new data. This included tasks such as generating text with a demo app built using the Gradio Python library.
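
The WER and CER metrics mentioned in step 3 are both edit distances divided by the reference length, computed over words and characters respectively. The following is a minimal, dependency-free sketch of that calculation, not the code used in the notebooks (which may rely on an existing metrics library). The function names and the English example sentences are placeholders chosen for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] is the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Placeholder sentences, not taken from the ChiTonga dataset.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER: {wer(reference, hypothesis):.3f}")
    print(f"CER: {cer(reference, hypothesis):.3f}")
```

Both rates can exceed 1.0 when the hypothesis contains many insertions, so they are error rates rather than percentages bounded at 100%.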

The fine-tuned models can be found here: Whisper-small-Tonga and wav2vec2-xls-r-300m-tonga.
The datasets used in this project are available here: Zambezi Voice. To contribute, please visit the Zambezi Voice project for more details.

Read more about this project here:

Project structure

In this repository, you will find the following resources:

  1. Data preprocessing notebooks: Jupyter notebooks used to preprocess the data.
  2. Model fine-tuning notebooks.
  3. Gradio demo app code.
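
To give a sense of what transcript cleaning in a preprocessing notebook can look like, here is a minimal sketch. It is an assumption about typical ASR text preparation, not a copy of the notebooks in this repository: the function name `normalize_transcript` is hypothetical, and the choice to lowercase, drop digits, and keep apostrophes is illustrative only.

```python
import re
import unicodedata


def normalize_transcript(text):
    """Return a lowercased transcript with punctuation and digits removed."""
    # Put visually identical characters into one canonical Unicode form.
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    # Replace punctuation (except apostrophes), digits, and underscores
    # with spaces so that neighbouring words do not get glued together.
    text = re.sub(r"[^\w\s']|[\d_]", " ", text)
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    # Placeholder sentence, not taken from the ChiTonga dataset.
    print(normalize_transcript("  Hello,   WORLD! It's 2023.  "))
```

Applying the same normalization to both the training transcripts and the reference texts used for evaluation keeps WER and CER from being inflated by casing or punctuation differences.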

Getting started

To get started, you will need a Google Colab Pro account, especially if you plan on reproducing the fine-tuned models. The free version can be used to run the demo and data preprocessing notebooks. However, I highly recommend using the Pro version for its higher memory and CPU limits, longer runtimes, and priority access to GPUs.
Once you have done that, you can start exploring the resources in this repository. If you have any questions or run into any issues, please don't hesitate to reach out. I am always happy to help!

License

This project is MIT-licensed.