/Eng-Swa-Translator

A bilingual English-Swahili translator application built with transformers.


English to Swahili Translator

A model that translates English to Swahili, trained on local datasets.

Revolutionizing Language Processing
Explore the docs »

View Progress · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Roadmap
  3. Contributing
  4. License
  5. Contact
  6. Acknowledgments

About The Project

1.1 Problem Statement

The problem of translating English to Swahili presents several challenges, primarily due to the linguistic differences between the two languages. Machine translation has made significant advancements, but there is room for improvement in achieving accurate and contextually relevant translations. This project aims to address these challenges and enhance the quality of English to Swahili translations using Natural Language Processing (NLP) techniques.

1.2 Objectives

The primary objectives of this project are as follows:

  1. Develop a robust English-to-Swahili translation model: Create a machine translation model capable of accurately translating English text into Swahili while preserving context and meaning.
  2. Improve translation quality: Enhance the fluency, coherence, and accuracy of translated text to make it more natural and contextually relevant for Swahili speakers.
  3. Handle various text types and domains: Ensure the translation model can handle diverse text types, including formal documents, informal conversations, technical content, and more.

1.3 Project Structure

  1. Dataset folder - All dataset CSV files go here
  2. Notebook folder - All notebook files go here
  3. Deployment folder - All deployment code goes here
  4. Model folder - Any saved models go here
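
As a rough aid, the layout above can be checked with a few lines of Python. This is only a sketch: the lowercase folder names are assumptions (only `deployment` appears verbatim in the run instructions), and `missing_dirs` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Assumed folder names; adjust them to match the actual repository layout.
EXPECTED_DIRS = ("dataset", "notebook", "deployment", "model")


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected folder names that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Project layout looks complete.")
```

Run it from the project root; an empty result means every expected folder is present.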

(back to top)

Built With

  • Python
  • Flask
  • Git

(back to top)

Screenshot from 2024-01-24 11-14-42

Screenshot from 2024-01-07 11-41-01

How to run it?

First, download the models from the MODEL link and place them in the project root directory.

The following instructions were tested on Windows and Linux with Python 3.8.

  1. Clone this repository
git clone https://github.com/Rogendo/Eng-Swa-Translator.git
cd Eng-Swa-Translator/
  2. Create and activate a virtual environment
python -m venv venv

On Linux:

source venv/bin/activate

On Windows:

.\venv\Scripts\activate.bat
  3. Install the requirements
pip install -r requirements.txt
  4. Run the Flask app from the deployment folder
cd deployment/
flask --app app --debug run
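
If the app fails to start, it can help to confirm that the interpreter matches the tested version and that the virtual environment is active. The snippet below is a minimal sketch using only the standard library; `python_ok` and `in_virtualenv` are hypothetical helper names, and the 3.8 minimum is taken from the note above rather than from a stated requirement.

```python
import sys

MIN_PYTHON = (3, 8)  # the version the instructions above were tested with


def in_virtualenv():
    """True when the interpreter runs inside an environment made by `python -m venv`."""
    return sys.prefix != sys.base_prefix


def python_ok(version=None, minimum=MIN_PYTHON):
    """True when the (major, minor) version is at least `minimum`."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Virtual environment active:", in_virtualenv())
```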

Deployed Model on Hugging Face

The English-Swahili translation model has been successfully deployed on Hugging Face, a popular platform for hosting and sharing machine learning models. The deployment enables seamless integration with applications via an API, making it accessible to developers and end-users globally.

Model Details

Model Names: Rogendo/en-sw (English to Swahili), Rogendo/sw-en (Swahili to English)
Hosted Platform: Hugging Face Model Hub
Architecture: Transformer-based Neural Network, fine-tuned for English-Swahili translations.
Dataset: Trained on datasets like JW300 and CCMatrix, which include diverse linguistic contexts.

How to Access the Model

  1. Visit the model's page on Hugging Face: [https://huggingface.co/Rogendo/en-sw](https://huggingface.co/Rogendo/en-sw).
  2. Use the Inference API directly from the Hugging Face interface:
    • Input text in English or Swahili.
    • Receive instant translations without additional setup.
  3. Integrate the model into your application:
    • Install transformers via pip install transformers.
    • Use the provided code snippet to load and use the model locally or through the API.
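
For the API route, a request could look roughly like the sketch below. The endpoint pattern in `API_URL_TEMPLATE` is an assumption and may differ from what Hugging Face currently serves for this model, so check the model page first. The `hf_xxx` token is a placeholder, and `build_request` and `translate_remote` are hypothetical helper names. The response format is not assumed here; the function just returns the decoded JSON.

```python
import json
import urllib.request

# Assumed endpoint pattern for the hosted Inference API; verify on the model page.
API_URL_TEMPLATE = "https://api-inference.huggingface.co/models/{model_id}"


def build_request(text, model_id="Rogendo/en-sw", token="hf_xxx"):
    """Build (but do not send) a POST request carrying the text to translate."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL_TEMPLATE.format(model_id=model_id),
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",  # placeholder access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def translate_remote(text, model_id="Rogendo/en-sw", token="hf_xxx"):
    """Send the request and return the decoded JSON response as-is."""
    with urllib.request.urlopen(build_request(text, model_id, token)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Passing `model_id="Rogendo/sw-en"` targets the Swahili-to-English model instead.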

Sample Code

"""
    main.py

    Example usage of the deployed models on Hugging Face.

    If training the model is too demanding, either computationally or in time,
    you can use the deployed versions on Hugging Face through the code below.
    Happy coding!
"""
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# English -> Swahili
eng_swa_tokenizer = AutoTokenizer.from_pretrained("Rogendo/en-sw")
eng_swa_model = AutoModelForSeq2SeqLM.from_pretrained("Rogendo/en-sw")

eng_swa_translator = pipeline(
    "text2text-generation",
    model=eng_swa_model,
    tokenizer=eng_swa_tokenizer,
)

# Swahili -> English
swa_eng_tokenizer = AutoTokenizer.from_pretrained("Rogendo/sw-en")
swa_eng_model = AutoModelForSeq2SeqLM.from_pretrained("Rogendo/sw-en")

swa_eng_translator = pipeline(
    "text2text-generation",
    model=swa_eng_model,
    tokenizer=swa_eng_tokenizer,
)

def translate_text_swa_eng(text):
    """Translate Swahili text to English using beam search."""
    translated_text = swa_eng_translator(text, max_length=128, num_beams=5)[0]['generated_text']
    return translated_text

def translate_text_eng_swa(text):
    """Translate English text to Swahili using beam search."""
    translated_text = eng_swa_translator(text, max_length=128, num_beams=5)[0]['generated_text']
    return translated_text

# Print the results so they are visible when main.py runs as a script.
text = "Ninampenda sana mama yangu, bila yeye singekuwa mahali nilipo sasa"
print(translate_text_swa_eng(text))

text = "My name is John, I love Food so much that I can let it kill me if it had hands of its own"
print(translate_text_eng_swa(text))
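
The sample above caps generation at max_length=128, so a long passage may come back cut off. One possible workaround, sketched here as an assumption rather than as part of the project, is to translate sentence by sentence. `split_sentences` and `translate_long_text` are hypothetical helpers, and the punctuation-based split is a naive heuristic that will mishandle abbreviations.

```python
import re


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (a naive heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def translate_long_text(text, translate_fn):
    """Translate each sentence with translate_fn and join the results with spaces."""
    return " ".join(translate_fn(sentence) for sentence in split_sentences(text))
```

With the functions from the sample code loaded, `translate_long_text(article, translate_text_eng_swa)` would translate an English passage one sentence at a time. Sentence-level translation loses context that spans sentences, so results may differ from translating a short text in one call.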

Roadmap

  • Revolutionizing Language Processing

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the MIT License. See LICENSE for more information.

(back to top)

Contact

Your Name - @Eng-swa-translator -

Project Link: https://github.com/Rogendo/Eng-Swa-Translator/issues

(back to top)

Acknowledgments

(back to top)