A model that translates English to Swahili, built using local datasets.
Revolutionizing Language Processing
The problem of translating English to Swahili presents several challenges, primarily due to the linguistic differences between the two languages. Machine translation has made significant advancements, but there is room for improvement in achieving accurate and contextually relevant translations. This project aims to address these challenges and enhance the quality of English to Swahili translations using Natural Language Processing (NLP) techniques.
The primary objectives of this project are as follows:
- Develop a robust English-to-Swahili translation model: Create a machine translation model capable of accurately translating English text into Swahili while preserving context and meaning.
- Improve translation quality: Enhance the fluency, coherence, and accuracy of translated text to make it more natural and contextually relevant for Swahili speakers.
- Handle various text types and domains: Ensure the translation model can handle diverse text types, including formal documents, informal conversations, technical content, and more.
- Dataset folder: place all dataset CSV files here
- Notebook folder: place all notebook files here
- Deployment folder: place all deployment code here
- Model folder: place any saved models here
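As a rough illustration of how the CSV files in the dataset folder might be read into sentence pairs, here is a minimal sketch that uses only the Python standard library. The column names "english" and "swahili" and both function names are assumptions made for this example; adjust them to match the headers in your own files.

```python
import csv
from pathlib import Path
from typing import Iterator, List, Tuple


def load_parallel_pairs(
    csv_path: Path,
    source_column: str = "english",  # hypothetical column name
    target_column: str = "swahili",  # hypothetical column name
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs from one CSV file, skipping incomplete rows."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            source = (row.get(source_column) or "").strip()
            target = (row.get(target_column) or "").strip()
            if source and target:
                yield source, target


def load_dataset_folder(folder: Path) -> List[Tuple[str, str]]:
    """Collect pairs from every *.csv file in the folder, in sorted filename order."""
    pairs: List[Tuple[str, str]] = []
    for csv_path in sorted(folder.glob("*.csv")):
        pairs.extend(load_parallel_pairs(csv_path))
    return pairs
```

Calling load_dataset_folder(Path("dataset")) from the project root would then return a list of sentence pairs, assuming the folder is named "dataset" and the files use the column names above.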
First, download the models from the MODEL link and place them in the project root directory.
The following instructions were tested on Windows and Linux with Python 3.8.
- Clone this repository
git clone https://github.com/Rogendo/Eng-Swa-Translator.git
cd Eng-Swa-Translator/
- Create and activate virtual environment
python -m venv venv
On Linux:
source venv/bin/activate
On Windows:
.\venv\Scripts\activate.bat
- Install requirements
pip install -r requirements.txt
- Run the Flask app from the deployment folder
cd deployment/
flask --app app --debug run
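If the app fails to start, a quick environment check can help narrow down the cause. The sketch below confirms the Python version and reports which packages cannot be found. The package list ("flask", "transformers") is an assumption based on the commands and code in this README; requirements.txt remains the authoritative list.

```python
import sys
from importlib.util import find_spec


def python_is_supported(minimum=(3, 8)):
    """Return True if the running interpreter is at least the given (major, minor)."""
    return sys.version_info[:2] >= minimum


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package names; check requirements.txt for the real list.
    required = ["flask", "transformers"]
    print("Python version OK:", python_is_supported())
    print("Missing packages:", missing_packages(required) or "none")
```

Run it inside the activated virtual environment so that it inspects the same packages the Flask app will see.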
The English-Swahili translation models are deployed on the Hugging Face Model Hub, a platform for hosting and sharing machine learning models. Applications can integrate them through an API, which makes them available to developers and end users without local training.
Model Names: Rogendo/en-sw (English to Swahili), Rogendo/sw-en (Swahili to English)
Hosted Platform: Hugging Face Model Hub
Architecture: Transformer-based Neural Network, fine-tuned for English-Swahili translations.
Dataset: Trained on datasets like JW300 and CCMatrix, which include diverse linguistic contexts.
Visit the model's page on Hugging Face: [https://huggingface.co/Rogendo/en-sw](https://huggingface.co/Rogendo/en-sw).
Use the Inference API directly from the Hugging Face interface:
Input text in English or Swahili.
Receive instant translations without additional setup.
Integrate the model into your application:
Install transformers via pip install transformers.
Use the main.py example in this section to load and use the model locally or through the API.
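For the API route mentioned above, the following is a minimal sketch of an HTTP call using only the standard library. The endpoint pattern, the "inputs" payload key, and the need for an access token are assumptions based on how the hosted Inference API has commonly been used; check the model page for the current URL and authentication details. The function names are hypothetical, and the response is returned as parsed JSON without assuming its exact shape.

```python
import json
import urllib.request

# Assumed endpoint pattern; verify it on the model page before relying on it.
API_URL_TEMPLATE = "https://api-inference.huggingface.co/models/{model_id}"


def build_translation_request(model_id, text, token):
    """Build a POST request that carries the text to translate as JSON."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL_TEMPLATE.format(model_id=model_id),
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def translate_via_api(model_id, text, token, timeout=30):
    """Send the request and return the parsed JSON response as-is."""
    request = build_translation_request(model_id, text, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as translate_via_api("Rogendo/en-sw", "Good morning", token) would need a valid Hugging Face access token and network access.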
"""
main.py

If training the model is too demanding, computationally or in time, you can use
the deployed models on Hugging Face instead. This file shows example usage of
the deployed models. Happy coding!
"""
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# English -> Swahili model and pipeline
eng_swa_tokenizer = AutoTokenizer.from_pretrained("Rogendo/en-sw")
eng_swa_model = AutoModelForSeq2SeqLM.from_pretrained("Rogendo/en-sw")
eng_swa_translator = pipeline(
    "text2text-generation",
    model=eng_swa_model,
    tokenizer=eng_swa_tokenizer,
)

# Swahili -> English model and pipeline
swa_eng_tokenizer = AutoTokenizer.from_pretrained("Rogendo/sw-en")
swa_eng_model = AutoModelForSeq2SeqLM.from_pretrained("Rogendo/sw-en")
swa_eng_translator = pipeline(
    "text2text-generation",
    model=swa_eng_model,
    tokenizer=swa_eng_tokenizer,
)


def translate_text_swa_eng(text):
    """Translate Swahili text to English using beam search."""
    translated_text = swa_eng_translator(text, max_length=128, num_beams=5)[0]["generated_text"]
    return translated_text


def translate_text_eng_swa(text):
    """Translate English text to Swahili using beam search."""
    translated_text = eng_swa_translator(text, max_length=128, num_beams=5)[0]["generated_text"]
    return translated_text


text = "Ninampenda sana mama yangu, bila yeye singekuwa mahali nilipo sasa"
print(translate_text_swa_eng(text))

text = "My name is John, I love Food so much that I can let it kill me if it had hands of its own"
print(translate_text_eng_swa(text))
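The functions above pass max_length=128, so the output for a long passage may be cut short. One possible workaround, sketched here under stated assumptions, is to split the input into sentence-based chunks and translate each chunk separately. The helper names and the 400-character limit are hypothetical, and character counts are only a rough stand-in for the model's token limit. A single sentence longer than the limit is kept whole rather than split.

```python
import re
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 400) -> List[str]:
    """Group whole sentences into chunks of at most max_chars characters where possible."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def translate_long_text(text: str, translate: Callable[[str], str], max_chars: int = 400) -> str:
    """Translate each chunk with the given function and join the results with spaces."""
    return " ".join(translate(chunk) for chunk in split_into_chunks(text, max_chars))
```

With the functions from main.py, this could be used as translate_long_text(long_text, translate_text_eng_swa). Translating chunks independently loses context across chunk boundaries, so results may differ from translating the passage in one pass.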
- Revolutionizing Language Processing
See the open issues for a full list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Your Name - @Eng-swa-translator -
Project Link: https://github.com/Rogendo/Eng-Swa-Translator/issues