This repository contains automatic translation tools for the Flores200 languages using the NLLB model from Meta AI (https://github.com/facebookresearch/fairseq/tree/nllb).
This API lets you translate text into 200 languages automatically, detect the source language of a text among 200 languages, and detect toxic words in a text.
Get all the available translation languages.
Method: GET
A list with all the available languages using FLORES-200 code.
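As a minimal sketch of calling this endpoint from Python, the snippet below builds the GET request with the standard library only. The /langs path is the one named in this README. The base URL http://localhost:8080 is an assumption based on the 8080:8080 port mapping in the Docker commands, and decoding the response as JSON is also an assumption, since the exact response format is not described here.

```python
import json
import urllib.request

# Assumed base URL: matches the 8080:8080 port mapping used in the Docker commands.
BASE_URL = "http://localhost:8080"


def build_langs_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the GET request for the /langs endpoint."""
    return urllib.request.Request(f"{base_url}/langs", method="GET")


def fetch_langs(base_url: str = BASE_URL):
    """Send the request and decode the body, assuming the server replies with JSON."""
    with urllib.request.urlopen(build_langs_request(base_url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling fetch_langs() should return the available FLORES-200 codes. Building the request separately from sending it makes it easy to inspect the URL and method before any network traffic happens.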
Automatic translation using the NLLB model from Meta AI. Translates input_text into the langs_out languages. Available languages: all languages listed by the /langs endpoint.
Method: POST
Field | Type | Required | Description |
---|---|---|---|
input_text | string | Yes | The text to be translated. |
langs_out | string | No | The languages to translate the input text to (comma separated). If not specified, it will translate to all languages. |
The result of the translation service, containing translated text in all the languages specified.
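The fields in the table above can be assembled into a request roughly as follows. This is a sketch under stated assumptions: the /translate path and the JSON request body are hypothetical (the server may expect a different route, form fields, or query parameters), and the base URL is taken from the Docker port mapping. Only the field names input_text and langs_out, and the comma-separated format of langs_out, come from this README.

```python
import json
import urllib.request
from typing import Optional, Sequence

BASE_URL = "http://localhost:8080"  # assumed from the Docker port mapping
TRANSLATE_PATH = "/translate"       # hypothetical route, check the running server


def build_translate_payload(input_text: str,
                            langs_out: Optional[Sequence[str]] = None) -> dict:
    """Return the request fields; langs_out is left out to request all languages."""
    payload = {"input_text": input_text}
    if langs_out:
        # The API expects the target languages as one comma-separated string.
        payload["langs_out"] = ",".join(langs_out)
    return payload


def build_translate_request(input_text: str,
                            langs_out: Optional[Sequence[str]] = None,
                            base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request, assuming a JSON body."""
    body = json.dumps(build_translate_payload(input_text, langs_out)).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}{TRANSLATE_PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

For example, build_translate_payload("Hello", ["spa_Latn", "fra_Latn"]) produces a langs_out value of "spa_Latn,fra_Latn", while omitting the second argument leaves langs_out out entirely so the service translates into every configured language.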
Detect toxic words in a text without specifying its source language.
Method: POST
Field | Type | Required | Description |
---|---|---|---|
input_text | string | Yes | The text to be checked for toxicity. |
The result of the toxicity detection service, containing toxic words in input_text.
Detect source language of a text from 200 languages.
Method: POST
Field | Type | Required | Description |
---|---|---|---|
input_text | string | Yes | The text to detect the language of. |
The result of the language detection service, containing the detected language of the input text.
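The toxicity and language detection endpoints both take a single required input_text field, so one small helper can serve both. The sketch below assumes hypothetical routes (/toxicity and /detect_language are placeholders, not confirmed by this README), a JSON request body, and a JSON response. The response structure is not assumed beyond that.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed from the Docker port mapping
# Hypothetical routes: this README does not spell out the actual paths.
TOXICITY_PATH = "/toxicity"
DETECT_PATH = "/detect_language"


def build_text_request(path: str, input_text: str,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying only input_text."""
    if not input_text:
        # input_text is marked as required in the field tables.
        raise ValueError("input_text is required")
    body = json.dumps({"input_text": input_text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(request: urllib.request.Request):
    """Send a prepared request and decode the reply, assuming a JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, post_json(build_text_request(DETECT_PATH, "Bonjour")) would send the text for language detection, and swapping in TOXICITY_PATH would send it for the toxicity check.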
To learn how to use these tools, check out the /examples folder. There is also a frontend application built with Streamlit that consumes this API. To learn more, check out the following repository: https://github.com/rosasalberto/automatic_translation_frontend.
https://hub.docker.com/repository/docker/rosasalberto/translation-service/general
- Get image
docker pull rosasalberto/translation-service
- Run Image
docker run --gpus all -p 8080:8080 rosasalberto/translation-service
Modify translation_langs in config.py to include the languages you want to be able to translate, using the Flores200 language codes.
- Install CUDA 11.6
- Install Microsoft C++ Build Tools if using Windows:
- Install Python 3.7.2 (https://www.python.org/downloads/release/python-372/) and add it to PATH. The following command has to return Python 3.7.2:
python --version
- Upgrade pip
python -m pip install --upgrade pip
- Install pipenv
pip install pipenv
- Clone this repo
git clone https://github.com/rosasalberto/automatic_translation_server
- Change directory and install the needed dependencies in a virtual environment and activate it
cd automatic_translation_server
pipenv install --dev --python 3.7.2
pipenv shell
- Download the Language Detection (LID) model from the provided link: https://tinyurl.com/nllblid218e and add it to the '/weights' folder
- Configure the server by modifying the config.py file:
  - Modify translation_langs to include the languages you want to be able to translate, using the Flores200 language codes.
  - Modify lid_path to the full path of the LID model.
  - Modify path_toxicity_data to the full path to the toxicity vocab files.
- Run the server
uvicorn server:app --reload
- Set up application from https://github.com/rosasalberto/automatic_translation_frontend
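For the configuration step above, the relevant part of config.py might look like the sketch below. The three setting names come from this README. Everything else is an assumption: the values are placeholders, the use of a Python list for translation_langs is a guess at the expected type, and the file name of the LID model is hypothetical. The language codes shown (eng_Latn, spa_Latn, fra_Latn) are FLORES-200 codes for English, Spanish and French.

```python
# config.py (sketch): setting names come from the steps above, values are placeholders.

# FLORES-200 codes of the languages the server should be able to translate.
# Assumed to be a list of strings.
translation_langs = ["eng_Latn", "spa_Latn", "fra_Latn"]

# Full path to the downloaded LID model (placeholder directory and file name).
lid_path = "/full/path/to/weights/lid_model.bin"

# Full path to the toxicity vocab files (placeholder directory).
path_toxicity_data = "/full/path/to/toxicity_data"
```

Keeping translation_langs short is a reasonable starting point while testing, since the translation endpoint translates into every configured language when langs_out is not given.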
To build a Docker image, you need to have Docker installed on your machine. If you don't have it already, you can install it by following the instructions on the Docker website: https://docs.docker.com/get-docker/
- Get the NVIDIA image for CUDA 11.6
docker pull nvidia/cuda:11.6.2-base-ubuntu20.04
- Build docker Image
docker build -t translation-service .
- Run Image
docker run --gpus all -p 8080:8080 translation-service
- Optional: Upload your Image to the Docker Hub
- During package installation, if you experience problems related to the Python version, try the following command, which forces pipenv to use a given version of Python:
pipenv install --dev --python 3.7.2
- If you do not have Python 3.7.2 on your system, you can:
  - Install Python using your operating system's package manager. On Linux systems, you can use apt-get or yum, and on macOS you can use brew.
  - Download the Python installer from the official Python website (https://www.python.org/) and run it to install Python on your system.
  - Use a version manager such as pyenv or asdf to install and manage multiple versions of Python on your system.
Once you have a Python interpreter installed, you should be able to use pipenv to install the dependencies for your project.
If you experience any problems, don't hesitate to contact: rosas.alberto.upc@gmail.com