This is a simple API for identifying the language of a given text using the pre-trained model FastText. The API is built using the FastText library for loading the pre-trained model and making predictions.
The API accepts a text string in a POST request to the endpoint / in JSON format and returns the identified language and confidence score in JSON format.
git clone git@github.com:flexchar/language-identification-api.gitdocker build -t language-identification-api .docker run -rm -p 8000:8000 language-identification-apiThis will start the API endpoint on port 8000 of the container and map it to port 8000 on the host.Send a POST request to the endpoint with the text you want to identify.json
curl -X POST -H "Content-Type: application/json" -d '{"text":"This is a sample text"}' http://localhost:8000/This will return the identified language and confidence score.
Make sure you have Docker installed on your system before running the above commands. The pre-trained model 'lid.176.bin' is already available in the repo and is copied to the container during the build process. You may use the image pushed to the Github registry alongside this repo.
I've built this endpoint for use inside internal applications in a closed environment. For production, you may use reverse proxy such as Nginx or Apache to proxy the requests to Gunicorn, to have more control over the requests endpoint and supply with the TLS certificates.