This project is a collaboration between DreamSpace Academy (DSA), the NYU Center on International Cooperation (NYU CIC), and Omdena, and was funded by NYU CIC. The goal of the project is to detect hate speech on social media platforms in Tamil, English, or Tanglish (Tamil written in the Latin script, often mixed with English). A global team of 50 AI changemakers took on the task. The partner for this challenge is the social enterprise DreamSpace Academy, and the challenge is supported by the NYU Center on International Cooperation and the Netherlands Ministry of Foreign Affairs.
The focus is on the following hate-speech related categories:
- Community-based hate speech
- Religion-based hate speech
- Gender-based hate speech
- Political hate speech
- An AI model written in Python: built using FastAPI and Streamlit, keeping the complete code base in Python.
- Clone the repo.
- Run the backend service. (Make sure Docker is running.)
  - Go to the `backend` folder.
  - Run the Docker Compose command:

        $ cd backend
        backend:~$ sudo docker-compose up -d

- Run the frontend service.
  - Go to the `frontend` folder.
  - Run the app with the `streamlit run` command:

        $ cd frontend
        frontend:~$ streamlit run NLPfile.py

- Access the FastAPI documentation:
  - Hate Classification: http://localhost:8080/api/v1/classification/docs
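
After the backend container starts, the API may take a moment to answer. The following is a minimal sketch of a readiness check, assuming only that the documentation URL above responds once the service is up. The helper `wait_for_backend` and its parameters are hypothetical and not part of the repository.

```python
import time
import urllib.request

# URL taken from the documentation link above.
DOCS_URL = "http://localhost:8080/api/v1/classification/docs"


def wait_for_backend(url=DOCS_URL, attempts=10, delay=2.0,
                     opener=urllib.request.urlopen):
    """Poll `url` until it answers with HTTP 200 or the attempts run out.

    Hypothetical helper: returns True once the backend responds, else False.
    `opener` is injectable so the function can be exercised without a server.
    """
    for attempt in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers URLError, refused connections, and timeouts:
            # the backend is not ready yet.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_backend()` after `docker-compose up -d` returns `True` as soon as the docs page is served, which is a convenient point to start the Streamlit frontend.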
- Front end: the Streamlit code is in the `frontend` folder, along with the `Dockerfile` and `requirements.txt`.
- Back end: the FastAPI code is in the `backend` folder.
  - The project is implemented as a microservice with its own FastAPI server, requirements, and Dockerfile.
  - The directory tree is as follows:
    - classification
      - app
        - api
          - bert_model_artifacts
            - model.bin
            - network.py
  - Each model folder needs the following files:
    - `model.bin`: the saved model after training.
    - `network.py`: for a customised model, define the model class here.
  - `config.json`: this file contains the details of the models in the backend and the dataset they are trained on.
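
The per-model file requirements above can be verified before the service starts. This is a minimal sketch that assumes only the two file names listed above. The helper `missing_artifacts` is hypothetical, and nothing is assumed about the layout of `config.json`.

```python
from pathlib import Path

# File names taken from the list above; each model folder is expected to hold them.
REQUIRED_FILES = ("model.bin", "network.py")


def missing_artifacts(model_dir):
    """Return the required files that are absent from `model_dir` (hypothetical helper)."""
    folder = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

For example, `missing_artifacts("classification/app/api/bert_model_artifacts")` returns an empty list when both files are present. The path here simply follows the directory tree above and should be adjusted to wherever that tree lives on disk.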
- Run the following script with your desired text input as the `data` variable:

      $ cd backend
      backend:~$ python test_api.py

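
The contents of `test_api.py` are not reproduced here. As a rough sketch of what such a request could look like, the example below assumes a hypothetical `predict` route under the classification API and a JSON body with a `text` field. Both are placeholders; the FastAPI documentation page lists the actual route and schema.

```python
import json
import urllib.request

# ASSUMPTION: the route and the payload field name are hypothetical placeholders.
# Check http://localhost:8080/api/v1/classification/docs for the real ones.
API_URL = "http://localhost:8080/api/v1/classification/predict"


def build_request(data, url=API_URL):
    """Build (but do not send) a JSON POST request carrying the input text."""
    body = json.dumps({"text": data}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(data, url=API_URL):
    """Send the request to the running backend and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(data, url), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Encoding the body as UTF-8 with `ensure_ascii=False` keeps Tamil input readable on the wire, which matters for a service that accepts Tamil, English, and Tanglish text.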
This project is licensed under the Apache License 2.0. You may not use any trademarks associated with the software without permission. The full text of the license can be found in the LICENSE file.