This project provides a Streamlit application to analyze the toxicity of text inputs using a pre-trained BERT model. The application classifies text into multiple categories of toxicity, including `toxic`, `severe_toxic`, `obscene`, `threat`, `insult`, and `identity_hate`.
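A rough sketch of how such multi-label classification can work, assuming a Hugging Face `BertForSequenceClassification` head with six sigmoid outputs (the repository's actual model class, preprocessing, and label order are defined in `app.py` and may differ):

```python
# Illustrative sketch only; the real model wrapper lives in app.py.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",
)
# In the app, the fine-tuned weights from model/toxic.pt would be loaded here.
model.eval()

def classify(text: str) -> dict:
    # One forward pass; sigmoid gives independent per-label probabilities
    # (multi-label), rather than a softmax over mutually exclusive classes.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0)
    return {label: float(p) for label, p in zip(LABELS, probs)}

print(classify("I am friendly"))
```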
- Text Analysis: Classifies text input into multiple categories of toxicity.
- Interactive Web Interface: Uses Streamlit for a user-friendly interface.
- Model Training: Includes a Jupyter notebook for training the BERT model on a custom dataset.
- Python 3.8 or higher
- Pip package manager
- Clone the repository:

  ```bash
  git clone https://github.com/kvba1/Text-Toxicity-Analysis
  cd Text-Toxicity-Analysis
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Download the pre-trained model: place your trained model (`toxic.pt`) in the `./model` directory (a hypothetical loading sketch follows these steps).

- Run the Streamlit app:

  ```bash
  streamlit run app.py
  ```
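The loading logic itself is defined in `app.py`; as a minimal sketch of picking up the checkpoint from `model/toxic.pt` (assuming the file is a state dict saved with `torch.save(model.state_dict(), ...)`, which may not match the actual format):

```python
# Hypothetical loading sketch; app.py may store/restore the model differently.
import torch
from transformers import BertForSequenceClassification

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)
# Assumes toxic.pt holds a state dict compatible with this architecture.
model.load_state_dict(torch.load("model/toxic.pt", map_location="cpu"))
model.eval()
```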
- Enter Text: Type or paste the text you want to analyze into the input box.
- Analyze: Click the "Analyze" button to classify the text.
- View Results: The app displays the classification results and maintains a history of analyzed texts (a simplified sketch of this flow is shown below).
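As a simplified, hypothetical sketch of that flow (widget labels and the `classify` stub below are illustrative; the real interface is defined in `app.py`):

```python
# Simplified, hypothetical sketch of the Streamlit flow; app.py is authoritative.
import streamlit as st

def classify(text: str) -> dict:
    # Placeholder: the real app runs the BERT model here (see the sketch above).
    labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
    return {label: 0.0 for label in labels}

st.title("Text Toxicity Analysis")
text = st.text_area("Enter the text you want to analyze")

if "history" not in st.session_state:
    st.session_state.history = []

if st.button("Analyze") and text:
    st.session_state.history.append((text, classify(text)))

# Show results, most recent first, as a running history of analyzed texts.
for past_text, results in reversed(st.session_state.history):
    st.write(past_text)
    st.json(results)
```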
To train the model, you can use the provided Jupyter notebook `train.ipynb`:

- Open the notebook:

  ```bash
  jupyter notebook train.ipynb
  ```

- Follow the instructions: the notebook contains detailed steps for training the BERT model on a toxicity dataset (a condensed outline of such a fine-tuning loop is sketched below).
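The notebook is the authoritative recipe; as a condensed, hypothetical outline of what multi-label BERT fine-tuning generally looks like (the dataset, label values, and hyperparameters below are placeholders, not the notebook's actual choices):

```python
# Condensed, hypothetical fine-tuning outline; see train.ipynb for the real steps.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertTokenizer, BertForSequenceClassification

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # BCE-with-logits loss per label
)

# Placeholder data; the notebook trains on a full toxicity dataset.
texts = ["you are awful", "have a nice day"]
labels = torch.tensor([[1, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 0]], dtype=torch.float)

enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
loader = DataLoader(
    TensorDataset(enc["input_ids"], enc["attention_mask"], labels), batch_size=2
)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(1):  # the notebook will likely train for more epochs
    for input_ids, attention_mask, y in loader:
        loss = model(input_ids=input_ids, attention_mask=attention_mask, labels=y).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Save weights where the app expects them (the ./model directory).
torch.save(model.state_dict(), "model/toxic.pt")
```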
- `app.py`: The main application file for the Streamlit app.
- `train.ipynb`: Jupyter notebook for training the BERT model.
- `requirements.txt`: List of Python dependencies.
- `model/toxic.pt`: Pre-trained model weights (not included; must be downloaded separately).
- `README.md`: Project documentation.
For example, the input `I am friendly` produces the following classification:
| toxic | severe_toxic | obscene | threat | insult | identity_hate |
|-------|--------------|---------|--------|--------|---------------|
| 0 | 0 | 0 | 0 | 0 | 0 |
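If the app reports binary flags like the row above, a natural way to produce them is to threshold the per-label probabilities; a minimal illustration (the 0.5 cutoff is an assumption, not necessarily what `app.py` uses):

```python
# Hypothetical post-processing: turn per-label probabilities into 0/1 flags.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def to_flags(probs: dict, threshold: float = 0.5) -> dict:
    # A benign input such as "I am friendly" should score low on every label,
    # giving the all-zero row shown above.
    return {label: int(probs[label] >= threshold) for label in LABELS}

print(to_flags({label: 0.02 for label in LABELS}))
```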
Contributions are welcome! Please fork the repository and submit a pull request.
This project is licensed under the MIT License.