ShepherdMe.ai

This repository contains code to instantiate and deploy a toxic comment classifier, along with a custom UI wrapper. The model detects six types of toxicity in a text fragment: toxic, severe toxic, obscene, threat, insult, and identity hate.

The MAX Toxic Comment Classifier model is based on the pre-trained BERT-Base, English Uncased model and was fine-tuned on the Toxic Comment Classification Dataset using the Hugging Face BERT PyTorch repository.
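Because the six labels are not mutually exclusive, the classifier produces an independent probability for each label rather than a single class. As an illustration only, the sketch below shows how such a multi-label BERT classifier could be loaded and queried with Hugging Face Transformers; the `bert-base-uncased` checkpoint name, the `score` helper, and the 256-token limit are placeholders, not this repository's actual code.

```python
# Illustrative sketch only: a BERT model with a six-way multi-label head.
# "bert-base-uncased" is a stand-in; the real fine-tuned weights ship with
# the deployed model, not with this snippet.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",
)
model.eval()

def score(text: str) -> dict:
    """Return one independent probability per toxicity label."""
    inputs = tokenizer(text, truncation=True, max_length=256, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    probs = torch.sigmoid(logits)  # sigmoid, not softmax: labels are not exclusive
    return {label: float(p) for label, p in zip(LABELS, probs)}

print(score("This is a perfectly polite sentence."))
```

With an untrained classification head the printed probabilities are meaningless; the sketch only shows the input and output shape a fine-tuned checkpoint would use.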

Brief definitions of the six toxicity types can be found below.

Toxic: very bad, unpleasant, or harmful

Severe toxic: extremely bad and offensive

Obscene: (of the portrayal or description of sexual matters) offensive or disgusting by accepted standards of morality and decency

Threat: a statement of an intention to inflict pain, injury, damage, or other hostile action on someone in retribution for something done or not done

Insult: speak to or treat with disrespect or scornful abuse

Identity hate: hatred, hostility, or violence towards members of a race, ethnicity, nation, religion, gender, gender identity, sexual orientation or any other designated sector of society

Licenses

Component                            | License    | Link
------------------------------------ | ---------- | --------
This repository                      | Apache 2.0 | LICENSE
Finetuned Model Weights              | Apache 2.0 | LICENSE
Pre-trained Model Weights            | Apache 2.0 | LICENSE
TensorFlow Model Code (3rd party)    | Apache 2.0 | LICENSE
PyTorch Model Code (3rd party)       | Apache 2.0 | LICENSE
Toxic Comment Classification Dataset | CC0        | LICENSE

Pre-requisites:

  • docker: The Docker command-line interface. Follow the installation instructions for your system.
  • The minimum recommended resources for this model are 4 GB of memory and 4 CPUs (a sketch of querying the running container follows this list).
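Once the container is running, clients can score text over HTTP. The sketch below is an assumption modeled on the usual MAX model layout, where a POST to /model/predict on port 5000 takes a JSON body with a list of texts; the host, port, and route should be adjusted to match however this repository's service and UI wrapper are actually configured.

```python
# Hypothetical client: the endpoint URL and request shape are assumptions
# based on the standard MAX model REST API, not confirmed by this repository.
import requests

ENDPOINT = "http://localhost:5000/model/predict"  # assumed default host/port/route

def classify(texts):
    """Send text fragments to the service and return its JSON response."""
    response = requests.post(ENDPOINT, json={"text": texts}, timeout=30)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(classify(["I would like to respectfully disagree."]))
```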

Installation Guide

TBD