GreesyGuard is a text moderation model trained on Gitpod to identify and filter inappropriate content. GreesyGuard was created with the help of claude-3.5-sonnet.
- Clone the repository:
git clone https://github.com/Nicat-dcw/greesyguard.git
cd greesyguard
- Install the required packages:
pip install -r requirements.txt
- Prepare your dataset:
Ensure your dataset contains the fields `tweet` and `label`.
- Train the model:
python train.py
- Run inference:
python inference.py
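
Before training, it can help to confirm that the dataset really has the `tweet` and `label` fields mentioned in the dataset step. The following is a minimal sketch that assumes the dataset is a CSV file with a header row; the file format, the helper name `missing_fields`, and the sample row are assumptions for illustration and are not taken from the repository.

```python
import csv
import io

# Field names taken from the "Prepare your dataset" step.
REQUIRED_FIELDS = ("tweet", "label")


def missing_fields(csv_file):
    """Return the required field names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in REQUIRED_FIELDS if name not in header]


# Hypothetical sample data; the real dataset may be stored differently.
sample = io.StringIO("tweet,label\nhave a nice day,0\n")
problems = missing_fields(sample)
if problems:
    print("Missing fields:", ", ".join(problems))
else:
    print("Dataset has all required fields.")
```

To check a file on disk, pass an open file handle instead, for example `missing_fields(open("data.csv", newline=""))`, where `data.csv` is a placeholder path.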
We used this dataset to train the model for the benchmark below:
Model | HumanEval | SG Prompt |
---|---|---|
GreesyGuard-2 | 42% | 89.7 |
Text-mod-007 | 16% | 85.6 |
ShieldGemma(2b) | No Data | No Data |
- Increased Vocab size
- Tokenizer (p50>cl100k)
- Max length (128>2048)
- Learning rate (2e-5)
- Hugging Face `datasets` support
- Better learning handling
- API Support (OpenAI)
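
The numeric settings in the list above could be gathered in one place roughly as follows. This is only a sketch: the class name `TrainingConfig`, the helper `truncate`, and the way the values are grouped are hypothetical and do not come from the repository's `train.py`; only the values themselves (cl100k, 2048, 2e-5) are taken from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:  # hypothetical name, not from the repository
    tokenizer: str = "cl100k"    # written as in the list: p50 > cl100k
    max_length: int = 2048       # raised from 128
    learning_rate: float = 2e-5  # as listed


def truncate(token_ids, config):
    """Keep at most config.max_length token ids from the start of the sequence."""
    return token_ids[: config.max_length]


config = TrainingConfig()
long_input = list(range(5000))  # stand-in for a tokenized text
print(len(truncate(long_input, config)))
```

With the previous limit of 128, the same input would be cut to 128 token ids, so longer texts lose less context under the new setting.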
Next version: when the repository reaches 10 stars