Hamzenium/Reinforcement-learning-LLM
Fine-tunes Google's FLAN-T5 to generate non-toxic dialogue summaries, combining Hugging Face Transformers, toxicity detection, detoxification, and PPO-based training.
Jupyter Notebook