Hamzenium/Reinforcement-learning-LLM
Fine-tuned Google's FLAN-T5 for non-toxic dialogue summaries using Hugging Face Transformers, toxicity detection, detoxification, and PPO-based training.
Jupyter Notebook