REINFORCEMENT LEARNING FROM HUMAN FEEDBACK

A conceptual and hands-on introduction to tuning and evaluating large language models (LLMs) using Reinforcement Learning from Human Feedback.

  • Get a conceptual understanding of Reinforcement Learning from Human Feedback (RLHF), as well as the datasets needed for this technique (example dataset records are sketched after this list)
  • Fine-tune the Llama 2 model using RLHF with the open-source Google Cloud Pipeline Components Library (see the pipeline-launch sketch below)
  • Evaluate the tuned model's performance against the base model, for example through side-by-side comparison of completions (see the final sketch below)

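EXAMPLE SKETCHES

RLHF trains a reward model from human preference data, then optimizes the LLM against that reward model on a prompt-only dataset. Below is a minimal sketch of the two JSON Lines dataset formats, assuming the field names used by the Vertex AI RLHF preference schema (input_text, candidate_0, candidate_1, choice); verify the exact schema against the course notebooks.

    import json

    # Preference example: one prompt, two candidate completions, and the
    # index of the candidate the human rater preferred (assumed schema).
    preference_example = {
        "input_text": "Summarize: The quick brown fox jumps over the lazy dog ...",
        "candidate_0": "A fox jumps over a dog.",
        "candidate_1": "The dog is lazy.",
        "choice": 0,  # the rater preferred candidate_0
    }

    # Prompt-only example used during the reinforcement learning phase.
    prompt_example = {"input_text": "Summarize: Large language models are ..."}

    with open("preference_dataset.jsonl", "w") as f:
        f.write(json.dumps(preference_example) + "\n")
    with open("prompt_dataset.jsonl", "w") as f:
        f.write(json.dumps(prompt_example) + "\n")
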
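Tuning itself runs as a prebuilt Vertex AI pipeline. A minimal launch sketch, assuming the preview rlhf_pipeline exposed by the google_cloud_pipeline_components library and hypothetical project, bucket, and step-count values; the parameter names follow the preview API but should be checked against the library version you install.

    from kfp import compiler
    from google.cloud import aiplatform
    from google_cloud_pipeline_components.preview.llm import rlhf_pipeline

    # Compile the prebuilt RLHF pipeline into a local YAML definition.
    compiler.Compiler().compile(
        pipeline_func=rlhf_pipeline,
        package_path="rlhf_pipeline.yaml",
    )

    # Hypothetical parameter values; large_model_reference selects the base
    # model to tune, and kl_coeff penalizes drift away from the base model.
    parameter_values = {
        "preference_dataset": "gs://my-bucket/preference_dataset.jsonl",
        "prompt_dataset": "gs://my-bucket/prompt_dataset.jsonl",
        "eval_dataset": "gs://my-bucket/eval_dataset.jsonl",
        "large_model_reference": "llama-2-7b",
        "reward_model_train_steps": 1410,
        "reinforcement_learning_train_steps": 320,
        "kl_coeff": 0.1,
        "instruction": "Summarize in less than 50 words.",
    }

    aiplatform.init(project="my-project", location="us-central1",
                    staging_bucket="gs://my-bucket")

    job = aiplatform.PipelineJob(
        display_name="rlhf-tuning",
        template_path="rlhf_pipeline.yaml",
        parameter_values=parameter_values,
    )
    job.run()  # blocks until the pipeline run finishes
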
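For evaluation, one simple qualitative method is a side-by-side comparison of base and tuned completions on the same prompts. A sketch assuming two hypothetical bulk-inference output files and assumed field names (inputs_pretokenized, prediction):

    import json

    def load_jsonl(path):
        """Read one JSON object per line."""
        with open(path) as f:
            return [json.loads(line) for line in f]

    # Hypothetical file names for the two models' outputs on shared prompts.
    base = load_jsonl("eval_results_base.jsonl")
    tuned = load_jsonl("eval_results_tuned.jsonl")

    # Print completions side by side for manual inspection.
    for b, t in zip(base, tuned):
        print("PROMPT:", b["inputs"]["inputs_pretokenized"][:80])
        print("  base :", b["prediction"][:80])
        print("  tuned:", t["prediction"][:80])
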
INDEX

  1. How does RLHF Work
  2. Datasets for RL Training
  3. Tune an LLM with RLHF
  4. Evaluate the tuned model

COURSE LINK

https://learn.deeplearning.ai/courses/reinforcement-learning-from-human-feedback