google-gemini/gemma-cookbook

Aligning a Gemma 2B model with DPO.

Closed this issue · 5 comments

Description of the feature request:

I want to create a new notebook for the cookbook showing how to align a Gemma model using DPO with a public dataset from Hugging Face.

I'm going to use this notebook as a base:
Aligning_DPO_open_gemma-2b-it.ipynb

I will add more explanations and adapt the format to meet the cookbook's requirements.
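For context, this is roughly the shape of the DPO step I have in mind, using Hugging Face TRL. It is only a minimal sketch: the dataset, output directory and hyperparameters are illustrative placeholders, and the exact trl API (for example DPOConfig and the tokenizer argument) may vary between versions.

```python
# Minimal sketch of the DPO step with Hugging Face TRL. The dataset, output
# directory and hyperparameters are illustrative placeholders; the exact trl
# API may differ slightly between versions.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "google/gemma-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Any public preference dataset with chosen/rejected pairs works; this one is
# just an example and may not be the one used in the notebook.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

training_args = DPOConfig(
    output_dir="gemma-2b-it-dpo",   # placeholder output directory
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=5e-5,
    max_steps=100,                  # short run for demonstration purposes
    beta=0.1,                       # strength of the KL penalty vs. the reference model
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,            # TRL builds a frozen reference copy when this is None
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,       # newer trl releases name this argument processing_class
)
trainer.train()
```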

What problem are you trying to solve with this feature?

No response

Any other information you'd like to share?

No response

This looks interesting. Could you

  1. follow the steps here?

  2. come up with another inference demo, since the current one follows the instruction but gives a mathematically wrong answer?

Thanks @windmaple.

The model's behavior changes with the alignment process: in the response from the unaligned model, the instruction to return only numbers is ignored. I will adapt the inference example to an operation that returns the correct value, or look for another example, along the lines of the sketch below.
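A possible replacement demo asks the aligned model for a simple operation whose result is easy to check, so both instruction-following and correctness can be verified. The checkpoint path, prompt and expected answer below are placeholders, not the notebook's actual values.

```python
# Hypothetical replacement inference demo: the checkpoint path, prompt and
# expected answer are placeholders, not the notebook's actual values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gemma-2b-it-dpo"  # local path of the DPO-aligned checkpoint (assumed name)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

chat = [{"role": "user", "content": "What is 12 * 12? Answer with the number only."}]
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=10, do_sample=False)
answer = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(answer)  # expected "144", so both the instruction and the arithmetic can be checked
```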

On the other hand, I'm not sure whether to keep the code that publishes the model on Hugging Face as an example, or if it would be better to remove it and focus the notebook solely on the DPO process.

Yes, please keep this part of the code. We want to make it as easy as possible to publish any fine-tuned model.
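For reference, publishing usually only takes a couple of calls once the model is saved. A minimal sketch is below; the repository name is a placeholder and an authenticated Hugging Face token is assumed.

```python
# Sketch of publishing the fine-tuned model to the Hugging Face Hub. The
# repository name is a placeholder, and an authenticated token (for example
# via `huggingface-cli login`) is assumed.
from transformers import AutoModelForCausalLM, AutoTokenizer

output_dir = "gemma-2b-it-dpo"  # local directory saved by the trainer (assumed name)
model = AutoModelForCausalLM.from_pretrained(output_dir)
tokenizer = AutoTokenizer.from_pretrained(output_dir)

model.push_to_hub("your-username/gemma-2b-it-dpo")
tokenizer.push_to_hub("your-username/gemma-2b-it-dpo")
```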

Thank you for your contribution👍👍! We will try to get it featured on Google's social account soon.

Thank you! It was a pleasure!