Do not train on inputs feature for SFT

Question

Closed this issue 8 months ago · 3 comments

freQuensy23-coder commented 8 months ago

Please add the ability to train only on the model's outputs (the completions) in the SFT trainer. This can be implemented by setting the labels of the input tokens to -100 so that the loss ignores them, as in the alpaca-lora implementation: https://github.com/tloen/alpaca-lora/blob/8bb8579e403dc78e37fe81ffbb253c413007323f/finetune.py#L167
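For readers unfamiliar with the trick, here is a minimal sketch of the label masking described above. It assumes the prompt and completion are already tokenized into lists of ids; the function name `build_example` and the sample ids are made up for illustration and come from neither alpaca-lora nor xllm. The value -100 is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so positions carrying it contribute nothing to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_example(prompt_ids, completion_ids):
    """Concatenate prompt and completion, masking the prompt in the labels.

    Hypothetical helper for illustration only.
    """
    input_ids = list(prompt_ids) + list(completion_ids)
    # Prompt positions get IGNORE_INDEX, completion positions keep their ids.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return {"input_ids": input_ids, "labels": labels}


# Made-up token ids: three prompt tokens followed by two completion tokens.
example = build_example([11, 12, 13], [21, 22])
print(example["labels"])
```

Only the last two positions keep real label ids here, so only the completion tokens would be scored.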

Answer 1 · 2023-11-18T18:58:32.000Z

Do I understand correctly that you would like to have the ability to train a model to predict only the completion? That is, to calculate the loss only for the completion?

If so, then this is currently possible. There is a CompletionCollator that implements what you desire.

Here's the link to it: https://github.com/BobaZooba/xllm/blob/main/GUIDE.md#completioncollator

To select it, set the collator_key parameter in the config to completion:

config = Config(collator_key="completion")

Or

python train.py --collator_key completion

To use this collator, you will need to implement your own dataset. Instructions on how to do this are provided here: https://github.com/BobaZooba/xllm/blob/main/GUIDE.md#how-to-implement-dataset

The get_sample method should return a dictionary of this type:

{
    "text_parts": [
        "Hello!",
        "My name is Boris"
    ]
}
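As a rough illustration of producing that shape, the sketch below converts a prompt/completion record into the dictionary above. The helper `to_text_parts` and the record keys `prompt` and `completion` are assumptions made for this example, not part of xllm; in a real dataset this logic would live in your own get_sample implementation as described in the guide.

```python
def to_text_parts(record):
    """Turn a prompt/completion record into the text_parts format.

    Hypothetical helper; the record keys are assumed for illustration.
    """
    # The completion goes last, since it is the last string that gets the loss.
    return {"text_parts": [record["prompt"], record["completion"]]}


sample = to_text_parts({"prompt": "Hello!", "completion": "My name is Boris"})
print(sample)
```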

That is, the text_parts key should hold a list of strings. With this collator, the loss is calculated only for the last string in the list. You can also specify a prefix of that string to exclude from the loss calculation. This is useful for data formats where a prefix marks the assistant's turn. Example:

{
    "text_parts": [
        "Human: Calculate 1 + 1",
        "Assistant: 1 + 1 equals 2"
    ]
}
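To make the intended behavior concrete, here is a toy sketch of which tokens would count toward the loss for the example above. It is not the CompletionCollator implementation: it splits on whitespace instead of using a real tokenizer, and the function `completion_mask` and its `prefix` argument are hypothetical names chosen for this illustration.

```python
def completion_mask(text_parts, prefix=""):
    """Return (tokens, mask); mask is True where the loss would apply.

    Toy whitespace "tokenizer", for illustration only.
    """
    *context, last = text_parts
    tokens, mask = [], []

    # All earlier parts are context: no loss.
    for part in context:
        words = part.split()
        tokens += words
        mask += [False] * len(words)

    # An optional prefix of the last part is also excluded from the loss.
    if prefix and last.startswith(prefix):
        prefix_words = prefix.split()
        tokens += prefix_words
        mask += [False] * len(prefix_words)
        last = last[len(prefix):]

    # Whatever remains of the last part is the completion: loss applies.
    words = last.split()
    tokens += words
    mask += [True] * len(words)
    return tokens, mask


tokens, mask = completion_mask(
    ["Human: Calculate 1 + 1", "Assistant: 1 + 1 equals 2"],
    prefix="Assistant:",
)
scored = [t for t, m in zip(tokens, mask) if m]
print(scored)
```

In this sketch only the words after the "Assistant:" prefix end up in `scored`; the human turn and the prefix itself are masked out.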

Thank you for pointing out this need. In the future, I will make the GeneralDataset more convenient, so that it can also accept this data format in order to use the CompletionCollator.

Does my response address your query?

Answer 2 · 2023-12-01T16:58:38.000Z

Closing this issue due to lack of activity

Answer 3 · 2024-01-24T09:38:59.000Z

Thank you for answering.

Is this compatible with DeepSpeed?