BobaZooba/xllm

Do not train on inputs feature for SFT

Closed this issue · 3 comments

Add the ability to train only on the model's outputs in the SFT trainer. This can be implemented by masking the labels of the input tokens with the value -100
(see the alpaca-lora implementation: https://github.com/tloen/alpaca-lora/blob/8bb8579e403dc78e37fe81ffbb253c413007323f/finetune.py#L167).
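For readers unfamiliar with the trick, here is a minimal sketch of label masking. It assumes the loss function skips positions whose label is -100 (the default `ignore_index` of PyTorch's `CrossEntropyLoss`). The helper name `mask_prompt_labels` and the token ids are made up for illustration; this is not code from xllm or alpaca-lora.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids, prompt_length):
    """Copy input_ids into labels, ignoring the first prompt_length positions."""
    if not 0 <= prompt_length <= len(input_ids):
        raise ValueError("prompt_length must be between 0 and len(input_ids)")
    # Prompt positions get IGNORE_INDEX, completion positions keep their token id.
    return [IGNORE_INDEX] * prompt_length + list(input_ids[prompt_length:])


# Hypothetical token ids: the first four belong to the prompt, the rest to the completion.
input_ids = [101, 7592, 2088, 102, 2026, 2171]
labels = mask_prompt_labels(input_ids, prompt_length=4)
print(labels)
```

Only the last two positions contribute to the loss here, so the model is trained to predict the completion and not the prompt.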

Do I understand correctly that you would like to have the ability to train a model to predict only the completion? That is, to calculate the loss only for the completion?

If so, then this is currently possible. There is a CompletionCollator that implements what you desire.

Here's the link to it: https://github.com/BobaZooba/xllm/blob/main/GUIDE.md#completioncollator

It is specified in the config using the parameter collator_key, where the value should be completion.

config = Config(collator_key="completion")

Or

python train.py --collator_key completion

To use this collator, you will need to independently implement your own dataset. Instructions on how to do this are provided here: https://github.com/BobaZooba/xllm/blob/main/GUIDE.md#how-to-implement-dataset

The get_sample method should return a dictionary of this type:

{
    "text_parts": [
        "Hello!",
        "My name is Boris"
    ]
}
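As a rough illustration of that return shape, the sketch below uses a standalone class with a hypothetical name (`PromptCompletionDataset`) and hypothetical record keys (`prompt`, `completion`). It does not subclass anything from xllm; a real dataset should follow the base class described in the guide linked above.

```python
class PromptCompletionDataset:
    """Standalone stand-in for a custom dataset; not an xllm class."""

    def __init__(self, data):
        # data: list of dicts with hypothetical "prompt" and "completion" keys
        self.data = data

    def __len__(self):
        return len(self.data)

    def get_sample(self, index):
        record = self.data[index]
        # The completion goes last, since the loss is computed only for the last string.
        return {"text_parts": [record["prompt"], record["completion"]]}


dataset = PromptCompletionDataset(
    [{"prompt": "Hello!", "completion": "My name is Boris"}]
)
sample = dataset.get_sample(0)
print(sample)
```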

That is, under the key text_parts there should be a list of strings. When using this collator, the loss will only be calculated for the last string in the list. You can also specify a prefix in this string, so that it is not considered when calculating the loss. For instance, this can be useful for data formats where there is a prefix indicating an assistant. Example:

{
    "text_parts": [
        "Human: Calculate 1 + 1",
        "Assistant: 1 + 1 equals 2"
    ]
}
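To make the behaviour concrete, here is a simplified sketch of how completion-only labels with an ignored prefix could be built from `text_parts`. It is not the actual CompletionCollator code: `toy_tokenize` is a whitespace splitter standing in for a real tokenizer, `build_completion_labels` is a hypothetical helper, and the labels hold token strings where a real collator would hold token ids.

```python
IGNORE_INDEX = -100


def toy_tokenize(text):
    # Stand-in for a real tokenizer: split on whitespace.
    return text.split()


def build_completion_labels(text_parts, prefix=None):
    """Return (tokens, labels); only the last part, minus its prefix, is kept in labels."""
    tokens, labels = [], []
    last = len(text_parts) - 1
    for i, part in enumerate(text_parts):
        part_tokens = toy_tokenize(part)
        if i < last:
            # Context parts never contribute to the loss.
            part_labels = [IGNORE_INDEX] * len(part_tokens)
        else:
            n_prefix = 0
            if prefix and part.startswith(prefix):
                n_prefix = len(toy_tokenize(prefix))
            # Mask the prefix tokens, keep the rest of the completion.
            part_labels = [IGNORE_INDEX] * n_prefix + part_tokens[n_prefix:]
        tokens.extend(part_tokens)
        labels.extend(part_labels)
    return tokens, labels


tokens, labels = build_completion_labels(
    ["Human: Calculate 1 + 1", "Assistant: 1 + 1 equals 2"],
    prefix="Assistant:",
)
print(labels)
```

In this toy run the five tokens of the human turn and the `Assistant:` prefix are masked, and only `1 + 1 equals 2` remains in the labels.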

Thank you for pointing out this need. In the future, I will make the GeneralDataset more convenient, so that it can also accept this data format in order to use the CompletionCollator.

Does my response address your query?

Closing this issue due to lack of activity.

Thank you for answering.

Is this compatible with DeepSpeed?