Integration with Huggingface Trainer
While trying to apply AnaLog to LLM pre-training/fine-tuning, I realized that almost all existing codebases are built on top of Huggingface Trainer, e.g.:
- Stanford Alpaca: https://github.com/tatsu-lab/stanford_alpaca/blob/main/train.py
- lmsys Vicuna: https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train_lora.py
- axolotl: https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/src/axolotl/core/trainer_builder.py
Therefore, I believe it's wiser to work on an AnaLog + HF Trainer integration than to write training code for these LLMs from scratch. This feature will be crucial for wide adoption of AnaLog.
If we want to integrate "log extraction" into HF Trainer, we can most likely use `TrainerCallback` as below:
```python
from transformers import Trainer, TrainerCallback


class AnaLogCallback(TrainerCallback):
    def on_train_end(self, args, state, control, **kwargs):
        """AnaLog logging"""
        # AnaLog log extraction would be triggered here.


trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=[AnaLogCallback()],
)
```
However, we may be able to skip the training procedure completely if we have direct access to the trained model. Even then, it would still be nice to leverage HF Trainer for its various optimizations (e.g., gradient checkpointing, FSDP), since our log extraction code is quite similar to training code (reference: https://github.com/sangkeun00/analog/blob/main/examples/bert_influence/extract_log.py). I am not particularly familiar with HF Trainer, so if anyone tagged below who is more familiar with it can help, that would be very much appreciated!
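For the "skip training entirely" route, one possible (untested) trick is to run a regular `Trainer` pass with the learning rate set to zero: forward/backward (and any callback-driven log extraction) still go through Trainer's optimized machinery, while the already-trained weights stay untouched since AdamW scales every update by the learning rate. A minimal sketch, reusing the variables from the snippet above:

```python
from transformers import Trainer, TrainingArguments

# Forward/backward-only pass: with lr=0, AdamW never changes the
# (already trained) weights, but Trainer's machinery is fully exercised.
extraction_args = TrainingArguments(
    output_dir="analog_logs",
    num_train_epochs=1,
    learning_rate=0.0,            # no effective weight updates
    per_device_train_batch_size=8,
    gradient_checkpointing=True,  # memory savings for large models
    # fsdp="full_shard",          # FSDP via the same args when launched distributed
    report_to="none",
)

trainer = Trainer(
    model=model,                   # the already-trained model
    args=extraction_args,
    train_dataset=train_dataset,
    callbacks=[AnaLogCallback()],  # or a per-step variant as sketched above
)
trainer.train()  # one pass over the data; effectively log extraction only
```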
Addressed in #73