Integration with Huggingface Trainer
While trying to apply AnaLog to LLM pre-training/fine-tuning, I realized that almost all existing codebases are built on top of Huggingface Trainer, e.g.:
- Stanford Alpaca: https://github.com/tatsu-lab/stanford_alpaca/blob/main/train.py
- lmsys Vicuna: https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train_lora.py
- axolotl: https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/src/axolotl/core/trainer_builder.py
Therefore, I believe it's wiser to work on an AnaLog + HF Trainer integration than to write training code for these LLMs from scratch. This feature will be crucial for wide adoption of AnaLog.
If we want to integrate "log extraction" into HF Trainer, we can most likely use `TrainerCallback` as below:
```python
from transformers import Trainer, TrainerCallback


class AnaLogCallback(TrainerCallback):
    def on_train_end(self, args, state, control, **kwargs):
        """AnaLog logging"""
        # AnaLog log extraction would be triggered here.


trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=[AnaLogCallback()],
)
```
However, we may be able to skip the training procedure completely if we have direct access to the trained model. Even then, it would still be nice to leverage HF Trainer for its various optimizations (e.g., gradient checkpointing, FSDP), since our log extraction code is quite similar to training code (reference: https://github.com/sangkeun00/analog/blob/main/examples/bert_influence/extract_log.py). I am not particularly familiar with HF Trainer, so if anyone tagged below who is more familiar with it can help, that would be very much appreciated!
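For the "skip training entirely" route, one possible (untested) trick is to run a regular `Trainer` pass with the learning rate set to zero: forward/backward (and any callback-driven log extraction) still go through Trainer's optimized machinery, while the already-trained weights stay untouched since AdamW scales every update by the learning rate. A minimal sketch, reusing the variables from the snippet above:

```python
from transformers import Trainer, TrainingArguments

# Forward/backward-only pass: with lr=0, AdamW never changes the
# (already trained) weights, but Trainer's machinery is fully exercised.
extraction_args = TrainingArguments(
    output_dir="analog_logs",
    num_train_epochs=1,
    learning_rate=0.0,            # no effective weight updates
    per_device_train_batch_size=8,
    gradient_checkpointing=True,  # memory savings for large models
    # fsdp="full_shard",          # FSDP via the same args when launched distributed
    report_to="none",
)

trainer = Trainer(
    model=model,                   # the already-trained model
    args=extraction_args,
    train_dataset=train_dataset,
    callbacks=[AnaLogCallback()],  # or a per-step variant as sketched above
)
trainer.train()  # one pass over the data; effectively log extraction only
```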
Addressed in #73