What trainset is used to obtain the “Model with context extension via improved LoRA fine-tuning” (LoRA+)?
Hi, thanks for the great work. I have a question regarding the trainset used for the different types of models (fully fine-tuned, LoRA+, and the models for the extra experiments in the paper).
In the README, it states, "There is no need to make supervised fine-tuning upon the fine-tuned context extended models. It is all right to directly use the base model as Llama2-chat models, as the amount of long instruction following data is enough for SFT." However, in the paper, the caption of Figure 5 suggests that LoRA+ is trained on RedPajama.
I'm seeking clarification on the following points:
- Do the released models refer to those that have undergone unsupervised fine-tuning on RedPajama and were then tested on PG19?
- Is Table 9, which evaluates on the LongBench benchmark, the only case involving supervised fine-tuning with LongAlpaca-12k on top of models fine-tuned with RedPajama?
- Where can I find the performance of using only LongAlpaca-12k to derive the LoRA adapter, embedding, and norm layers? (A sketch of what I mean by these components is below.)
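For reference, this is roughly the setup I have in mind when I say "LoRA adapter, embedding, and norm layers": a minimal sketch using Hugging Face PEFT rather than this repo's training script, so the module names (`q_proj`, `embed`, `norm`, etc.) and hyperparameters are my own assumptions, not taken from the paper:

```python
# Minimal sketch (not the repo's actual training script): standard LoRA adapters
# plus trainable embedding and normalization layers, to be trained only on
# LongAlpaca-12k. Module names are assumptions based on the Llama-2 architecture
# in Hugging Face Transformers.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# LoRA adapters on the attention projections.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)

# Additionally unfreeze the embedding and normalization layers, which is what
# I understand the "+" (improved LoRA) to refer to.
for name, param in model.named_parameters():
    if "embed" in name or "norm" in name:
        param.requires_grad = True

model.print_trainable_parameters()
```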
I've drafted a table to summarize my understanding of the training configurations mentioned in both the README and the paper. Could you please confirm if this representation is correct?

| | RedPajama (unsupervised) | LongAlpaca-12k (supervised) |
|---|---|---|
| Fully fine-tuned (README) | √ | |
| LoRA+ (README) | √ | |
| Models for LongBench benchmark (paper) | √ | √ |