dvlab-research/LongLoRA

What trainset is used to obtain "Model with context extension via improved LoRA fine-tuning" (LoRA+)?

ZackZikaiXiao opened this issue · 0 comments

Hi, thanks for the great work. I have a question about the training sets used for the different types of models (fully fine-tuned, LoRA+, and the models used for the extra experiments in the paper).

The README states, "There is no need to make supervised fine-tuning upon the fine-tuned context extended models. It is all right to directly use the base model as Llama2-chat models, as the amount of long instruction following data is enough for SFT." In the paper, however, Figure 5's caption suggests that LoRA+ is trained with RedPajama.

I'm seeking clarification on the following points:

  1. Do the released models refer to those that have undergone unsupervised fine-tuning on RedPajama and were then evaluated on PG19?
  2. Is Table 9, which reports results on the LongBench benchmark, the only experiment involving supervised fine-tuning with LongAlpaca-12k on top of models already fine-tuned with RedPajama?
  3. Where can I find the performance of using only LongAlpaca-12k to train the LoRA adapter, embedding, and normalization layers? (See the sketch at the end of this issue.)
I've drafted a table to summarize my understanding of the training configurations mentioned in both the README and the paper. Could you please confirm whether it is correct?

| Model | RedPajama (unsupervised) | LongAlpaca-12k (supervised) |
| --- | --- | --- |
| Fully fine-tuned (README) | ✓ | ✗ |
| LoRA+ (README) | ✓ | ✗ |
| Models for LongBench benchmark (paper) | ✓ | ✓ |
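For reference, here is a minimal sketch of the setup I mean by "LoRA adapter, embedding, and normalization layers" in question 3. It assumes the Hugging Face `transformers` and `peft` libraries and Llama-2-7b as the base model; the LoRA hyperparameters are placeholders for illustration, not the values from the paper:

```python
# Hypothetical sketch (not the repo's exact training script) of the LoRA+
# trainable-parameter set: LoRA adapters on the attention projections, plus
# fully trainable embedding and normalization layers.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed base model for illustration
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=8,                # placeholder rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Beyond the standard LoRA adapters, also unfreeze the embedding and
# normalization weights (the "embeds and norm" in my question above).
for name, param in model.named_parameters():
    if "embed" in name or "norm" in name:
        param.requires_grad_(True)

model.print_trainable_parameters()
```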