How is l_ff created?
Firstly, thank you for the amazing work! I had a question about the implementation of the `l_ff` vector in (IA)3.
The config file for (IA)3 lists `lora_layers` as `"k|v|wi_1.*"` (line 6 in 4e581fa).
However, when using this string to find model layers to modify (lines 64 to 72 in 4e581fa; a sketch of that loop is below), it seems that while the keys and values in the self-attention modules are modified, all the FF layers (i.e. layers of the form `encoder.block.x.layer.x.DenseReluDense.wi`) are skipped, and thus the vector `l_ff` is never created.
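For concreteness, here is a minimal sketch of how a regex like this is typically matched against child-layer names to decide where to insert the (IA)3 vectors. `IA3Linear` and `modify_with_ia3` are hypothetical names for illustration only; the actual logic in lines 64 to 72 may differ in detail.

```python
import re

import torch
import torch.nn as nn


class IA3Linear(nn.Module):
    """Wraps a Linear layer with a learned rescaling vector that is
    multiplied into the layer output ((IA)^3's l_k / l_v / l_ff)."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.linear = linear
        # One scale per output feature, initialised to 1 so the wrapped
        # model starts out identical to the pretrained one.
        self.scale = nn.Parameter(torch.ones(linear.out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x) * self.scale


def modify_with_ia3(model: nn.Module, lora_layers: str) -> nn.Module:
    # Snapshot the module list first so the tree can be edited safely.
    for module in list(model.modules()):
        # named_children() yields *local* names ("k", "v", "wi_1", ...),
        # which is what the regex is matched against here.
        for child_name, child in list(module.named_children()):
            if isinstance(child, nn.Linear) and re.fullmatch(lora_layers, child_name):
                setattr(module, child_name, IA3Linear(child))
    return model


# e.g. model = modify_with_ia3(model, lora_layers="k|v|wi_1.*")
```

Note that this sketch matches the regex against each child's local name, not the full dotted path, which is the only reading under which the `wi` vs. `wi_1` distinction below matters.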
I was thus wondering: should the param `lora_layers` instead be `"k|v|wi.*"`? Or am I missing something, and the existing config file somehow also triggers the creation of `l_ff`?
Thank you!
Update: I was debugging with T5-small, and didn't realize that the FFN module in the T0 model has a `wi_1` layer in it (T5-small's `DenseReluDense` only has a single `wi`, which is why the pattern never matched in my setup). A quick check is below.
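For anyone else who runs into this, checking the pattern against the child-layer names of the two architectures makes the difference obvious. The name lists below are my assumption of the relevant layers: T5-small's `DenseReluDense` has a single `wi`, while T0's gated FFN has `wi_0` and `wi_1`.

```python
import re

pattern = "k|v|wi_1.*"

# Assumed child-layer names inside the attention and FFN blocks:
# T5-small uses a single-projection FFN; T0 uses a gated FFN.
t5_small = ["q", "k", "v", "o", "wi", "wo"]
t0 = ["q", "k", "v", "o", "wi_0", "wi_1", "wo"]

for tag, names in [("T5-small", t5_small), ("T0", t0)]:
    matched = [n for n in names if re.fullmatch(pattern, n)]
    print(f"{tag}: {matched}")

# T5-small: ['k', 'v']           -> no FFN layer matches, so no l_ff
# T0:       ['k', 'v', 'wi_1']   -> l_ff is created on wi_1
```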