r-three/t-few

How is l_ff created?


Firstly, thank you for the amazing work! I had a question about the implementation of $l_{ff}$ in the (IA)³ method:

The config file for (IA)³ sets lora_layers to "k|v|wi_1.*":

"lora_layers": "k|v|wi_1.*",

However, when this string is used to find the model layers to modify (code snippet below), the keys and values in the self-attention modules are modified, but all of the FF layers (i.e., layers of the form encoder.block.x.layer.x.DenseReluDense.wi) are skipped, so the vector $l_{ff}$ is never created in the model ($l_k$ and $l_v$ are created as expected).

t-few/src/models/lora.py, lines 64 to 72 at commit 4e581fa:

if re.fullmatch(config.lora_layers, c_name):
    assert isinstance(
        layer, nn.Linear
    ), f"LoRA can only be applied to torch.nn.Linear, but {layer} is {type(layer)}."
    setattr(
        module,
        c_name,
        LoRALinear(layer, config.lora_rank, config.lora_scaling_rank, config.lora_init_scale),
    )
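
To make the matching behavior concrete, here is a minimal check against the child-module names T5-small uses for its attention and FFN sublayers (a sketch, not code from the repo):

import re

lora_layers = "k|v|wi_1.*"  # value from the (IA)³ config

# Child-module names as they appear in T5-small
for c_name in ["q", "k", "v", "o", "wi", "wo"]:
    hit = re.fullmatch(lora_layers, c_name) is not None
    print(f"{c_name}: {'modified' if hit else 'skipped'}")

# Only "k" and "v" match; "wi" is skipped, so no l_ff vector is created.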

I was thus wondering whether the lora_layers parameter should instead be "k|v|wi.*". Or am I missing something, and does the existing config file somehow also trigger the creation of $l_{ff}$, in addition to $l_k$ and $l_v$?
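
For reference, here is how the two patterns compare on illustrative child names ("wi" as in plain T5, "wi_0"/"wi_1" as in the gated FFN variant):

import re

names = ["k", "v", "wi", "wi_0", "wi_1"]

for pattern in ("k|v|wi_1.*", "k|v|wi.*"):
    hits = [n for n in names if re.fullmatch(pattern, n)]
    print(f"{pattern!r} -> {hits}")

# 'k|v|wi_1.*' -> ['k', 'v', 'wi_1']
# 'k|v|wi.*'   -> ['k', 'v', 'wi', 'wi_0', 'wi_1']

Note that on a gated FFN, "k|v|wi.*" would also match wi_0. Presumably wi_1 alone suffices there, since scaling the output of the linear branch elementwise is equivalent to scaling the whole gated activation.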

Thank you!

Update: I was debugging with T5-small and didn't realize that the FFN module in the T0 model has a wi_1 layer in it, so the existing pattern "k|v|wi_1.*" does match T0's FF layer and $l_{ff}$ is created as expected.
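
For anyone hitting the same confusion, a quick way to see the difference is to list the FFN sublayer names of a plain T5 checkpoint next to a v1.1-style one (a sketch assuming the Hugging Face transformers library; google/t5-v1_1-small stands in for T0 here, since T0 was fine-tuned from the LM-adapted T5 v1.1 and shares its gated FFN):

from transformers import T5ForConditionalGeneration

for ckpt in ("t5-small", "google/t5-v1_1-small"):
    model = T5ForConditionalGeneration.from_pretrained(ckpt)
    # Collect the wi* child names under each FFN block
    wi_names = sorted({
        name.rsplit(".", 1)[-1]
        for name, _ in model.named_modules()
        if "DenseReluDense" in name and name.rsplit(".", 1)[-1].startswith("wi")
    })
    print(f"{ckpt}: {wi_names}")

# t5-small: ['wi']
# google/t5-v1_1-small: ['wi_0', 'wi_1']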