thunlp/OpenDelta

XGLM: How to apply OpenDelta to a new model?

un-certainty opened this issue · 2 comments

Hi,

Thanks for providing this useful library for delta tuning!

How can we apply OpenDelta to a new model, such as facebook/xglm-564M? Its architecture looks like:

root
├── model (XGLMModel)
│   ├── embed_tokens (Embedding) weight:[256008, 1024]
│   ├── embed_positions (XGLMSinusoidalPositionalEmbedding) weights:[2050, 1024]
│   ├── layers (ModuleList)
│   │   └── 0-23(XGLMDecoderLayer)
│   │       ├── self_attn (XGLMAttention)
│   │       │   └── k_proj,v_proj,q_proj,out_proj(Linear) weight:[1024, 1024] bias:[1024]
│   │       ├── self_attn_layer_norm,final_layer_norm(LayerNorm) weight:[1024] bias:[1024]
│   │       ├── fc1 (Linear) weight:[4096, 1024] bias:[4096]
│   │       └── fc2 (Linear) weight:[1024, 4096] bias:[1024]
│   └── layer_norm (LayerNorm) weight:[1024] bias:[1024]
└── lm_head (Linear) weight:[256008, 1024]

To reproduce

from opendelta import LoraModel
from transformers import XGLMForCausalLM

backbone_model = XGLMForCausalLM.from_pretrained("facebook/xglm-564M")
delta_model = LoraModel(backbone_model)

You should read the documentation first. Specify the submodules you want to modify by adding the modified_modules argument:

delta_model = LoraModel(backbone_model, modified_modules=["q_proj", "v_proj"])
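
To see which submodules those two short names would pick up, here is a minimal sketch in plain Python. It rebuilds the dotted module names from the tree in the original post and filters them by suffix. The helpers `xglm_linear_names` and `match_by_suffix` are hypothetical (not part of OpenDelta), and the suffix rule is an assumption about how short names in modified_modules are resolved; the OpenDelta documentation has the exact matching rules.

```python
# Hypothetical helpers, not part of OpenDelta: they only illustrate
# suffix-based name matching on the XGLM module tree shown above.

def xglm_linear_names(num_layers=24):
    """Dotted names of the Linear submodules listed in the printed tree."""
    names = []
    for i in range(num_layers):
        prefix = f"model.layers.{i}"
        for proj in ("k_proj", "v_proj", "q_proj", "out_proj"):
            names.append(f"{prefix}.self_attn.{proj}")
        names.append(f"{prefix}.fc1")
        names.append(f"{prefix}.fc2")
    names.append("lm_head")
    return names


def match_by_suffix(names, targets):
    """Keep names whose trailing dotted component(s) equal one of the targets."""
    return [
        n for n in names
        if any(n == t or n.endswith("." + t) for t in targets)
    ]


selected = match_by_suffix(xglm_linear_names(), ["q_proj", "v_proj"])
print(len(selected))   # two projections in each of the 24 decoder layers
print(selected[:2])
```

Under that assumption, "q_proj" and "v_proj" would select the query and value projections in every decoder layer, while fc1, fc2 and lm_head stay untouched.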

How can I update only some of the layers?
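
One possible approach, sketched under assumptions: build the full dotted names for just the layers of interest and pass that list as modified_modules. The helper `names_for_layers` is hypothetical, the `model.layers.<i>.self_attn.<proj>` pattern is read off the tree in the original post, and whether full dotted names are accepted should be checked against the OpenDelta documentation.

```python
# Hypothetical helper, not part of OpenDelta: builds explicit module names
# for a chosen subset of decoder layers.

def names_for_layers(layer_ids, projections=("q_proj", "v_proj")):
    """Full dotted names of the given projections in the given layers."""
    return [
        f"model.layers.{i}.self_attn.{p}"
        for i in layer_ids
        for p in projections
    ]


# Example: only the last four of the 24 decoder layers.
modified_modules = names_for_layers(range(20, 24))
print(modified_modules[0])
print(len(modified_modules))

# Assumed usage (not executed here; requires opendelta and the backbone model):
# delta_model = LoraModel(backbone_model, modified_modules=modified_modules)
```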