question about the arguments within the get_pretrained_model() function
Opened this issue · 0 comments
hongruhu commented
Hi,
When I looked at some examples of getting the pretrained models, I saw:
parameters, forward_fn, tokenizer, config = get_pretrained_model(
model_name="500M_human_ref",
embeddings_layers_to_save=(20,),
max_positions=32,
)
parameters, forward_fn, tokenizer, config = get_pretrained_model(
model_name="500M_1000G",
# Get embedding at layers 5 and 20
embeddings_layers_to_save=(5, 20,),
# Get attention map number 4 at layer 1 and attention map number 14
# at layer 12
attention_maps_to_save=((1,4), (12, 14)),
max_positions=128,
)
Here it seems that different pretrained models use different configurations? I was wondering if you could add a more detailed clarification on how to choose `embeddings_layers_to_save` and `max_positions`. I also saw some issues mentioning that we might need to set `max_positions` to 1000? I am just a bit confused, and it would be best if the authors could provide the suggested configuration for each pretrained model somewhere in the tutorial or README files.
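For context, my current understanding (which may well be wrong, please correct me) is that `max_positions` counts tokens rather than nucleotides: the tokenizer encodes the DNA sequence as 6-mers plus a `<CLS>` token, so the number of positions needed for a sequence of L nucleotides would be roughly:

```python
import math

# Hedged sketch of my assumption: one token per 6-mer, plus one <CLS> token.
# (The exact tokenization rules, e.g. handling of leftover nucleotides or
# of "N" characters, are what I would like the authors to confirm.)
def required_max_positions(seq_len_nt: int, kmer: int = 6) -> int:
    """Tokens needed for a sequence of seq_len_nt nucleotides:
    one token per k-mer (rounded up), plus the <CLS> token."""
    return math.ceil(seq_len_nt / kmer) + 1

# Under this assumption, a 5994-nt sequence needs 1000 positions
# (999 six-mers + <CLS>), which would explain the max_positions=1000
# mentioned in other issues.
print(required_max_positions(5994))  # -> 1000
```

If this is roughly right, then `max_positions=32` in the first example would only cover sequences up to about 186 nt, which is why per-model guidance in the README would be very helpful.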