question about the arguments within the get_pretrained_model() function
Opened this issue · 0 comments
hongruhu commented
Hi,
When I looked at some examples of getting the pretrained models, I saw:
parameters, forward_fn, tokenizer, config = get_pretrained_model(
model_name="500M_human_ref",
embeddings_layers_to_save=(20,),
max_positions=32,
)
parameters, forward_fn, tokenizer, config = get_pretrained_model(
model_name="500M_1000G",
# Get embedding at layers 5 and 20
embeddings_layers_to_save=(5, 20,),
# Get attention map number 4 at layer 1 and attention map number 14
# at layer 12
attention_maps_to_save=((1,4), (12, 14)),
max_positions=128,
)
Here it seems that different pretrained models use different configurations? I was wondering if you could add a more detailed clarification on how to choose `embeddings_layers_to_save` and `max_positions`. I also saw some issues mentioning that we might need to set `max_positions` to 1000? I am just a bit confused, and it would be best if the authors could provide the suggested configuration for each pretrained model somewhere in the tutorial or README files.
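For context, my current understanding (which may well be wrong, please correct me) is that `max_positions` counts tokens rather than nucleotides: the tokenizer encodes the DNA sequence as 6-mers plus a `<CLS>` token, so the number of positions needed for a sequence of L nucleotides would be roughly:

```python
import math

# Hedged sketch of my assumption: one token per 6-mer, plus one <CLS> token.
# (The exact tokenization rules, e.g. handling of leftover nucleotides or
# of "N" characters, are what I would like the authors to confirm.)
def required_max_positions(seq_len_nt: int, kmer: int = 6) -> int:
    """Tokens needed for a sequence of seq_len_nt nucleotides:
    one token per k-mer (rounded up), plus the <CLS> token."""
    return math.ceil(seq_len_nt / kmer) + 1

# Under this assumption, a 5994-nt sequence needs 1000 positions
# (999 six-mers + <CLS>), which would explain the max_positions=1000
# mentioned in other issues.
print(required_max_positions(5994))  # -> 1000
```

If this is roughly right, then `max_positions=32` in the first example would only cover sequences up to about 186 nt, which is why per-model guidance in the README would be very helpful.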