Mivg/SLED

Partial loading of AutoModelForSeq2SeqLM SLED models' checkpoint weights with AutoModel

Opened this issue · 1 comment

Mivg commented

When a model checkpoint of type AutoModelForSeq2SeqLM is loaded with AutoModel, the state dict is not loaded correctly, because some of the parameter names differ between the two.
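A minimal sketch of how such a name mismatch can be spotted, by comparing the key set of a checkpoint with the key set the target model expects. The helper `diff_state_dict_keys` and all parameter names below are hypothetical and chosen only for illustration; they are not taken from the SLED code.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) parameter names, each sorted.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names the checkpoint holds but the model does not use
    """
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical parameter names, for illustration only.
checkpoint = ["model.encoder.weight", "model.decoder.weight", "lm_head.weight"]
base_model = ["encoder.weight", "decoder.weight"]

missing, unexpected = diff_state_dict_keys(checkpoint, base_model)
print("missing:", missing)
print("unexpected:", unexpected)
```

With real PyTorch modules, `load_state_dict(..., strict=False)` returns an object exposing `missing_keys` and `unexpected_keys`, which gives the same kind of report without a custom helper.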

First, thanks for the nice paper and for making it available as a package.

I'm posting my question here because I think my problem may be related to this issue.

I'm following the Hugging Face model card for bart-large ('tau/bart-large-sled'), specifically the README section on conditional generation.

  • Model card link: https://huggingface.co/tau/bart-large-sled
    I ran into a problem while following the model card's instructions for conditional generation with BART.
    As the model card suggests, I changed 'SledModel' to 'SledModelForConditionalGeneration'.

The error message I got is below.
from sled import SledModelForConditionalGeneration

ImportError: cannot import name 'SledModelForConditionalGeneration' from 'sled'

Does the 'py-sled' package not support conditional generation yet?
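One way to narrow this down is to check whether the installed package actually exposes the class before importing it. The sketch below uses only the standard library; `has_export` is a hypothetical helper name, and the sketch makes no claim about which names any given release of `sled` exports.

```python
import importlib


def has_export(module_name, attr_name):
    """Return True if module_name imports cleanly and exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is not installed or failed to import.
        return False
    return hasattr(module, attr_name)


# Check the name from the traceback above against the installed package.
print(has_export("sled", "SledModelForConditionalGeneration"))
```

If this prints `False` while `has_export("sled", "SledModel")` prints `True`, the package is installed but does not export that class in the installed version. Comparing the installed version (`pip show py-sled`) with the one the model card was written for would be a reasonable next step.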