Support for xlm-roberta
umiron opened this issue · 2 comments
umiron commented
Is it possible to add support for xlm-roberta? It's the same architecture as roberta, except for a larger vocabulary since it is multi-lingual.
metric-space commented
Hey @umiron
I believe there isn't anything within mergekit that poses a barrier to merges between xlm-roberta models, as the architecture format is oblivious to tensor sizes.
If this really matches up with the xlm-roberta weight names and architecture, add the architecture name (`XLMRobertaForMaskedLM`) here locally and test to see if it works.
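For illustration, the change would look something like the sketch below. This assumes mergekit declares supported architectures in a JSON-style definition file keyed by the Hugging Face `architectures` string; the exact file path and field names in your mergekit version may differ, so treat this as a hypothetical example rather than the actual diff.

```diff
 {
   "model_type": "roberta",
   "architectures": [
-    "RobertaForMaskedLM"
+    "RobertaForMaskedLM",
+    "XLMRobertaForMaskedLM"
   ],
```

Since xlm-roberta reuses the roberta tensor layout (only the embedding vocabulary dimension grows), registering the extra name should be enough for mergekit to treat the checkpoints as mergeable.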
umiron commented
Thanks, @metric-space. This works well (except in my case the change was to this file, since the relevant architecture was `XLMRobertaModel` rather than `XLMRobertaForMaskedLM`).