arcee-ai/mergekit

Support for xlm-roberta

umiron opened this issue · 2 comments

Is it possible to add support for xlm-roberta? It's the same architecture as roberta, except for a larger vocabulary since it is multi-lingual.

Hey @umiron
I believe there isn't anything within mergekit that is a barrier to merges between xlm-roberta models, since the architecture handling is oblivious to tensor sizes (the larger vocabulary only changes tensor shapes, not tensor names).

If the existing definition really matches the xlm-roberta weight names and architecture, add the architecture name (XLMRobertaForMaskedLM) here locally and test whether it works
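As a quick way to check which architecture name needs to be registered, you can read the `architectures` field from the checkpoint's `config.json`. The sketch below is a hypothetical illustration (not mergekit's actual code); `read_architecture_names` is a made-up helper, and the config contents are simulated locally:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical helper: the architecture name mergekit must recognize is
# the one listed under "architectures" in the checkpoint's config.json.
def read_architecture_names(checkpoint_dir: str) -> list:
    config = json.loads((Path(checkpoint_dir) / "config.json").read_text())
    return config.get("architectures", [])

# Simulate an xlm-roberta checkpoint directory with a minimal config.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text(
        json.dumps({"model_type": "xlm-roberta",
                    "architectures": ["XLMRobertaModel"]})
    )
    print(read_architecture_names(tmp))  # ['XLMRobertaModel']
```

Whichever name this prints for your checkpoint (e.g. `XLMRobertaModel` vs. `XLMRobertaForMaskedLM`) is the one to add locally.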

Thanks, @metric-space. This works well (except in my case the change was to this file, since the relevant architecture was XLMRobertaModel rather than XLMRobertaForMaskedLM).