About the huggingface pretrained model
Closed this issue · 5 comments
Shelod commented
Hi, how can I use this Hugging Face pretrained model to produce chengyu embeddings? https://huggingface.co/visualjoyce/chengyubert_2stage_stage1_wwm_ext
Chinese-BERT-wwm only produces token-based embeddings.
Vimos commented
Shelod commented
Since pretraining may require huge computational resources, I'm trying to use the Hugging Face pretrained model directly to produce contextualized chengyu embeddings. For example, when "赵括的纸上谈兵使得赵国在长平之战中大败" is fed to this pretrained model, I only get separate embeddings for '纸', '上', '谈', and '兵', and cannot obtain a single embedding for the idiom '纸上谈兵'.
Vimos commented
In this case, you need to change the input from "赵括的纸上谈兵使得赵国在长平之战中大败"
to "赵括的[MASK]使得赵国在长平之战中大败"
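The substitution above can be sketched as a small helper that replaces the idiom span with a single `[MASK]` token before tokenization, so the model emits one contextualized vector at that position. This is a minimal sketch; the function name `mask_chengyu` is a hypothetical helper, not part of the repository, and the downstream step (tokenizing the masked sentence and reading the hidden state at the mask index) would follow the checkpoint's own loading code.

```python
def mask_chengyu(sentence: str, chengyu: str, mask_token: str = "[MASK]") -> str:
    """Replace the first occurrence of the idiom with a single mask token,
    so the model produces one embedding for the whole chengyu instead of
    one embedding per character. (Hypothetical helper for illustration.)"""
    if chengyu not in sentence:
        raise ValueError(f"idiom {chengyu!r} not found in sentence")
    return sentence.replace(chengyu, mask_token, 1)

masked = mask_chengyu("赵括的纸上谈兵使得赵国在长平之战中大败", "纸上谈兵")
print(masked)  # 赵括的[MASK]使得赵国在长平之战中大败
```

After masking, the sentence can be tokenized as usual; the embedding taken at the `[MASK]` position is then the contextualized representation of the idiom as a whole.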
Shelod commented
It works, thanks a lot!
Vimos commented
Glad to hear that!