VLEProcessor.from_pretrained can map tokenized text to the corresponding token IDs. How do I convert those IDs back into text?
the-nine-nation opened this issue · 2 comments
the-nine-nation commented
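I attached a decoder after the model, followed by a fully connected layer that predicts the output text, but I don't understand how to turn its output into the corresponding text.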
GoGoJoestar commented
You can use VLEProcessor.tokenizer. It works the same way as any other transformers tokenizer, for example:
vle_processor = VLEProcessor.from_pretrained(model_name)
print(vle_processor.tokenizer.encode('A nice day!'))
# [1, 336, 1085, 406, 300, 2]
print(vle_processor.tokenizer.decode([1, 336, 1085, 406, 300, 2]))
# '[CLS] A nice day![SEP]'
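For the fully connected layer case described in the question, one possible approach is to take the highest-scoring vocabulary index at each position (greedy decoding) and pass the resulting IDs to the tokenizer's decode method. The sketch below is only an illustration under assumptions: `greedy_ids`, `logits_to_text`, and `ToyTokenizer` are hypothetical names that do not come from the VLE code, and the toy tokenizer stands in for `vle_processor.tokenizer` so the example runs without downloading a model.

```python
from typing import List, Sequence


def greedy_ids(logits: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-scoring vocabulary index at each sequence position."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def logits_to_text(logits: Sequence[Sequence[float]], tokenizer) -> str:
    """Turn per-position scores into text via any object exposing decode()."""
    return tokenizer.decode(greedy_ids(logits), skip_special_tokens=True)


class ToyTokenizer:
    """Hypothetical stand-in for vle_processor.tokenizer (assumption, not the real vocab)."""

    vocab = ["[PAD]", "[CLS]", "[SEP]", "A", "nice", "day"]
    special = {"[PAD]", "[CLS]", "[SEP]"}

    def decode(self, token_ids: List[int], skip_special_tokens: bool = False) -> str:
        tokens = [self.vocab[i] for i in token_ids]
        if skip_special_tokens:
            # Drop markers such as [CLS] and [SEP] from the output text.
            tokens = [t for t in tokens if t not in self.special]
        return " ".join(tokens)


# Scores of shape (sequence_length=5, vocab_size=6), as a plain nested list.
example_logits = [
    [0.0, 9.0, 0.1, 0.2, 0.3, 0.1],  # position 0 favors [CLS]
    [0.1, 0.0, 0.2, 8.0, 0.3, 0.1],  # position 1 favors "A"
    [0.1, 0.0, 0.2, 0.3, 7.0, 0.1],  # position 2 favors "nice"
    [0.1, 0.0, 0.2, 0.3, 0.1, 6.0],  # position 3 favors "day"
    [0.1, 0.0, 5.0, 0.3, 0.1, 0.2],  # position 4 favors [SEP]
]

decoded_text = logits_to_text(example_logits, ToyTokenizer())
print(decoded_text)
```

With a real model, the logits would typically be a PyTorch tensor, so something like `logits.argmax(dim=-1).tolist()` would replace `greedy_ids`, and `vle_processor.tokenizer` would replace the toy tokenizer. This assumes the fully connected layer's output dimension matches the tokenizer's vocabulary size.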
the-nine-nation commented
Thank you very much!