Didn't find several files
youngornever opened this issue · 1 comments
When I run membertret, several files cannot be found.
1、bert_config.json, vocab.txt and pytorch_model.bin under /home/zhengchujie/bert_torch/chinese_wwm_pytorch/
As a result, I downloaded an alternative from https://github.com/ymcui/Chinese-BERT-wwm.
However, I get several warnings:
INFO - pytorch_transformers.tokenization_utils Model name 'KdConv/benchmark/_bert_chinese_wwm_pytorch/vocab.txt' not found in model shortcut name list
INFO - pytorch_transformers.tokenization_utils - Didn't find file /KdConv/benchmark/_bert_chinese_wwm_pytorch/added_tokens.json&special_tokens_map.json&tokenizer_config.json. We won't load it.
2、FileNotFoundError: [Errno 2] No such file or directory: '../data/resources/chinese_stop_words.txt'
As a result, I cloned https://github.com/goto456/stopwords and renamed cn_stopwords.txt to chinese_stop_words.txt.
Please provide the corresponding URLs for those files.
Thanks
Q1: I am not sure which version of transformers you are using. Our code uses an early version, pytorch_pretrained_bert. Also, what are your args settings? Specifically, what are the values of --bert_config_file, --vocab_file and --init_checkpoint?
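Before passing these three arguments, it may help to confirm that the files actually exist in the directory you point to. The sketch below is only an illustration: the helper name missing_bert_files is hypothetical, and the three file names are simply the ones quoted in the issue above.

```python
import os

# Assumed file names, taken from the paths quoted in the issue.
REQUIRED_FILES = ("bert_config.json", "vocab.txt", "pytorch_model.bin")


def missing_bert_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in model_dir.

    Hypothetical helper for illustration only.
    """
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

Calling missing_bert_files on your downloaded chinese_wwm_pytorch directory returns an empty list when all three files are in place; any name it returns is a file the arguments cannot point to yet.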
Q2: It has been a long time since our experiments were conducted. I am sorry, but I am not sure which stop words were used in our experiments. You can use any publicly available stop word list you like.
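If you go with a public list such as the cn_stopwords.txt mentioned above, loading it might look like the sketch below. It assumes the file is UTF-8 encoded with one stop word per line. The helper name load_stop_words is hypothetical, and this is not necessarily how the original experiments read the file.

```python
import io


def load_stop_words(path, encoding="utf-8"):
    """Read one stop word per line into a set, skipping blank lines.

    Hypothetical helper; assumes a plain-text, one-word-per-line file.
    """
    with io.open(path, encoding=encoding) as f:
        return {line.strip() for line in f if line.strip()}
```

A set is used here so that duplicate entries collapse and membership checks stay cheap when filtering tokens.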