Problem when calling "export_bert.py"
freesunshine0316 opened this issue · 10 comments
Hi,
I ran into several problems when executing this script. I'd appreciate it if anyone could help me out.
First, I saw that "PTB_TOKEN_UNESCAPE" does not exist in "parse_nk.py", so I just commented that out.
Second, I get the following error:
Traceback (most recent call last):
File "export/export_bert.py", line 406, in <module>
the_inp_tokens, the_inp_mask, the_out_chart, the_out_tags = make_network()
File "export/export_bert.py", line 326, in make_network
ftag = make_ftag(word_out)
File "export/export_bert.py", line 286, in make_ftag
tf.constant(sd['f_tag.0.weight'].numpy().transpose()),
KeyError: 'f_tag.0.weight'
Looking into the code, it seems that my model has no named parameters with the "f_tag" prefix, though it does have parameters with the "f_label" prefix.
Does this mean that my model requires gold POS tags as input as well?
I did not set "--use-tags" or "--predict-tags" for training.
The export_bert.py script was written with the expectation that you enable --predict-tags during training. The f_tag weights that it can't find are from the POS tag prediction head. It looks like you'll need to adjust either the training or the export script to get things to run.
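One way to confirm which heads a checkpoint contains before running the export is to list the top-level prefixes of its parameter names. This is a minimal sketch, assuming the state dict is available as a plain mapping from parameter names to tensors (like the `sd` in the traceback above). The helper name `head_prefixes` and the example keys are hypothetical.

```python
def head_prefixes(state_dict):
    """Return the sorted top-level prefixes of the parameter names."""
    return sorted({name.split(".", 1)[0] for name in state_dict})


# Stand-in for a real state dict; the values would normally be tensors.
sd = {"f_label.0.weight": None, "f_label.0.bias": None}
print(head_prefixes(sd))  # ['f_label']
```

If `f_tag` is missing from the result, the checkpoint was trained without the POS tag prediction head.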
Hi Kitaev,
Thank you for your reply.
The README says that --predict-tags is just used for auxiliary losses. Does it also change what the model requires as input? Specifically, do I need to provide POS tags as additional input to my model if I didn't use --predict-tags for model training?
I just need a model that works the same way as the released models (e.g. "benepar_zh").
The model inputs don't change if you train with --predict-tags. The training data does, however, need to contain POS tags -- they are used for supervision only, and not as an input.
The way to match a released model is to enable --predict-tags during training.
You can also modify the export script to not do anything related to POS tags, if you don't want to re-train the model.
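A rough sketch of what such a modification might look like: build the tag head only when the checkpoint actually has weights for it. `make_ftag` and `sd` are the names that appear in the traceback; the wrapper `maybe_make_ftag` and the stand-in values are hypothetical and do not reflect the actual contents of export_bert.py.

```python
def maybe_make_ftag(sd, word_out, make_ftag):
    """Build the POS tag head only if the checkpoint has weights for it.

    Returns None when no f_tag.* parameters are present.
    """
    if not any(name.startswith("f_tag.") for name in sd):
        return None
    return make_ftag(word_out)


# Stand-ins for illustration only; the real make_ftag builds graph ops.
sd_without_tags = {"f_label.0.weight": None}
print(maybe_make_ftag(sd_without_tags, "word_out", lambda x: ("ftag", x)))  # None
```

Anything downstream that consumes the tag output (for example `the_out_tags` in the traceback) would need the same treatment, and whether benepar can load a model exported without a tag output is not something this thread establishes.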
Hi Kitaev,
Does that mean the parser performs POS tagging as token-level label prediction even without the --predict-tags switch?
Thanks.
That's right, that option enables token-level POS tag prediction.
Hi @nikitakit
Thanks for your reply. My previous question wasn't clear.
I've updated it; could you please verify it? Thanks. That should be my last one.
Thanks again!
I retrained my model with --predict-tags and successfully generated my meta.json, model.pb and vocab.txt.
After packing them into a zip file, loading the zip file with benepar reports the following error:
Traceback (most recent call last):
File "eval_ctb.py", line 10, in <module>
parser = benepar.Parser("/data2/lfsong/exp.parsing/servc.chinese/cn_roberta_aux.zip")
File "/data/home/lfsong/anaconda3/lib/python3.7/site-packages/benepar/nltk_plugin.py", line 36, in __init__
super(Parser, self).__init__(name, batch_size)
File "/data/home/lfsong/anaconda3/lib/python3.7/site-packages/benepar/base_parser.py", line 199, in __init__
graph_def = tf.GraphDef.FromString(model)
google.protobuf.message.DecodeError: Error parsing message
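Before digging into the protobuf itself, it can help to check what actually ended up in the archive. This is a minimal sketch using only the standard library; it makes no assumption about the layout benepar expects inside the zip, and the function name `list_archive` is hypothetical.

```python
import io
import zipfile


def list_archive(path_or_file):
    """Return (member name, uncompressed size) pairs for a zip archive."""
    with zipfile.ZipFile(path_or_file) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


# Demo on an in-memory archive; pass the path to the real zip file instead.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("model.pb", b"\x00" * 4)
    zf.writestr("meta.json", "{}")
buf.seek(0)
for name, size in list_archive(buf):
    print(name, size)
```

An empty or unexpectedly small model.pb, or members nested under an unexpected directory, would be worth a closer look. Whether either applies to the error above is not established in this thread.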
Hi @nikitakit
Can you release an export.py script that does not assume ELMo?
The current scripts assume either ELMo or BERT.
Many thanks!
As of benepar v0.2.0a0, there is no more exporting to TensorFlow and you can just use PyTorch checkpoints directly. The original exporting code was written as a one-off because I didn't anticipate adding new models, or the explosion in pre-training approaches that we've seen over the past few years.