cooelf/SemBERT

AllenNLP SRL predictions are inconsistent

deanyan7 opened this issue · 7 comments

Hello. When I run SRL prediction directly on the raw data, the results do not match the test samples you provided.

For example: The new rights are nice enough

The result given in the sample is {"verbs": [{"verb": "are", "description": "[ARG1: The new rights] [V: are] [ARG2: nice enough]", "tags": ["B-ARG1", "I-ARG1", "I-ARG1", "B-V", "B-ARG2", "I-ARG2"]}], "words": ["The", "new", "rights", "are", "nice", "enough"]}

whereas AllenNLP predicts [{'verbs': [], 'words': ['The', 'new', 'rights', 'are', 'nice', 'enough']}]

Versions: allennlp 0.8.1 allennlp-models=1.0.0
Also tested with allennlp 1.0.0 allennlp-models=1.0.0

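This prediction suggests the model did not run effectively. As I recall, when the verb is not correctly identified, the output comes back entirely empty.
Was this produced with the provided data processing (online or offline version)? Please share the detailed steps so I can reproduce it.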

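Both versions give the same result, and this is only one of the samples. Copulas and auxiliary verbs such as do, is, was, and were are not predicted correctly, while the results for other verbs seem quite good. I will post the exact steps shortly so the problem can be reproduced.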

AllenNLP identifies verbs with spaCy. Can you confirm whether you are using the earlier ELMo model or the BERT model?

The BERT-based AllenNLP demo indeed fails to recognize such verbs as well (https://demo.allennlp.org/semantic-role-labeling).

Related Issue: allenai/allennlp#4146

PyTorch version: 1.5.0
I used the srl-model-2018.05.25.tar.gz you provided with allennlp==0.8.1 spacy==2.2.4, and also the bert-base-srl-2020.03.24.tar.gz provided by allennlp-demo with allennlp==1.0.0 allennlp-models==1.0.0. Both show this problem.

To reproduce:
allennlp==0.8.1 spacy==2.2.4
from allennlp.models import load_archive
from allennlp.predictors import Predictor
archive = load_archive("/model/srl-model-2018.05.25.tar.gz",cuda_device=0)
predictor = Predictor.from_archive(archive)
predictor.predict(sentence)

Or:
allennlp==1.0.0 allennlp-models==1.0.0
from allennlp.models import load_archive
from allennlp.predictors import Predictor
archive = load_archive("model/bert-base-srl-2020.03.24.tar.gz",cuda_device=0)
predictor = Predictor.from_archive(archive)
predictor.predict(sentence)

sentence = "yeah i know and i did that all through college and it worked too"
result = {'verbs': [{'verb': 'know', 'description': 'yeah [ARG0: i] [V: know] and i did that all through college and it worked too', 'tags': ['O', 'B-ARG0', 'B-V', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']},
{'verb': 'worked', 'description': 'yeah i know and i did that all through college and [ARG1: it] [V: worked] [ARGM-ADV: too]', 'tags': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-ARG1', 'B-V', 'B-ARGM-ADV']}], 'words': ['yeah', 'i', 'know', 'and', 'i', 'did', 'that', 'all', 'through', 'college', 'and', 'it', 'worked', 'too']}

Sample result:
{"verbs": [{"verb": "know", "description": "yeah [ARG0: i] [V: know] and i did that all through college and it worked too", "tags": ["O", "B-ARG0", "B-V", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"]},
{"verb": "did", "description": "yeah i know and [ARG0: i] [V: did] [ARG1: that] [ARGM-TMP: all through college] and it worked too", "tags": ["O", "O", "O", "O", "B-ARG0", "B-V", "B-ARG1", "B-ARGM-TMP", "I-ARGM-TMP", "I-ARGM-TMP", "O", "O", "O", "O"]},
{"verb": "worked", "description": "yeah i know and i did that all through college and [ARG0: it] [V: worked] [ARGM-ADV: too]", "tags": ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "B-ARG0", "B-V", "B-ARGM-ADV"]}], "words": ["yeah", "i", "know", "and", "i", "did", "that", "all", "through", "college", "and", "it", "worked", "too"]}
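To see at a glance which predicates a prediction dropped, the two outputs above can be compared by their "verb" fields. The following is a minimal sketch; the helper name `missing_verbs` is hypothetical, and the dictionaries are abbreviated versions of the results in this thread (only the "verb" key is kept).

```python
def missing_verbs(predicted, reference):
    """Return verbs found in the reference SRL output but not in the prediction."""
    predicted_verbs = {frame["verb"] for frame in predicted["verbs"]}
    return [
        frame["verb"]
        for frame in reference["verbs"]
        if frame["verb"] not in predicted_verbs
    ]


# Abbreviated from the outputs in this thread: only the "verb" field is kept.
predicted = {"verbs": [{"verb": "know"}, {"verb": "worked"}]}
reference = {"verbs": [{"verb": "know"}, {"verb": "did"}, {"verb": "worked"}]}

print(missing_verbs(predicted, reference))
```

For this sentence the only missing predicate is the auxiliary "did".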

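I tried several spaCy versions and they differ in which tokens receive the verb tag, which may be what breaks predicate identification in the SRL model. You can switch to an earlier spaCy version (e.g. 2.0.18, then reinstall the model with python -m spacy download en_core_web_sm).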

See allenai/allennlp#3418

Test example:

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("The new rights are nice enough")
print([token.text for token in doc])
print([token.pos_ for token in doc])

Here is the output comparison:

spacy 2.0.18
['yeah', 'i', 'know', 'and', 'i', 'did', 'that', 'all', 'through', 'college', 'and', 'it', 'worked', 'too']
['INTJ', 'PRON', 'VERB', 'CCONJ', 'PRON', 'VERB', 'DET', 'DET', 'ADP', 'NOUN', 'CCONJ', 'PRON', 'VERB', 'ADV']

['The', 'new', 'rights', 'are', 'nice', 'enough']
['DET', 'ADJ', 'NOUN', 'VERB', 'ADJ', 'ADV']

spacy 2.2.4
['The', 'new', 'rights', 'are', 'nice', 'enough']
['DET', 'ADJ', 'NOUN', 'AUX', 'ADJ', 'ADV']

['yeah', 'i', 'know', 'and', 'i', 'did', 'that', 'all', 'through', 'college', 'and', 'it', 'worked', 'too']
['INTJ', 'PRON', 'VERB', 'CCONJ', 'PRON', 'AUX', 'SCONJ', 'DET', 'ADP', 'NOUN', 'CCONJ', 'PRON', 'VERB', 'ADV']
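The effect of the tag change can be illustrated without installing either spaCy version. The sketch below assumes that predicates are selected by checking whether a token's POS tag is VERB, which is consistent with the behaviour described in this thread. The function `predicate_indices` is a hypothetical stand-in, not AllenNLP's actual code, and the tag lists are copied from the comparison above.

```python
def predicate_indices(pos_tags, predicate_tags=("VERB",)):
    """Return the positions of tokens whose POS tag marks them as predicates."""
    return [i for i, tag in enumerate(pos_tags) if tag in predicate_tags]


# POS tags for "The new rights are nice enough", as reported in this thread.
tags_spacy_2_0_18 = ["DET", "ADJ", "NOUN", "VERB", "ADJ", "ADV"]
tags_spacy_2_2_4 = ["DET", "ADJ", "NOUN", "AUX", "ADJ", "ADV"]

# With spaCy 2.0.18, "are" (index 3) is a candidate predicate.
print(predicate_indices(tags_spacy_2_0_18))

# With spaCy 2.2.4, "are" is tagged AUX, so no predicate is found
# and the SRL output would contain no verb frames.
print(predicate_indices(tags_spacy_2_2_4))

# Accepting AUX as well recovers the predicate (an illustrative alternative
# to downgrading spaCy, not something tested in this thread).
print(predicate_indices(tags_spacy_2_2_4, predicate_tags=("VERB", "AUX")))
```

Under this assumption, a sentence whose only verb is tagged AUX yields an empty "verbs" list, matching the empty prediction reported at the top of the issue.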

Thank you very much for your help. The issue is resolved.