Constituency CRF RoBERTa English not working properly
SirBernardPhilip opened this issue · 2 comments
SirBernardPhilip commented
I am trying to run the constituency parser with the pretrained RoBERTa English model, but it does not parse sentences correctly. The "con-crf-en" model works as expected, but the RoBERTa one fails on every sentence and always outputs a flat tree like this:
>>> parser.predict(['I', 'saw', 'Sarah', 'with', 'a', 'telescope', '.'], verbose=False, prob=True)[0].pretty_print()
             TOP
              |
              S
 _____________|__________________
 |   |    |    |    |     |     NP
 |   |    |    |    |     |     |
 _   _    _    _    _     _     _
 |   |    |    |    |     |     |
 I  saw Sarah with  a telescope .
Here's a notebook link that replicates the issue
yzhangcs commented
@SirBernardPhilip Hi, the previous weights were corrupted somehow. I've uploaded the re-trained model. Please check it out :)
SirBernardPhilip commented
Amazing, thank you! I just checked and the new model works for English. The same thing is also happening with the con-crf-xlmr model:
>>> from supar import Parser
>>> model_name = "con-crf-xlmr"
>>> parser = Parser.load(model_name)
>>> parser.predict(['Je', 'ai', 'vu', 'Sarah', 'avec', 'un', 'télescope', '.'], verbose=False, prob=True)[0].pretty_print()
            TOP
             |
             S
 ____________|_______________________
 _   _  _    _    _   _       _     _
 |   |  |    |    |   |       |     |
 Je  ai vu Sarah avec un  télescope .
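Both broken outputs above share one symptom: no node other than the root groups two or more words, so every word hangs directly under S. A minimal sketch of a check for that symptom follows. It works on the bracketed string form of a tree (assuming the predicted tree is an nltk Tree, `str(tree)` should give that form). The helper names `parse_brackets`, `inner_spans` and `looks_degenerate` are hypothetical and not part of SuPar, and the check is only a heuristic: a very short sentence can legitimately be flat.

```python
def parse_brackets(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a plain word
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return read_node()


def inner_spans(tree):
    """Return (label, start, end) for nodes covering 2+ words but not the whole sentence."""
    spans = []

    def walk(node, start):
        if isinstance(node, str):     # a word occupies one position
            return start + 1
        label, children = node
        end = start
        for child in children:
            end = walk(child, end)
        spans.append((label, start, end))
        return end

    n_words = walk(tree, 0)
    return [s for s in spans if 2 <= s[2] - s[1] < n_words]


def looks_degenerate(bracketed):
    """Heuristic: True when the tree has no internal multi-word constituent."""
    return not inner_spans(parse_brackets(bracketed))


# The flat tree reported in this issue, written in bracketed form.
flat = "(TOP (S (_ I) (_ saw) (_ Sarah) (_ with) (_ a) (_ telescope) (NP (_ .))))"
# A hand-written tree with nested phrases, for contrast (illustrative, not model output).
nested = "(TOP (S (NP (_ I)) (VP (_ saw) (NP (_ Sarah)) (PP (_ with) (NP (_ a) (_ telescope)))) (_ .)))"

print(looks_degenerate(flat))
print(looks_degenerate(nested))
```

Running a check like this over a handful of sentences after loading a pretrained model would flag corrupted weights before they show up in downstream results.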