About relation and entity labelling in pretraining
Closed this issue · 1 comments
Hi, you mention the pretrained BERT model as a requirement, but you also do your own pretraining, right?
The downloadable pretrained model is trained on a corpus that does not come with labels for relation extraction.
It sounds like your pretraining uses text whose entity and relation labels are assigned automatically by spaCy, which may not be 100% correct. Am I right about this?
For training (I suppose this is fine-tuning), you use the SemEval dataset, whose entity and relation labels are manually checked and assumed to be 100% correct.
I suppose you have also tried fine-tuning the downloadable pretrained model directly, without any entity or relation labels in pretraining. How much worse is the performance in that case?
Thanks.
The original BERT pretrained on a text corpus + MTB will give better results when fine-tuned, compared to just the original BERT pretrained on a text corpus.
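For context on what the MTB (matching-the-blanks) pretraining step adds: the two entity spans that spaCy finds in a sentence are wrapped in marker tokens and, with some probability, replaced by a `[BLANK]` token, so the model must match sentence pairs by context rather than by memorizing entity surface forms. Below is a minimal sketch of that blanking step; the function name, the marker tokens `[E1]`/`[/E1]`/`[E2]`/`[/E2]`/`[BLANK]`, and the default probability are illustrative assumptions, not necessarily this repo's exact preprocessing:

```python
import random

def mark_and_blank(tokens, e1, e2, blank_prob=0.7, rng=random):
    """Wrap two entity spans in marker tokens, replacing each span's
    tokens with [BLANK] with probability blank_prob (MTB-style).

    tokens: list of word tokens for one sentence.
    e1, e2: (start, end) token index pairs; assumed non-overlapping,
            with e1 occurring before e2.
    """
    blank1 = rng.random() < blank_prob  # decide once per span
    blank2 = rng.random() < blank_prob
    out = []
    i = 0
    while i < len(tokens):
        if i == e1[0]:
            out.append("[E1]")
            out.extend(["[BLANK]"] if blank1 else tokens[e1[0]:e1[1]])
            out.append("[/E1]")
            i = e1[1]
        elif i == e2[0]:
            out.append("[E2]")
            out.extend(["[BLANK]"] if blank2 else tokens[e2[0]:e2[1]])
            out.append("[/E2]")
            i = e2[1]
        else:
            out.append(tokens[i])
            i += 1
    return out

# With blank_prob=1.0 both spans are always blanked:
# mark_and_blank("Bill Gates founded Microsoft .".split(), (0, 2), (3, 4), blank_prob=1.0)
# -> ['[E1]', '[BLANK]', '[/E1]', 'founded', '[E2]', '[BLANK]', '[/E2]', '.']
```

During pretraining, two sentences that share the same entity pair form a positive pair for the matching objective; blanking forces the encoder to rely on relational context instead of the entity names themselves.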