declare-lab/RelationPrompt

Zero Shot explanation

bablf opened this issue · 2 comments

bablf commented

Hi,

First off: very nice paper! Thanks for your work.
I have a question about the zero-shot RTE setting, though.

You state in your paper that you generate sentences for unseen labels based on example sentences that contain these unseen labels, and then you train your extractor on these "synthetic" examples. But how is that a zero-shot setting? If you train on the labels that are to be predicted, it is not a zero-shot setting anymore, or is it?

I would love to hear your thoughts on this. Maybe I got something wrong or my understanding of zero-shot is wrong.
Best wishes

Hi, thank you for your kind comments :)
We generate the synthetic samples based only on the label names of the unseen relations (e.g. "Military Rank", "Position Played").
As we do not use any annotated triplets or sentences from the unseen relation data for training, this should still fulfill the zero-shot setting.
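To make the pipeline concrete, here is a minimal sketch of the idea: prompt a generator with nothing but the unseen relation's label, then parse the structured text it produces into a training triplet. The exact template strings and the parsing function below are assumptions for illustration, not the repository's actual code, and the "generated" continuation is hard-coded in place of a real language model call.

```python
import re

def build_prompt(relation_label: str) -> str:
    # The generator sees only the relation name, never an annotated
    # sentence from the unseen relation data (assumed template).
    return f"Relation : {relation_label} ."

def parse_generated(text: str):
    """Parse a structured generation into a (sentence, head, tail) triplet.

    Assumed output format:
    'Context : <sentence> . Head Entity : <h> , Tail Entity : <t> .'
    """
    m = re.search(
        r"Context : (.+?) \. Head Entity : (.+?) , Tail Entity : (.+?) \.",
        text,
    )
    return m.groups() if m else None

# In the real pipeline a language model fine-tuned on the *seen*
# relations would continue the prompt; here we hard-code one
# plausible continuation to show the parsing step.
prompt = build_prompt("position played")
generated = (
    prompt
    + " Context : Messi plays as a forward for Inter Miami . "
    + "Head Entity : Messi , Tail Entity : forward ."
)
print(parse_generated(generated))
# → ('Messi plays as a forward for Inter Miami', 'Messi', 'forward')
```

The parsed triplets then serve as the synthetic supervision for the extractor, which is why no gold annotations from the unseen relations are ever needed.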

bablf commented

Thanks for the swift response!
Yes, that's how I understood the generation as well. So it is fair to assume that the named entities in these synthetic examples are unseen.

As stated in your paper:

In order to make ZeroRTE solvable in a supervised manner, we propose RelationPrompt to generate synthetic relation examples by prompting language models to generate structured texts.

So your suggestion is to solve ZeroRTE by generating data for these unseen labels. And since you train your Relation Extractor on these synthetic examples, it becomes a supervised setting.

I always understood zero-shot as a setting where the model was not trained on any examples of the target classes. But your solution is a fair approach 👍