What is the difference among generation target, target approximate, target full expression, and target gt?

Question

Closed this issue 3 years ago · 4 comments

Hi there,
While working on WebQSP generation, I got confused about the difference between generation target, target approximate, target full expression, and target gt.
What's the difference among the four? Which one should I pick for the generation data?

Thanks.
Yiheng

Answer 1 · 2022-04-21T17:13:28.000Z

In short, generation_target is the ground truth for the generator.

I think it's a trick I used to handle the multiple ground-truth annotations in WebQSP. Some examples have more than one ground-truth s-expression, so ideally we want to generate the simplest one.

You can view target_gt as the simplest s-expression chosen from the list of ground-truth s-expressions using some rules. But sometimes the ranker may favor a ground-truth s-expression other than the one I chose by rule. The idea is that we don't want the generator to produce a different ground-truth s-expression if the top-ranked s-expression is already a ground truth. So the logic of lines 44-60 is:
if the top-ranked s-expression has 1.0 accuracy: use the top-ranked s-expression as the generation_target; else: use the target_gt picked by rules as the generation_target
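
As a rough illustration of that rule (not the repository's actual code), here is a minimal Python sketch. The `Candidate` class, the `pick_generation_target` function, and the field names are hypothetical and only mirror the description above; the sketch assumes candidates are already sorted by ranker score and that an accuracy of 1.0 means the candidate matches a ground truth.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical container for one ranked s-expression."""
    s_expr: str
    accuracy: float  # assumed: 1.0 means the candidate matches a ground truth


def pick_generation_target(ranked: List[Candidate], target_gt: str) -> str:
    """Keep the top-ranked candidate if it is already correct, else use target_gt."""
    # ranked is assumed to be sorted best-first by the ranker.
    if ranked and ranked[0].accuracy == 1.0:
        return ranked[0].s_expr
    # Otherwise fall back to the rule-picked ground truth.
    return target_gt
```

With this rule, the generator is never asked to rewrite a top-ranked candidate that is already correct into a different but equivalent ground truth.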

target_full_expr just stores target_gt.

You can ignore target_approx_expr in the generation part. It is used in the ranker training part; you can view it as a simplified ground-truth s-expression for better training the ranker (some of the ground-truth s-expressions aren't covered by the enumerated s-expressions, so we sometimes want the ranker to favor an s-expression that is close to the target s-expression but within the enumerated space).

Answer 2 · 2022-04-22T01:09:23.000Z

So I guess we could just use generation_target.
But may I ask whether you have tested using multiple targets, instead of choosing only one, as the labels during generator training?
There may be some equivalent s-expressions in the enumeration, and there may also be multiple parses for each question in WebQSP.

Answer 3 · 2022-04-22T02:17:33.000Z

Yeah, just use generation_target.

Nope, I haven't tried that. It doesn't sound very intuitive to me if the same set of top-ranked candidates has multiple generation targets, so I didn't try that.

Answer 4 · 2022-04-22T02:18:40.000Z

Got it.
Thanks for your reply.