salesforce/rng-kbqa

What is the difference between generation target, target approximate, target full expression and target gt?

Closed this issue · 4 comments

yhshu commented

Hi there,
While working on WebQSP generation, I got confused about the difference between generation target, target approximate, target full expression and target gt.
What is the difference between the four? Which one should I use for the generation data?

Thanks.
Yiheng

In short, generation_target is the ground truth for the generator.

I think these are some tricks I used to handle the multiple ground-truth annotations in WebQSP. Some examples have more than one ground-truth s-expression, so ideally we want to generate the simplest one.

You can view target_gt as the simplest s-expression, chosen by some rules from the list of ground-truth s-expressions. But sometimes the ranker may favor a ground-truth s-expression other than the one I chose by rule. So the hope here is that we don't generate a different ground-truth s-expression if the top-ranked s-expression is already a ground truth. So the logic of lines 44-60 is:
if the top-ranked s-expression has 1.0 accuracy: use the top-ranked s-expression as the generation_target; else: use the target_gt picked by rules as the generation_target
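
For readers who want the rule above in code form, here is a minimal sketch. It is not the repository's actual code: the `Candidate` class, its `accuracy` field, and the function name `pick_generation_target` are hypothetical names introduced only for illustration, and the s-expressions in the usage lines are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A ranked candidate (hypothetical structure, not the repo's own type)."""
    s_expr: str
    accuracy: float  # assumed: answer accuracy of this s-expression, in [0, 1]


def pick_generation_target(ranked: List[Candidate], target_gt: str) -> str:
    """Return the s-expression the generator should be trained to produce.

    `ranked` is assumed to be sorted best-first by the ranker.
    """
    # If the ranker's top choice is already fully correct, keep it as the
    # target so the generator is not asked to produce a different ground truth.
    if ranked and ranked[0].accuracy >= 1.0:
        return ranked[0].s_expr
    # Otherwise fall back to the rule-picked simplest ground truth.
    return target_gt


# Usage with made-up s-expressions:
ranked = [Candidate("(JOIN r1 e1)", 1.0), Candidate("(JOIN r2 e1)", 0.5)]
chosen = pick_generation_target(ranked, target_gt="(JOIN r3 e1)")
```

Here `chosen` is the top-ranked expression, because it already has accuracy 1.0; if its accuracy were lower, `target_gt` would be returned instead.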

target_full_expr just stores target_gt.

You can ignore target_approx_expr in the generation part. It is used in the ranker training part; you can view it as a simplified ground-truth s-expression for better training of the ranker. Some of the ground-truth s-expressions aren't covered by the enumerated s-expressions, so we sometimes want the ranker to favor an s-expression that is close to the target s-expression but within the enumerated space.
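
To make the idea of an "approximate" target concrete, here is a rough sketch. The thread does not say how closeness is measured, so this example swaps in plain string similarity from Python's `difflib` as a stand-in; that choice, the function name `pick_approx_target`, the fallback for an empty candidate list, and the example s-expressions are all assumptions, not the repository's method.

```python
from difflib import SequenceMatcher
from typing import List


def pick_approx_target(target_gt: str, enumerated: List[str]) -> str:
    """Pick a ranker training target that lies inside the enumerated space.

    Assumption: closeness is approximated by character-level string
    similarity; the real project may use a different notion of "close".
    """
    # If the ground truth is covered by the enumeration (or there is
    # nothing to choose from), use the ground truth itself.
    if not enumerated or target_gt in enumerated:
        return target_gt
    # Otherwise choose the enumerated candidate most similar to it.
    return max(
        enumerated,
        key=lambda cand: SequenceMatcher(None, target_gt, cand).ratio(),
    )


# Usage with made-up s-expressions: the ground truth is not enumerated,
# so the closest enumerated candidate is selected.
enumerated = ["(JOIN a b)", "(AND x (JOIN a b))"]
approx = pick_approx_target("(AND x (JOIN a c))", enumerated)
```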

yhshu commented

So I guess we could just use generation_target.
But may I ask whether you have tested using multiple targets, instead of choosing only one, as the label during generator training?
There may be some equivalent s-expressions in the enumeration, and there may also be multiple parses for each question in WebQSP.

Yeah just use generation_target.

Nope, I haven't tried that. It doesn't sound very intuitive to me if the same set of top-ranked candidates has multiple generation targets, so I didn't try that.

yhshu commented

Got it.
Thanks for your reply.