Question about the difference between the paper and the code in the formulas for composite_scores and copy_scores

Question

Closed this issue 5 years ago · 2 comments

As defined in Equation 5 of the paper, the final probability distribution p_t is a mixture of the generation distribution and the copy distribution:
p_t(w) = (1 − α_copy) * p_gen(w) + α_copy * p_copy(w)
However, the code uses the following formula, in which the two weights appear to be swapped:
composite_scores = copy_alpha * composite_scores
copy_scores = (1 - copy_alpha) * copy_attn
I am concerned about whether this difference changes the model's behavior and, if so, how it affects performance.

Answer 1 · 2019-08-12T04:39:24.000Z

Good catch, but I suppose there is no difference between them. You can do some experiments if you are interested in it.

Answer 2 · 2019-08-22T09:03:13.000Z

Thanks for your answer. I ran experiments with both the original code and the modified code, starting from the provided pretrained model. The modified version seems to perform slightly better. However, I only ran it twice, so the improvement may just be random variation.
I agree with your comment. The alpha in the code's formula can be regarded as a generate_alpha, i.e. the weight of the generation mode.
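
To make the "generate_alpha" reading concrete, here is a minimal sketch in plain Python. It assumes that composite_scores holds the generation probabilities and copy_attn the copy probabilities when the two lines from the question run. The function names and the toy distributions are made up for illustration and are not taken from the repository. The sketch only checks that the code's mixture with weight alpha is the same distribution as the paper's Equation 5 with α_copy = 1 − alpha.

```python
import math


def mix_paper(alpha_copy, p_gen, p_copy):
    """Equation 5 as written in the paper: alpha_copy weights the copy distribution."""
    return [(1.0 - alpha_copy) * g + alpha_copy * c for g, c in zip(p_gen, p_copy)]


def mix_code(copy_alpha, p_gen, p_copy):
    """Mixture as written in the code: copy_alpha weights the generation scores."""
    return [copy_alpha * g + (1.0 - copy_alpha) * c for g, c in zip(p_gen, p_copy)]


# Toy distributions over a 3-word vocabulary (hypothetical numbers).
p_gen = [0.7, 0.2, 0.1]
p_copy = [0.1, 0.1, 0.8]


def same_mixture(alpha):
    """True if the code's mixture with `alpha` matches the paper's with 1 - alpha."""
    from_code = mix_code(alpha, p_gen, p_copy)
    from_paper = mix_paper(1.0 - alpha, p_gen, p_copy)
    return all(math.isclose(a, b) for a, b in zip(from_code, from_paper))


results = [same_mixture(a) for a in (0.0, 0.25, 0.5, 0.9, 1.0)]
print(results)
```

The two formulas therefore describe the same family of mixtures and differ only in which term the learned weight is attached to. Whether swapping them changes the output of an already trained model depends on how alpha is computed, which this thread does not show.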