Implementation of Distilling Causal Effect
lichong952012 opened this issue · 1 comment
lichong952012 commented
Thanks for your excellent work. I have a question: in Equation 6 of the paper, how is the product P(Y | I = i) * W_i computed in the code? In loss1, the weight W does not appear.
JoyHuYY1412 commented
@lichong952012 Hi, please check here
The logits are combined as a weighted sum, so the weights are applied before the loss is computed.
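For readers following along, here is a minimal sketch of what "weighted sum of logits" could look like. It is not the repository's actual code: the function names, the example logits, and the weights are all hypothetical, and it assumes one logit vector per item i and one scalar weight W_i per item. The point is only that the weights are folded into the logits, which would explain why the loss itself takes no W argument.

```python
import math

def weighted_logit_sum(logits_per_item, weights):
    # Hypothetical helper: combine one logit vector per item i
    # into a single vector, sum_i W_i * logits_i.
    num_classes = len(logits_per_item[0])
    return [
        sum(w * logits[c] for logits, w in zip(logits_per_item, weights))
        for c in range(num_classes)
    ]

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Standard cross-entropy on the already-combined logits;
    # the weights do not appear here.
    return -math.log(softmax(logits)[target])

# Made-up example values (assumptions, not from the paper or repo).
logits_per_item = [[2.0, 0.0, 1.0], [0.0, 4.0, 1.0]]
weights = [0.75, 0.25]

combined = weighted_logit_sum(logits_per_item, weights)
loss = cross_entropy(combined, target=0)
```

In this sketch the loss function sees only `combined`, so a reader inspecting the loss alone would not find W there.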