google-research/noisystudent

Confusion about the loss function

zhangcheng-Honor opened this issue · 1 comments

Hello, after reading the paper I looked at this code, and I am quite confused about how the loss function is calculated. The final loss in the paper is the sum of the cross-entropy losses computed on the two kinds of data (labeled and unlabeled), but the code calculates it differently, like this:
real_lab_bsz = tf.to_float(lab_bsz) * FLAGS.label_data_sample_prob
real_unl_bsz = batch_size * FLAGS.label_data_sample_prob * FLAGS.unlabel_ratio
data_loss = lab_loss * real_lab_bsz + unl_loss * real_unl_bsz
data_loss = data_loss / real_lab_bsz
The losses that enter this part of the code have already been averaged, yet each one is first multiplied by its number of samples, and the sum is then divided by the number of labeled samples. What is the significance of this calculation? It does not seem to match what the paper describes.

Hi, in Training details of Section 3.1, we mentioned that "Labeled images and unlabeled images are concatenated together to compute the average cross-entropy loss."
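
For readers following the arithmetic, here is a minimal sketch in plain Python (not the repository's TensorFlow code) of what the quoted four lines compute. It assumes that `real_lab_bsz` and `real_unl_bsz` act as the effective numbers of labeled and unlabeled examples, and that `lab_loss` and `unl_loss` are per-group means. The names `n_lab`, `n_unl`, and `combined_loss`, and the random per-example losses, are hypothetical and chosen only for illustration.

```python
import math
import random


def combined_loss(lab_losses, unl_losses):
    """Mirror the arithmetic of the quoted snippet on per-example losses."""
    n_lab = len(lab_losses)  # stands in for real_lab_bsz (assumption)
    n_unl = len(unl_losses)  # stands in for real_unl_bsz (assumption)
    lab_loss = sum(lab_losses) / n_lab  # mean loss over labeled examples
    unl_loss = sum(unl_losses) / n_unl  # mean loss over unlabeled examples
    # Multiplying each mean by its count recovers the per-group sums,
    # so data_loss is the summed loss over the concatenated batch.
    data_loss = lab_loss * n_lab + unl_loss * n_unl
    # The snippet then normalizes by the labeled count only.
    return data_loss / n_lab


random.seed(0)
lab = [random.random() for _ in range(4)]   # made-up labeled losses
unl = [random.random() for _ in range(12)]  # made-up unlabeled losses

snippet_value = combined_loss(lab, unl)
concat_sum = sum(lab + unl)
concat_mean = concat_sum / (len(lab) + len(unl))
scale = (len(lab) + len(unl)) / len(lab)

# The snippet's value equals the concatenated average times a constant.
print(math.isclose(snippet_value, concat_mean * scale))
```

Under these assumptions the result is the summed loss over the concatenated batch divided by the labeled count, which is the concatenated average multiplied by `(n_lab + n_unl) / n_lab`. With fixed batch sizes that factor is a constant, so it rescales the gradient without changing how individual examples are weighted relative to each other.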