nshepperd/gpt-2

Why are the training labels sliced like this?

SchenbergZY opened this issue · 0 comments

From the code in train.py I found the loss function:

        loss = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(
                labels=context[:, 1:], logits=output['logits'][:, :-1]))

But why does it slice the labels with [:, 1:] and the logits with [:, :-1]? Why aren't the two slices the same?
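
For illustration, here is a minimal NumPy sketch (not from the repo; the context array and token ids below are made up) of how the two slices line up on a single 4-token example:

    import numpy as np

    # Hypothetical 4-token context; token ids are arbitrary.
    context = np.array([[10, 11, 12, 13]])  # shape [batch, seq_len]

    # logits[:, t] is the model's prediction for the token *after*
    # position t, so dropping the last position with [:, :-1] leaves
    # predictions for positions 1..3, and context[:, 1:] supplies the
    # actual tokens at positions 1..3 -- the two slices pair up:
    seen    = context[:, :-1]  # [10, 11, 12]  what the model conditioned on
    targets = context[:, 1:]   # [11, 12, 13]  the next token at each step
    for x, y in zip(seen[0], targets[0]):
        print(f"after seeing ...{x}, the target is {y}")

Under this reading, the slices differ because each is taken from a different tensor: the labels are the context shifted one step left, while the logits drop their final position (which would predict a token beyond the sequence).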