karpathy/nn-zero-to-hero

No keepdim in dbnmeani calculation in makemore_part4_backprop.ipynb


In the notebook makemore_part4_backprop.ipynb, dbnmeani is computed without keepdim in the following line:

dbnmeani = (-dbndiff).sum(0)

This should be corrected to:

dbnmeani = (-dbndiff).sum(0, keepdim=True)

Without keepdim, dbnmeani.shape is [64], whereas bnmeani.grad.shape is [1, 64]. cmp() still reports a match, because under broadcasting torch.tensor([1, 2, 3]) with shape [3] compares equal elementwise to torch.tensor([[1, 2, 3]]) with shape [1, 3]. The check passes either way, but the shape mismatch between dbnmeani and bnmeani.grad can cause confusion.
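
The shape difference can be reproduced outside the notebook with a minimal sketch. Here dbndiff is a random stand-in rather than the notebook's actual gradient, and its shape [32, 64] (batch size 32, 64 hidden units) is an assumption chosen to match the shapes reported above.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the upstream gradient dbndiff.
# Assumed shape: [batch_size=32, n_hidden=64].
dbndiff = torch.randn(32, 64)

# Current notebook line: the reduced dimension is dropped.
without_keepdim = (-dbndiff).sum(0)

# Proposed line: the reduced dimension is kept with size 1.
with_keepdim = (-dbndiff).sum(0, keepdim=True)

print(without_keepdim.shape)  # torch.Size([64])
print(with_keepdim.shape)     # torch.Size([1, 64])

# Broadcasting lines up the trailing dimension, so elementwise
# comparisons between the two shapes still succeed.
same_values = torch.all(without_keepdim == with_keepdim).item()
close = torch.allclose(without_keepdim, with_keepdim)
print(same_values, close)
```

Both comparisons succeed, which would explain why a broadcast-based check does not flag the missing keepdim even though the two shapes differ.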