Log input/output tensor stats

Question

charlesfrye opened this issue 2 years ago · 1 comments

As noted in the discussion here, there can be a difference between the visualization of a Tensor's contents as human-interpretable media and the actual contents of that Tensor.

So it's useful to have the raw values logged. But raw tensor values are fat, unstructured blobs: for an image, they take up space like a high-resolution bitmap rather than like a PNG.

Following the principle that 20% of the information can catch 80% of the bugs, we should instead log descriptive statistics of the input, output, and target tensors. Considerations for this kind of logging are discussed in the (private) 2021 repo here and here.
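To make "descriptive statistics" concrete, here is a minimal, framework-free sketch of the kind of summary that could be logged in place of the raw values. The function name `summarize` and the choice of statistics are assumptions for illustration, not something specified in this issue; with PyTorch the same reductions would presumably map onto tensor methods such as `min`, `max`, `mean`, and `std`.

```python
import math
from typing import Dict, Iterable


def summarize(values: Iterable[float]) -> Dict[str, float]:
    """Reduce a flat sequence of values to a few descriptive statistics.

    Hypothetical helper: NaN and inf entries are counted separately and
    excluded from the other statistics so that one bad value stays visible
    without poisoning the mean.
    """
    finite = []
    n_nan = 0
    n_inf = 0
    for v in values:
        if math.isnan(v):
            n_nan += 1
        elif math.isinf(v):
            n_inf += 1
        else:
            finite.append(v)

    n = len(finite)
    nan = float("nan")
    mean = sum(finite) / n if n else nan
    # Population variance over the finite entries only.
    var = sum((v - mean) ** 2 for v in finite) / n if n else nan
    return {
        "count": n + n_nan + n_inf,
        "nan_count": n_nan,
        "inf_count": n_inf,
        "min": min(finite) if n else nan,
        "max": max(finite) if n else nan,
        "mean": mean,
        "std": math.sqrt(var) if n else nan,
    }


# Hypothetical flattened "image" with one corrupted pixel.
stats = summarize([0.0, 0.5, 1.0, float("nan")])
print(stats)
```

A handful of scalars like these is cheap to store per batch, and a nonzero `nan_count` or an unexpected `min`/`max` range (say, 0 to 255 where 0 to 1 was expected) is the sort of bug a rendered image can hide.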

Answer 1 · 2022-08-30T23:13:09.000Z

Prototyped this by having students implement it in an exercise; they found that training slowed down by a factor of 5. It should be possible to do this without slowing training, and the torchmetrics code seems to be doing it the right way (applying running reduce operations), so the slowdown was surprising.
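For reference, a sketch of what "running reduce operations" can look like: keep a few accumulators, fold each batch into them, and only materialize the statistics at a logging interval. This is not the torchmetrics implementation and does not diagnose the slowdown reported above. The class `RunningStats`, the constant `LOG_EVERY`, and the toy batches are hypothetical, and the code is plain Python; in a PyTorch setting the accumulators would presumably be tensors living on the same device as the data.

```python
import math
from typing import Dict, List, Sequence


class RunningStats:
    """Streaming count/mean/std/min/max, merged one batch at a time."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean
        self.min = math.inf
        self.max = -math.inf

    def update(self, batch: Sequence[float]) -> None:
        """Fold one batch into the accumulators (pairwise mean/variance merge)."""
        n_b = len(batch)
        if n_b == 0:
            return
        mean_b = sum(batch) / n_b
        m2_b = sum((v - mean_b) ** 2 for v in batch)

        n_a = self.count
        n = n_a + n_b
        delta = mean_b - self.mean
        self.mean += delta * n_b / n
        self.m2 += m2_b + delta * delta * n_a * n_b / n
        self.count = n
        self.min = min(self.min, min(batch))
        self.max = max(self.max, max(batch))

    def compute(self) -> Dict[str, float]:
        """Materialize the statistics; intended to be called only when logging."""
        std = math.sqrt(self.m2 / self.count) if self.count else float("nan")
        return {
            "count": self.count,
            "mean": self.mean,
            "std": std,  # population standard deviation
            "min": self.min,
            "max": self.max,
        }


LOG_EVERY = 2  # hypothetical logging interval, in steps
running = RunningStats()
logged: List[Dict[str, float]] = []
batches = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

for step, batch in enumerate(batches, start=1):
    running.update(batch)  # cheap accumulator update on every step
    if step % LOG_EVERY == 0:
        logged.append(running.compute())  # full summary only at the interval

print(logged)
```

The design choice is that the per-step work is a fixed number of reductions over the batch, independent of how many steps have been seen, and the summary is only built at the logging interval.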