Negative values for memory overhead when testing on smaller networks
EliasVansteenkiste opened this issue · 5 comments
When I test torch-scan on small networks, I get negative values for the "Framework & CUDA overhead" and "Total RAM usage" fields.
Any idea how to fix this?
Thanks in advance
Trainable params: 47,073
Non-trainable params: 0
Total params: 47,073
---------------------------------------------------------------------------------------------
Model size (params + buffers): 0.18 Mb
Framework & CUDA overhead: -24.64 Mb
Total RAM usage: -24.46 Mb
---------------------------------------------------------------------------------------------
Floating Point Operations on forward: 67.61 MFLOPs
Multiply-Accumulations on forward: 34.70 MMACs
Direct memory accesses on forward: 34.57 MDMAs
Hi @EliasVansteenkiste !
Thanks for reporting this! This is odd, could you specify a minimal code snippet to reproduce this please? (architecture included)
I suspect the RAM overhead computation failed because of an issue with nvidia-smi. But I'd need to be able to reproduce the error to investigate.
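For what it's worth, a negative overhead can fall out of a plain subtraction: if the probed RAM figure (e.g. a parsed nvidia-smi reading) comes back smaller than the baseline it is compared against, the difference goes negative. Here is a minimal sketch of that failure mode; the function and parameter names are hypothetical and do not come from torchscan's actual code:

```python
# Hypothetical sketch: overhead estimated as a difference of memory readings.
# None of these names are torchscan's; this only illustrates how the
# subtraction can go negative when the baseline exceeds the fresh probe.

def overhead_mb(measured_ram_mb: float, baseline_ram_mb: float, model_size_mb: float) -> float:
    """Overhead left after subtracting a baseline and the model footprint.

    If the memory probe returns a value smaller than the baseline it is
    compared against (stale cache, wrong process, wrong GPU), the result
    is negative.
    """
    return measured_ram_mb - baseline_ram_mb - model_size_mb

# A baseline larger than the fresh reading reproduces numbers like the
# -24.64 Mb overhead in the report above (values chosen for illustration).
print(overhead_mb(measured_ram_mb=100.0, baseline_ram_mb=124.46, model_size_mb=0.18))
```

On a small network the model footprint (0.18 Mb here) is far too small to mask any measurement error, which is why the problem would show up mostly on small models.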
@EliasVansteenkiste any update?
ping @EliasVansteenkiste
Hello @EliasVansteenkiste
It's been quite a while; if I can't reproduce the error, I cannot do much. Would you mind sharing how to reproduce it?
I'm closing this issue since, unfortunately, I don't have any way of reproducing this :/
@EliasVansteenkiste if you have time at some point, please post more details