dotnet/TorchSharp

The function torch.cuda.empty_cache() is missing.

lintao185 opened this issue · 6 comments


Even though torch.NewDisposeScope() has been used, the GPU memory usage remains high.
I've been debugging for a whole day with no progress. This memory management issue is very tricky, and I'm currently at a loss.
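
For reference, here is the per-iteration scoping pattern that is normally enough to keep steady-state CUDA usage flat. The model, optimizer, and data below are placeholders, not code from the project; only the scoping shape matters:

```csharp
using TorchSharp;
using static TorchSharp.torch;

// Placeholder model, optimizer, and data; only the scoping pattern matters.
var device = cuda.is_available() ? CUDA : CPU;
var model = nn.Linear(10, 1).to(device);
var optimizer = optim.SGD(model.parameters(), 0.01);

for (int step = 0; step < 1000; step++)
{
    // One scope per iteration: the activations, the loss, and the other
    // temporaries created inside it are disposed when the scope closes.
    using var scope = torch.NewDisposeScope();

    var x = randn(64, 10).to(device);
    var y = randn(64, 1).to(device);

    optimizer.zero_grad();
    var loss = nn.functional.mse_loss(model.forward(x), y);
    loss.backward();
    optimizer.step();
}
```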

might be similar to #1194

And as for the memory issue... we don't even know what happens in your loop. It's really hard to help you if you don't provide a way to reproduce the problem.

The strange thing is that I added torch.NewDisposeScope() to every function, but it still leads to memory leaks. However, when I write a standalone demo, there are no issues. It's very puzzling, and I can't figure it out.
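
Without a repro this is only a guess, but two patterns commonly keep CUDA memory pinned even when every function opens its own scope: tensors that have to outlive their scope (they need MoveToOuterDisposeScope(), after which the caller's scope owns them), and GPU tensors kept alive across iterations, such as a loss history. A sketch with made-up names (Forward, lossHistory):

```csharp
using System.Collections.Generic;
using TorchSharp;
using static TorchSharp.torch;

// All names here are hypothetical; the point is the two lifetime patterns.
static Tensor Forward(nn.Module<Tensor, Tensor> model, Tensor input)
{
    using var scope = torch.NewDisposeScope();
    var logits = model.forward(input);
    // 1) A tensor that must outlive this method has to be handed to the
    //    caller's scope explicitly, otherwise it is disposed on return.
    return logits.MoveToOuterDisposeScope();
}

var device = cuda.is_available() ? CUDA : CPU;
var model = nn.Linear(10, 2).to(device);
var lossHistory = new List<float>();

for (int step = 0; step < 1000; step++)
{
    using var scope = torch.NewDisposeScope();
    var x = randn(32, 10).to(device);
    var y = randn(32, 2).to(device);

    var output = Forward(model, x);
    var loss = nn.functional.mse_loss(output, y);

    // 2) Keep only CPU scalars across iterations. Storing the CUDA tensor
    //    `loss` itself in the list would pin its device memory for the whole
    //    run and look exactly like a leak that scopes cannot fix.
    lossHistory.Add(loss.item<float>());
}
```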

It seems I found the reason: TorchSharp doesn't have torch.cuda.amp, so training runs entirely in full precision and more memory is needed to store the computation graph.
I'm giving up for now. I'll come back to look at my current project after some time; maybe there will be a surprise. 😊😊😊

Torch doesn't release GPU memory back to the driver once it has been allocated, unless you call empty_cache(); the caching allocator holds on to it for future allocations.

This is a duplicate of #892, where I discussed some of the complications with implementing it, so I'll close this issue and keep the old one open.