torch.GC may hang with a data loader cache
shendiaomo commented
The current design requires that all tensors created between two consecutive torch.GC() calls be unreachable by the time of the second call. As a result, if we want to cache data loader tensors in a chan, torch.GC() may hang forever.
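For context, here is a minimal sketch of the kind of design described above (not gotorch's actual code; the names liveTensors, newTensor, and gc are hypothetical): every new tensor is added to a global WaitGroup, its finalizer calls Done, and GC() waits until the count drops to zero, so it cannot return while any registered tensor is still reachable.

package gcsketch

import (
	"runtime"
	"sync"
)

// liveTensors counts tensors that have been created but not yet finalized.
var liveTensors sync.WaitGroup

type tensor struct{ handle uintptr } // stands in for the native tensor handle

func newTensor() *tensor {
	t := &tensor{}
	liveTensors.Add(1) // register the tensor with the global WaitGroup
	runtime.SetFinalizer(t, func(*tensor) { liveTensors.Done() })
	return t
}

func gc() {
	runtime.GC()       // let Go's collector queue finalizers for unreachable tensors
	liveTensors.Wait() // blocks until every registered tensor has been finalized
}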
Here is a simplified example of the hang:
package main

import (
	"runtime"
	"time"

	torch "github.com/wangkuiyi/gotorch"
)

func main() {
	torch.GC()
	runtime.LockOSThread()
	c := make(chan torch.Tensor)
	{
		torch.NewTensor([][]float32{{1, 2}, {3, 4}}) // Register the anonymous `Tensor` to the global `WaitGroup`.
		go func() {
			a := torch.NewTensor([][]float32{{1, 2}, {3, 4}}) // Register `a` to the global `WaitGroup`.
			c <- a
			time.Sleep(24 * time.Hour) // Keep the goroutine, and therefore `a`, alive for a day.
			runtime.KeepAlive(a)
		}()
	}
	<-c
	torch.GC() // Lasts for one day.
}
The second call to torch.GC() in the snippet above will block for a day, because `a` in the goroutine stays reachable until the goroutine returns.
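For contrast, and still assuming the design sketched earlier, the hang should disappear if the goroutine returns right after sending: once the receiver discards the value, `a` becomes unreachable and the second torch.GC() can finish. This variant replaces only the goroutine in the snippet above and reuses the same channel c:

go func() {
	a := torch.NewTensor([][]float32{{1, 2}, {3, 4}})
	c <- a
	// No sleep and no KeepAlive: the goroutine returns immediately,
	// so `a` is unreachable as soon as the receiver drops the value.
}()
<-c
torch.GC() // Expected to return promptly in this variant.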