CUDA runtime error after a varying number of iterations

Question

tobiasxg opened this issue 5 years ago · 0 comments

I am trying to train the model on my own dataset with CUDA. I use the same backbone, but I have changed the classifier and the mask to contain only two classes.

I run the line:
model.train_model(train_set, test_set, learning_rate=2*config.LEARNING_RATE, epochs=5, layers='heads')

I am able to start the training, but after a few iterations my GPU runs out of memory. I have tried to solve it by deleting variables at the end of each iteration, but the problem remains. Decreasing IMAGE_MAX_DIM only allows more iterations to run before the error appears. My GPU is an NVIDIA GeForce GTX 1050.
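To put numbers on the growth, the used GPU memory could be sampled once per iteration. The following is only a minimal sketch, assuming `nvidia-smi` is on the PATH and reports plain integer values; the helper names (`parse_memory_used`, `gpu_memory_used_mib`, `memory_growth`) are hypothetical and not part of the training code.

```python
import shutil
import subprocess


def parse_memory_used(output):
    """Parse nvidia-smi CSV output (one value per GPU, in MiB) into a list of ints."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def gpu_memory_used_mib():
    """Return used memory per GPU in MiB, or None if nvidia-smi is not available."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_used(result.stdout)


def memory_growth(samples):
    """Return the difference between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]
```

Calling `gpu_memory_used_mib()` at the end of every iteration, appending the value for the first GPU to a list, and passing that list to `memory_growth` would show whether usage rises steadily (which would suggest something is kept alive across iterations) or jumps only on particular batches.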

Is there any logical explanation for why my GPU keeps using more memory with each iteration?