cuda runtime error (2) : out of memory
Spandan-Madan opened this issue · 0 comments
Spandan-Madan commented
Can't even load the model!
torch.set_grad_enabled(False)
config = coco.CocoConfig()
config.display()
model = modellib.MaskRCNN(config=config,model_dir='saved_models/')
Error:
Configurations:
BACKBONE_SHAPES [[64 64]
[32 32]
[16 16]
[ 8 8]
[ 4 4]]
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 32
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.7
DETECTION_NMS_THRESHOLD 0.3
GPU_COUNT 2
IMAGENET_MODEL_PATH /data/graphics/toyota-pytorch/training-scaffold_new/unet/runs/maskrcnn_debug_run/resnet50_imagenet.pth
IMAGES_PER_GPU 16
IMAGE_MAX_DIM 256
IMAGE_MIN_DIM 256
IMAGE_PADDING True
IMAGE_SHAPE [256 256 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 100
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME coco
NUM_CLASSES 81
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 100
POST_NMS_ROIS_TRAINING 200
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 32000
TRAIN_ROIS_PER_IMAGE 20
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 50
WEIGHT_DECAY 0.0001
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-7-1200d0c91d44> in <module>
2 config = coco.CocoConfig()
3 config.display()
----> 4 model = modellib.MaskRCNN(config=config,model_dir='saved_models/')
/data/graphics/toyota-pytorch/training-scaffold_new/unet/runs/maskrcnn_debug_run/modelmaskrcnn.py in __init__(self, config, model_dir)
1413 self.model_dir = model_dir
1414 self.set_log_dir()
-> 1415 self.build(config=config)
1416 self.initialize_weights()
1417 self.loss_history = []
/data/graphics/toyota-pytorch/training-scaffold_new/unet/runs/maskrcnn_debug_run/modelmaskrcnn.py in build(self, config)
1447 config.RPN_ANCHOR_STRIDE)).float(), requires_grad=False)
1448 if self.config.GPU_COUNT:
-> 1449 self.anchors = self.anchors.cuda()
1450
1451 # RPN
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1525909934016/work/aten/src/THC/THCGeneral.cpp:844
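For a sense of scale, here is a rough estimate of how large `self.anchors` should be when it is moved to the GPU. This is a sketch under assumptions, not the repository's code: it assumes one anchor per ratio per feature-map cell (stepping by `RPN_ANCHOR_STRIDE`), four float32 coordinates per anchor, and the values printed in the config dump above. The helper name `estimate_anchor_bytes` is hypothetical.

```python
# Values copied from the config dump above.
BACKBONE_SHAPES = [(64, 64), (32, 32), (16, 16), (8, 8), (4, 4)]
RPN_ANCHOR_RATIOS = [0.5, 1, 2]
RPN_ANCHOR_STRIDE = 1


def estimate_anchor_bytes(shapes, ratios, stride, bytes_per_value=4):
    """Estimate anchor count and tensor size in bytes.

    Assumption (not taken from the repo): one anchor per ratio per
    feature-map cell visited with the given stride, and each anchor
    stored as 4 float32 values (box coordinates).
    """
    cells = sum(
        len(range(0, h, stride)) * len(range(0, w, stride))
        for h, w in shapes
    )
    num_anchors = cells * len(ratios)
    return num_anchors, num_anchors * 4 * bytes_per_value


num_anchors, size_bytes = estimate_anchor_bytes(
    BACKBONE_SHAPES, RPN_ANCHOR_RATIOS, RPN_ANCHOR_STRIDE
)
print(num_anchors, "anchors,", size_bytes / 1024, "KiB")
```

Under these assumptions the tensor is only a few hundred KiB, so the allocation size alone seems unlikely to exhaust a GPU. One possible reading is that the device was already full (for example, used by another process) when this first `.cuda()` call ran, but the traceback by itself does not confirm that.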