Gorilla-Lab-SCUT/frustum-convnet

OSError: Cannot allocate memory

sarimmehdi opened this issue · 2 comments

Hello. I have 8 GB of RAM and a GTX 1050 Ti. I wanted to use your neural network on my own set of images, so I created the rgb_detection_val.txt file as you did and then generated the pickle file. Then I ran:

python train/test_net_det.py --cfg cfgs/det_sample.yaml OUTPUT_DIR pretrained_models/car TEST.WEIGHTS pretrained_models/car/model_0050.pth

However, I get a memory error:

YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  return yaml.load(cfg_to_load)
config:
 {'DATA': {'CAR_ONLY': True,
          'DATA_ROOT': 'kitti/data/pickle_data',
          'EXTEND_FROM_DET': False,
          'FILE': 'datasets/provider_sample.py',
          'HEIGHT_HALF': (0.25, 0.5, 1.0, 2.0),
          'NUM_HEADING_BIN': 12,
          'NUM_SAMPLES': 147,
          'NUM_SAMPLES_DET': 512,
          'PEOPLE_ONLY': False,
          'RTC': True,
          'STRIDE': (0.25, 0.5, 1.0, 2.0),
          'WITH_EXTRA_FEAT': False},
 'EVAL_MODE': False,
 'FROM_RGB_DET': True,
 'IOU_THRESH': 0.7,
 'LOSS': {'BOX_LOSS_WEIGHT': 1.0,
          'CORNER_LOSS_WEIGHT': 10.0,
          'HEAD_REG_WEIGHT': 20.0,
          'SIZE_REG_WEIGHT': 20.0},
 'MODEL': {'FILE': 'models/det_base.py', 'NUM_CLASSES': 2},
 'NUM_GPUS': 1,
 'NUM_WORKERS': 4,
 'OUTPUT_DIR': 'pretrained_models/car',
 'OVER_WRITE_TEST_FILE': '',
 'RESUME': False,
 'SAVE_SUB_DIR': 'val_nms',
 'TEST': {'BATCH_SIZE': 32,
          'DATASET': 'val',
          'METHOD': 'nms',
          'THRESH': 0.1,
          'WEIGHTS': 'pretrained_models/car/model_0050.pth'},
 'TRAIN': {'BASE_LR': 0.001,
           'BATCH_SIZE': 32,
           'DATASET': 'train',
           'GAMMA': 0.1,
           'LR_POLICY': 'step',
           'LR_STEPS': [20],
           'MAX_EPOCH': 50,
           'MIN_LR': 1e-05,
           'MOMENTUM': 0.9,
           'OPTIMIZER': 'adam',
           'START_EPOCH': 0,
           'WEIGHTS': '',
           'WEIGHT_DECAY': 0.0001},
 'USE_TFBOARD': True,
 'disp': 100}
load dataset from kitti/data/pickle_data/frustum_caronly_val_rgb_detection.pickle
=> loaded checkpoint 'pretrained_models/car/model_0050.pth')
Traceback (most recent call last):
  File "/home/sarim/PycharmProjects/trajectory_pred/frustum-convnet/train/test_net_det.py", line 391, in <module>
    test(model, test_dataset, test_loader, save_file_name, result_folder)
  File "/home/sarim/PycharmProjects/trajectory_pred/frustum-convnet/train/test_net_det.py", line 196, in test
    for i, data_dicts in enumerate(test_loader):
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 719, in __init__
    w.start()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 70, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

Process finished with exit code 1

I decided to investigate, so I printed out num_batches at line 187 in the test() function and got a value of 1. This suggests your code is somehow pushing all the images through in a single batch. I cannot figure out how to troubleshoot this. Please tell me what changes are necessary here. Thank you.
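For reference, the batch count of a PyTorch DataLoader is just a ceiling division of the dataset size by the batch size. A minimal sketch of that arithmetic (the sample count of 20 is hypothetical; only the batch size of 32 comes from the `TEST.BATCH_SIZE` entry in the config dump above):

```python
import math

batch_size = 32   # TEST.BATCH_SIZE from the config dump above
num_samples = 20  # hypothetical: suppose the pickle file holds 20 frustums

# A DataLoader with drop_last=False (the default) yields
# ceil(n / batch_size) batches, so num_batches == 1 whenever the
# dataset holds at most batch_size samples.
num_batches = math.ceil(num_samples / batch_size)
print(num_batches)  # 1
```

So a value of 1 need not mean everything is pushed at once; it is also exactly what a dataset with at most 32 samples produces.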

How much memory is left after you load the pickle file? It seems the error happens when multiprocessing tries to fork a worker process to load data. If so, try setting NUM_WORKERS to 0.
num_batches == 1 means you have fewer than 32 samples in total.
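The suggestion above can be sketched as follows. The dataset here is a hypothetical stand-in for the frustum test set; only `num_workers` and `batch_size` correspond to the repo's `NUM_WORKERS` and `TEST.BATCH_SIZE` settings:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the frustum test set: 20 samples,
# fewer than the configured TEST.BATCH_SIZE of 32.
dataset = TensorDataset(torch.zeros(20, 3))

# num_workers=0 loads batches in the main process, so the DataLoader
# never calls os.fork() and the ENOMEM raised while forking worker
# processes cannot occur (at the cost of single-process loading).
loader = DataLoader(dataset, batch_size=32, num_workers=0)

print(len(loader))  # 1 batch, since 20 < 32 and drop_last defaults to False
```

In this repo the setting would presumably be changed through the config (e.g. `NUM_WORKERS: 0` in cfgs/det_sample.yaml) rather than by editing the DataLoader call directly.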

Thank you for replying. I fixed the problem by installing TensorFlow-GPU.