Gorilla-Lab-SCUT/frustum-convnet

OSError: Cannot allocate memory

sarimmehdi opened this issue · 2 comments

Hello. I have 8 GB of RAM and a GTX 1050 Ti. I wanted to use your neural network on my own set of images, so I created the rgb_detection_val.txt file as you did and then generated the pickle file. Then I ran:

python train/test_net_det.py --cfg cfgs/det_sample.yaml OUTPUT_DIR pretrained_models/car TEST.WEIGHTS pretrained_models/car/model_0050.pth

However, I get a memory error:

YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  return yaml.load(cfg_to_load)
config:
 {'DATA': {'CAR_ONLY': True,
          'DATA_ROOT': 'kitti/data/pickle_data',
          'EXTEND_FROM_DET': False,
          'FILE': 'datasets/provider_sample.py',
          'HEIGHT_HALF': (0.25, 0.5, 1.0, 2.0),
          'NUM_HEADING_BIN': 12,
          'NUM_SAMPLES': 147,
          'NUM_SAMPLES_DET': 512,
          'PEOPLE_ONLY': False,
          'RTC': True,
          'STRIDE': (0.25, 0.5, 1.0, 2.0),
          'WITH_EXTRA_FEAT': False},
 'EVAL_MODE': False,
 'FROM_RGB_DET': True,
 'IOU_THRESH': 0.7,
 'LOSS': {'BOX_LOSS_WEIGHT': 1.0,
          'CORNER_LOSS_WEIGHT': 10.0,
          'HEAD_REG_WEIGHT': 20.0,
          'SIZE_REG_WEIGHT': 20.0},
 'MODEL': {'FILE': 'models/det_base.py', 'NUM_CLASSES': 2},
 'NUM_GPUS': 1,
 'NUM_WORKERS': 4,
 'OUTPUT_DIR': 'pretrained_models/car',
 'OVER_WRITE_TEST_FILE': '',
 'RESUME': False,
 'SAVE_SUB_DIR': 'val_nms',
 'TEST': {'BATCH_SIZE': 32,
          'DATASET': 'val',
          'METHOD': 'nms',
          'THRESH': 0.1,
          'WEIGHTS': 'pretrained_models/car/model_0050.pth'},
 'TRAIN': {'BASE_LR': 0.001,
           'BATCH_SIZE': 32,
           'DATASET': 'train',
           'GAMMA': 0.1,
           'LR_POLICY': 'step',
           'LR_STEPS': [20],
           'MAX_EPOCH': 50,
           'MIN_LR': 1e-05,
           'MOMENTUM': 0.9,
           'OPTIMIZER': 'adam',
           'START_EPOCH': 0,
           'WEIGHTS': '',
           'WEIGHT_DECAY': 0.0001},
 'USE_TFBOARD': True,
 'disp': 100}
load dataset from kitti/data/pickle_data/frustum_caronly_val_rgb_detection.pickle
=> loaded checkpoint 'pretrained_models/car/model_0050.pth')
Traceback (most recent call last):
  File "/home/sarim/PycharmProjects/trajectory_pred/frustum-convnet/train/test_net_det.py", line 391, in <module>
    test(model, test_dataset, test_loader, save_file_name, result_folder)
  File "/home/sarim/PycharmProjects/trajectory_pred/frustum-convnet/train/test_net_det.py", line 196, in test
    for i, data_dicts in enumerate(test_loader):
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 719, in __init__
    w.start()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 70, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

Process finished with exit code 1

I decided to investigate, so I printed out num_batches at line 187 in the test() function and got a value of 1. This suggests your code is somehow pushing all the images through in a single batch. I cannot figure out how to troubleshoot this. Please tell me what changes are necessary here. Thank you.
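For reference, the batch count of a PyTorch DataLoader is just a ceiling division of the dataset size by the batch size. A minimal sketch of that arithmetic (the sample count of 20 is hypothetical; only the batch size of 32 comes from the `TEST.BATCH_SIZE` entry in the config dump above):

```python
import math

batch_size = 32   # TEST.BATCH_SIZE from the config dump above
num_samples = 20  # hypothetical: suppose the pickle file holds 20 frustums

# A DataLoader with drop_last=False (the default) yields
# ceil(n / batch_size) batches, so num_batches == 1 whenever the
# dataset holds at most batch_size samples.
num_batches = math.ceil(num_samples / batch_size)
print(num_batches)  # 1
```

So a value of 1 need not mean everything is pushed at once; it is also exactly what a dataset with at most 32 samples produces.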

How much memory is left after you load the pickle file? It seems the error happens when multiprocessing tries to fork a worker process to load data. If so, try setting NUM_WORKERS to 0.
num_batches == 1 means you have fewer than 32 samples in total.
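The suggestion above can be sketched as follows. The dataset here is a hypothetical stand-in for the frustum test set; only `num_workers` and `batch_size` correspond to the repo's `NUM_WORKERS` and `TEST.BATCH_SIZE` settings:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the frustum test set: 20 samples,
# fewer than the configured TEST.BATCH_SIZE of 32.
dataset = TensorDataset(torch.zeros(20, 3))

# num_workers=0 loads batches in the main process, so the DataLoader
# never calls os.fork() and the ENOMEM raised while forking worker
# processes cannot occur (at the cost of single-process loading).
loader = DataLoader(dataset, batch_size=32, num_workers=0)

print(len(loader))  # 1 batch, since 20 < 32 and drop_last defaults to False
```

In this repo the setting would presumably be changed through the config (e.g. `NUM_WORKERS: 0` in cfgs/det_sample.yaml) rather than by editing the DataLoader call directly.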

Thank you for replying. I fixed the problem by installing TensorFlow-GPU.