OSError: Cannot allocate memory
sarimmehdi opened this issue · 2 comments
Hello. I have 8 GB of RAM and a GTX 1050Ti. I wanted to use your neural network on my own set of images. I created the rgb_detection_val.txt file as you did, then created the pickle file, and then ran:
python train/test_net_det.py --cfg cfgs/det_sample.yaml OUTPUT_DIR pretrained_models/car TEST.WEIGHTS pretrained_models/car/model_0050.pth
However, I get a memory error:
YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
return yaml.load(cfg_to_load)
config:
{'DATA': {'CAR_ONLY': True,
'DATA_ROOT': 'kitti/data/pickle_data',
'EXTEND_FROM_DET': False,
'FILE': 'datasets/provider_sample.py',
'HEIGHT_HALF': (0.25, 0.5, 1.0, 2.0),
'NUM_HEADING_BIN': 12,
'NUM_SAMPLES': 147,
'NUM_SAMPLES_DET': 512,
'PEOPLE_ONLY': False,
'RTC': True,
'STRIDE': (0.25, 0.5, 1.0, 2.0),
'WITH_EXTRA_FEAT': False},
'EVAL_MODE': False,
'FROM_RGB_DET': True,
'IOU_THRESH': 0.7,
'LOSS': {'BOX_LOSS_WEIGHT': 1.0,
'CORNER_LOSS_WEIGHT': 10.0,
'HEAD_REG_WEIGHT': 20.0,
'SIZE_REG_WEIGHT': 20.0},
'MODEL': {'FILE': 'models/det_base.py', 'NUM_CLASSES': 2},
'NUM_GPUS': 1,
'NUM_WORKERS': 4,
'OUTPUT_DIR': 'pretrained_models/car',
'OVER_WRITE_TEST_FILE': '',
'RESUME': False,
'SAVE_SUB_DIR': 'val_nms',
'TEST': {'BATCH_SIZE': 32,
'DATASET': 'val',
'METHOD': 'nms',
'THRESH': 0.1,
'WEIGHTS': 'pretrained_models/car/model_0050.pth'},
'TRAIN': {'BASE_LR': 0.001,
'BATCH_SIZE': 32,
'DATASET': 'train',
'GAMMA': 0.1,
'LR_POLICY': 'step',
'LR_STEPS': [20],
'MAX_EPOCH': 50,
'MIN_LR': 1e-05,
'MOMENTUM': 0.9,
'OPTIMIZER': 'adam',
'START_EPOCH': 0,
'WEIGHTS': '',
'WEIGHT_DECAY': 0.0001},
'USE_TFBOARD': True,
'disp': 100}
load dataset from kitti/data/pickle_data/frustum_caronly_val_rgb_detection.pickle
=> loaded checkpoint 'pretrained_models/car/model_0050.pth')
Traceback (most recent call last):
File "/home/sarim/PycharmProjects/trajectory_pred/frustum-convnet/train/test_net_det.py", line 391, in <module>
test(model, test_dataset, test_loader, save_file_name, result_folder)
File "/home/sarim/PycharmProjects/trajectory_pred/frustum-convnet/train/test_net_det.py", line 196, in test
for i, data_dicts in enumerate(test_loader):
File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 279, in __iter__
return _MultiProcessingDataLoaderIter(self)
File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 719, in __init__
w.start()
File "/usr/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "/usr/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/usr/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 70, in _launch
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
Process finished with exit code 1
I decided to investigate, so I printed num_batches at line 187 in the test() function and got a value of 1. This suggests that the code is somehow pushing all the images through at once. I don't know how to troubleshoot this. Could you please tell me what changes I need to make? Thank you.
How much memory is left after you load the pickle file? The error seems to happen when multiprocessing tries to fork worker processes to load the data. If that is the case, try setting NUM_WORKERS to 0.
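To check how much memory is left before the workers are forked, something like the following could be called right after the pickle file is loaded. This is only a rough sketch: it assumes Linux (it reads /proc/meminfo), and parse_meminfo, read_available_kib and pick_num_workers are hypothetical helpers, not part of the repository. The fallback rule is a deliberately conservative guess, not a description of how the kernel decides whether os.fork() succeeds.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def read_available_kib(path="/proc/meminfo"):
    """Return MemAvailable in kB, or None if it cannot be read (non-Linux)."""
    try:
        with open(path) as f:
            return parse_meminfo(f.read()).get("MemAvailable")
    except OSError:
        return None


def pick_num_workers(requested, available_kib, parent_rss_kib):
    """Hypothetical heuristic: fall back to 0 workers (load data in the main
    process) if one copy of the parent per worker would not fit in the
    memory that is currently available."""
    if parent_rss_kib * requested > available_kib:
        return 0
    return requested


if __name__ == "__main__":
    available = read_available_kib()
    if available is None:
        print("MemAvailable could not be read on this system")
    else:
        import resource  # Unix only

        # On Linux, ru_maxrss is the peak resident set size in kilobytes.
        rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print("available kB:", available, "peak RSS kB:", rss)
        print("suggested NUM_WORKERS:", pick_num_workers(4, available, rss))
```

If the config accepts the same KEY VALUE overrides used in the original command (an assumption based on how OUTPUT_DIR and TEST.WEIGHTS are passed there), appending NUM_WORKERS 0 to that command should be enough to try the single-process path.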
num_batches being 1 means you have at most 32 samples in total (TEST.BATCH_SIZE is 32).
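For reference, the relationship between sample count and batch count can be sketched as below. This assumes the last partial batch is kept, which is the default behaviour of a PyTorch DataLoader (drop_last=False); the helper name num_batches is only illustrative and may not match how line 187 computes it.

```python
import math


def num_batches(num_samples, batch_size):
    """Number of batches when the final, possibly smaller, batch is kept."""
    return math.ceil(num_samples / batch_size)


# With TEST.BATCH_SIZE = 32, any dataset of 1 to 32 samples is a single batch.
for n in (20, 32, 33, 100):
    print(n, "samples ->", num_batches(n, 32), "batch(es)")
```

So a value of 1 does not mean every image is pushed through at once regardless of size; it only means the dataset fits into one batch of 32.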
Thank you for replying. I fixed the problem by installing TensorFlow-GPU.