eric-yyjau/pytorch-superpoint

training_model_frontend

P66094125 opened this issue · 6 comments

Traceback (most recent call last):
File "train4.py", line 144, in <module>
args.func(config, output_dir, args)
File "train4.py", line 44, in train_base
return train_joint(config, output_dir, args)
File "train4.py", line 96, in train_joint
train_agent.train()
File "C:\Users\User\Desktop\python\STUDY_CNN_imagematching\pytorch-superpoint-master-venv\Train_model_frontend.py", line 275, in train
for i, sample_train in tqdm(enumerate(self.train_loader)):
File "C:\Users\User\Desktop\python\STUDY_CNN_imagematching\pytorch-superpoint-master-venv\venv\lib\site-packages\torch\utils\data\dataloader.py", line 352, in __iter__
return self._get_iterator()
File "C:\Users\User\Desktop\python\STUDY_CNN_imagematching\pytorch-superpoint-master-venv\venv\lib\site-packages\torch\utils\data\dataloader.py", line 294, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\User\Desktop\python\STUDY_CNN_imagematching\pytorch-superpoint-master-venv\venv\lib\site-packages\torch\utils\data\dataloader.py", line 801, in __init__
w.start()
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\pool.py", line 528, in __reduce__
'pool objects cannot be passed between processes or pickled'
NotImplementedError: pool objects cannot be passed between processes or pickled

(venv) C:\Users\User\Desktop\python\STUDY_CNN_imagematching\pytorch-superpoint-master-venv>Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Looks like the dataset path is not set correctly?
So there's no input for the dataloader.
Thanks.

Thank you for your answer, eric-yyjau.

What is the difference between "train_model_heatmap" and "train_model_frontend"?


Hi, I encountered the same error when running the first step, training MagicPoint on Synthetic Shapes. The dataset is at datasets\SyntheticDataset_gaussian, which seems correct. I am running it in a conda virtual environment on Windows.

@amethystwu Hi, I solved this problem by running it in a conda virtual environment on Ubuntu.

On Windows, I was able to execute the command:
python train4.py train_base configs/magicpoint_shapes_pair.yaml magicpoint_synth --eval
by changing lines 48 and 49 of utils/loader.py as follows:
workers_train = training_params.get('workers_train', 0) # 1 16
workers_val = training_params.get('workers_val', 0) # 1 16

It looks like an error caused by pickle when running on multiple workers.
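
For anyone who wants to reproduce the pickle failure outside of training, here is a minimal sketch. It assumes the cause is an object that keeps a multiprocessing pool as an attribute, which is what the `pool.py` frame in the traceback suggests. With Windows' spawn start method, the DataLoader has to pickle what it sends to each worker process, so such an object fails as soon as there is at least one worker. `DatasetWithPool` and `pickle_error` are hypothetical names made up for this illustration, not code from the repository, and `ThreadPool` is used only so the demo starts no child processes.

```python
import pickle
from multiprocessing.pool import ThreadPool


class DatasetWithPool:
    """Hypothetical stand-in for a dataset that keeps a pool as an attribute."""

    def __init__(self):
        # ThreadPool inherits Pool's refusal to be pickled,
        # but it does not start any child processes.
        self.pool = ThreadPool(1)

    def close(self):
        # Release the pool's worker thread.
        self.pool.close()
        self.pool.join()


def pickle_error(obj):
    """Return the name of the exception raised when pickling obj, or None."""
    try:
        pickle.dumps(obj)
    except Exception as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    ds = DatasetWithPool()
    print(pickle_error(ds))         # pickling the pool attribute fails
    print(pickle_error([1, 2, 3]))  # plain data pickles without error
    ds.close()
```

With the worker count set to 0, data is loaded in the main process and nothing needs to be pickled, which would explain why the change above avoids the error.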