Prediction-RuntimeError: Some background workers are no longer alive
Saul62 opened this issue · 6 comments
Hello, I encountered an issue when validating the 701 dataset: the program throws an error after predicting 9 cases.
The problem occurs in the Python multiprocessing code used for parallel prediction. Based on the error messages there are two main issues: a RuntimeError ("Some background workers are no longer alive") in the main process, and a multiprocessing.managers.RemoteError wrapping a KeyError in the worker process. How can this be resolved?
The error message is as follows:
Predicting FLARETs_0010:
perform_everything_on_device: True
Traceback (most recent call last):
File "/root/miniconda3/envs/umamba/bin/nnUNetv2_predict", line 33, in
sys.exit(load_entry_point('nnunetv2', 'console_scripts', 'nnUNetv2_predict')())
File "/root/onethingai-tmp/U-Mamba/umamba/nnunetv2/inference/predict_from_raw_data.py", line 831, in predict_entry_point
predictor.predict_from_files(args.i, args.o, save_probabilities=args.save_probabilities,
File "/root/onethingai-tmp/U-Mamba/umamba/nnunetv2/inference/predict_from_raw_data.py", line 250, in predict_from_files
return self.predict_from_data_iterator(data_iterator, save_probabilities, num_processes_segmentation_export)
File "/root/onethingai-tmp/U-Mamba/umamba/nnunetv2/inference/predict_from_raw_data.py", line 366, in predict_from_data_iterator
proceed = not check_workers_alive_and_busy(export_pool, worker_list, r, allowed_num_queued=2)
File "/root/onethingai-tmp/U-Mamba/umamba/nnunetv2/utilities/file_path_utilities.py", line 103, in check_workers_alive_and_busy
raise RuntimeError('Some background workers are no longer alive')
RuntimeError: Some background workers are no longer alive
Process SpawnProcess-8:
Traceback (most recent call last):
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/root/onethingai-tmp/U-Mamba/umamba/nnunetv2/inference/data_iterators.py", line 57, in preprocess_fromfiles_save_to_queue
raise e
File "/root/onethingai-tmp/U-Mamba/umamba/nnunetv2/inference/data_iterators.py", line 50, in preprocess_fromfiles_save_to_queue
target_queue.put(item, timeout=0.01)
File "", line 2, in put
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/managers.py", line 833, in _callmethod
raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:
Traceback (most recent call last):
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/managers.py", line 260, in serve_client
self.id_to_local_proxy_obj[ident]
KeyError: '7ff7fdd1f460'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/managers.py", line 262, in serve_client
raise ke
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/managers.py", line 256, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '7ff7fdd1f460'
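Note that the RuntimeError raised in the main process only signals that a background worker died; the actual cause is printed by the crashed worker itself (here the RemoteError/KeyError from the manager queue, which typically shows up after a worker was killed, for example for running out of RAM). One commonly suggested mitigation is to run prediction with fewer background workers. Below is a minimal sketch using the nnUNetPredictor API that U-Mamba inherits from nnU-Net v2; the paths are placeholders and the exact constructor arguments may differ slightly between versions.

```python
# Hedged sketch: run inference with fewer background workers so that worker
# crashes (often out-of-memory kills) become less likely and the real error,
# if any, is easier to spot. Paths are placeholders; the nnUNetPredictor API
# is assumed to match the nnU-Net v2 code vendored by U-Mamba.
import torch
from nnunetv2.inference.predict_from_raw_data import nnUNetPredictor

predictor = nnUNetPredictor(
    tile_step_size=0.5,
    use_gaussian=True,
    use_mirroring=True,
    perform_everything_on_device=True,
    device=torch.device('cuda', 0),
    verbose=False,
)
predictor.initialize_from_trained_model_folder(
    '/path/to/trained_model_folder',   # placeholder for the Dataset701 results folder
    use_folds=(0,),
    checkpoint_name='checkpoint_final.pth',
)
predictor.predict_from_files(
    '/path/to/imagesTs',               # placeholder input folder
    '/path/to/output_folder',          # placeholder output folder
    save_probabilities=False,
    overwrite=True,
    num_processes_preprocessing=1,          # fewer preprocessing workers
    num_processes_segmentation_export=1,    # same parameter that appears in the traceback
)
```

If you prefer the command line, the corresponding nnUNetv2_predict options for the two worker counts should be -npp and -nps, assuming this fork keeps the upstream flags.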
I have the same issue when validating the 701 dataset.
Predicting FLARETs_0004:
perform_everything_on_device: True
Traceback (most recent call last):
File "/root/miniconda3/envs/umamba/bin/nnUNetv2_predict", line 33, in <module>
sys.exit(load_entry_point('nnunetv2', 'console_scripts', 'nnUNetv2_predict')())
File "/data_local/commit/U-Mamba/umamba/nnunetv2/inference/predict_from_raw_data.py", line 834, in predict_entry_point
predictor.predict_from_files(args.i, args.o, save_probabilities=args.save_probabilities,
File "/data_local/commit/U-Mamba/umamba/nnunetv2/inference/predict_from_raw_data.py", line 251, in predict_from_files
return self.predict_from_data_iterator(data_iterator, save_probabilities, num_processes_segmentation_export)
File "/data_local/commit/U-Mamba/umamba/nnunetv2/inference/predict_from_raw_data.py", line 367, in predict_from_data_iterator
proceed = not check_workers_alive_and_busy(export_pool, worker_list, r, allowed_num_queued=2)
File "/data_local/commit/U-Mamba/umamba/nnunetv2/utilities/file_path_utilities.py", line 103, in check_workers_alive_and_busy
raise RuntimeError('Some background workers are no longer alive')
RuntimeError: Some background workers are no longer alive
output/nnunet_predict_701/FLARETs_0006
torch.Size([1, 210, 380, 380])
Process SpawnProcess-4:
Traceback (most recent call last):
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/data_local/commit/U-Mamba/umamba/nnunetv2/inference/data_iterators.py", line 61, in preprocess_fromfiles_save_to_queue
raise e
File "/data_local/commit/U-Mamba/umamba/nnunetv2/inference/data_iterators.py", line 50, in preprocess_fromfiles_save_to_queue
target_queue.put(item, timeout=0.01)
File "<string>", line 2, in put
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/managers.py", line 833, in _callmethod
raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/managers.py", line 260, in serve_client
self.id_to_local_proxy_obj[ident]
KeyError: '7feade491030'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/managers.py", line 262, in serve_client
raise ke
File "/root/miniconda3/envs/umamba/lib/python3.10/multiprocessing/managers.py", line 256, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '7feade491030'
---------------------------------------------------------------------------
I have a similar issue:
Traceback (most recent call last):
File "/mnt/zqk/.conda/envs/umamba/bin/nnUNetv2_train", line 33, in
sys.exit(load_entry_point('nnunetv2', 'console_scripts', 'nnUNetv2_train')())
File "/mnt/zqk/BTCV/model_test/SwinUNetR/U-Mamba-main/umamba/nnunetv2/run/run_training.py", line 268, in run_training_entry
run_training(args.dataset_name_or_id, args.configuration, args.fold, args.tr, args.p, args.pretrained_weights,
File "/mnt/zqk/BTCV/model_test/SwinUNetR/U-Mamba-main/umamba/nnunetv2/run/run_training.py", line 204, in run_training
nnunet_trainer.run_training()
File "/mnt/zqk/BTCV/model_test/SwinUNetR/U-Mamba-main/umamba/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 1258, in run_training
train_outputs.append(self.train_step(next(self.dataloader_train)))
File "/mnt/zqk/BTCV/model_test/SwinUNetR/U-Mamba-main/umamba/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 900, in train_step
output = self.network(data)
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/zqk/BTCV/model_test/SwinUNetR/U-Mamba-main/umamba/nnunetv2/nets/UMambaEnc.py", line 352, in forward
skips = self.encoder(x)
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/zqk/BTCV/model_test/SwinUNetR/U-Mamba-main/umamba/nnunetv2/nets/UMambaEnc.py", line 163, in forward
x = self.mamba_layers[s](x)
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/mnt/zqk/BTCV/model_test/SwinUNetR/U-Mamba-main/umamba/nnunetv2/nets/UMambaEnc.py", line 46, in forward
x_mamba = self.mamba(x_norm)
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/mamba_ssm/modules/mamba_simple.py", line 146, in forward
out = mamba_inner_fn(
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/mamba_ssm/ops/selective_scan_interface.py", line 317, in mamba_inner_fn
return MambaInnerFn.apply(xz, conv1d_weight, conv1d_bias, x_proj_weight, delta_proj_weight,
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/torch/cuda/amp/autocast_mode.py", line 98, in decorate_fwd
return fwd(*args, **kwargs)
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/mamba_ssm/ops/selective_scan_interface.py", line 187, in forward
conv1d_out = causal_conv1d_cuda.causal_conv1d_fwd(
TypeError: causal_conv1d_fwd(): incompatible function arguments. The following argument types are supported:
1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: Optional[torch.Tensor], arg3: Optional[torch.Tensor], arg4: bool) -> torch.Tensor
Invoked with: tensor([[[-0.0491,  0.1108,  0.1163,  ..., -0.1036, -0.1036,  0.1690],
          ...,
          [ 0.2172, -0.4126, -0.3173,  ..., -0.5288, -0.5288, -0.1556]]],
       device='cuda:0', requires_grad=True), tensor([[ 0.2341, -0.4789, -0.0208,  0.3874],
        ...,
        [ 0.4854, -0.3094,  0.3535,  0.0131]], device='cuda:0',
       requires_grad=True), Parameter containing:
tensor([ 0.0747, -0.4136, -0.1951,  ..., -0.3494, -0.2806],
       device='cuda:0', requires_grad=True), None, None, None, True
Exception in thread Thread-4 (results_loop):
Traceback (most recent call last):
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 125, in results_loop
raise e
File "/mnt/zqk/.conda/envs/umamba/lib/python3.10/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 103, in results_loop
raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
I have the same problem. I wonder whether it is a lack of computing power on my machine or an error in causal_conv1d.
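The TypeError above means the compiled causal_conv1d_fwd binding was built for a different interface than the one mamba_ssm calls: the installed extension accepts five arguments, while selective_scan_interface.py passes seven. In other words, the installed causal-conv1d and mamba-ssm versions are out of sync. A minimal check, assuming both packages are installed under their PyPI names:

```python
# Minimal sketch: report the installed versions and the signature the compiled
# causal_conv1d binding was built with. If the docstring shows five arguments
# while mamba_ssm passes seven (as in the traceback above), the two packages
# need to be reinstalled at matching versions.
from importlib import metadata

print("mamba-ssm:     ", metadata.version("mamba-ssm"))
print("causal-conv1d: ", metadata.version("causal-conv1d"))

import causal_conv1d_cuda  # the compiled extension used in the failing call
print(causal_conv1d_cuda.causal_conv1d_fwd.__doc__)  # pybind11 stores the built signature here
```

If the printed signature really does take five arguments, reinstalling causal-conv1d and mamba-ssm at the versions listed in the U-Mamba installation instructions should make the call compatible again.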
I have the same issue:
2024-03-14 22:47:59.337693: unpacking dataset...
2024-03-14 22:47:59.667426: unpacking done...
2024-03-14 22:47:59.667924: do_dummy_2d_data_aug: False
2024-03-14 22:47:59.680668: Unable to plot network architecture:
2024-03-14 22:47:59.680761: No module named 'hiddenlayer'
2024-03-14 22:47:59.687895:
2024-03-14 22:47:59.687997: Epoch 0
2024-03-14 22:47:59.688121: Current learning rate: 0.01
using pin_memory on device 0
Traceback (most recent call last):
File "/usr/local/bin/nnUNetv2_train", line 33, in
sys.exit(load_entry_point('nnunetv2', 'console_scripts', 'nnUNetv2_train')())
File "/content/U-Mamba/umamba/nnunetv2/run/run_training.py", line 268, in run_training_entry
run_training(args.dataset_name_or_id, args.configuration, args.fold, args.tr, args.p, args.pretrained_weights,
File "/content/U-Mamba/umamba/nnunetv2/run/run_training.py", line 204, in run_training
nnunet_trainer.run_training()
File "/content/U-Mamba/umamba/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 1258, in run_training
train_outputs.append(self.train_step(next(self.dataloader_train)))
File "/content/U-Mamba/umamba/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 900, in train_step
output = self.network(data)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/content/U-Mamba/umamba/nnunetv2/nets/UMambaBot.py", line 207, in forward
out = self.mamba(middle_feature_flat)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/mamba_ssm/modules/mamba_simple.py", line 146, in forward
out = mamba_inner_fn(
File "/usr/local/lib/python3.10/dist-packages/mamba_ssm/ops/selective_scan_interface.py", line 317, in mamba_inner_fn
return MambaInnerFn.apply(xz, conv1d_weight, conv1d_bias, x_proj_weight, delta_proj_weight,
File "/usr/local/lib/python3.10/dist-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/amp/autocast_mode.py", line 98, in decorate_fwd
return fwd(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/mamba_ssm/ops/selective_scan_interface.py", line 187, in forward
conv1d_out = causal_conv1d_cuda.causal_conv1d_fwd(
TypeError: causal_conv1d_fwd(): incompatible function arguments. The following argument types are supported:
1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: Optional[torch.Tensor], arg3: Optional[torch.Tensor], arg4: bool) -> torch.Tensor
Invoked with: tensor([[[-0.6382, -0.4265, -0.4575,  ...,  0.3599,  0.4956, -0.2935],
          ...,
          [-0.5703,  1.1455,  0.3037,  ..., -0.2773,  0.3811, -0.2539]]],
       device='cuda:0', dtype=torch.float16, requires_grad=True), tensor([[-0.1942,  0.0138,  0.4839, -0.3377],
        ...,
        [ 0.4277,  0.3805,  0.3027,  0.4618]], device='cuda:0',
       requires_grad=True), Parameter containing:
tensor([ 0.1733, -0.3476,  0.3965,  ..., -0.3649, -0.4902, -0.2539],
       device='cuda:0', requires_grad=True), None, None, None, True
Exception in thread Thread-4 (results_loop):
Traceback (most recent call last):
File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.10/dist-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 125, in results_loop
raise e
File "/usr/local/lib/python3.10/dist-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 103, in results_loop
raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
I have the same problem. Is there a solution?
Hi all,
Please try the new code; we re-implemented the sampling function to improve efficiency.
We have also released the corresponding segmentation results:
https://drive.google.com/file/d/1qlzTym3YdyCt3eR8J90h636it4cCDCt8/view?usp=sharing
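For completeness, a minimal sketch of pulling the updated code and reinstalling the package in editable mode; the local clone path and the umamba/ subfolder containing the setup files are assumptions based on the installation instructions:

```python
# Hedged sketch: refresh a local U-Mamba checkout and reinstall it so that the
# re-implemented sampling function is picked up. The repo path is a placeholder
# and the umamba/ subfolder layout is assumed from the install instructions.
import subprocess

repo = "/path/to/U-Mamba"  # placeholder: wherever the repository was cloned
subprocess.run(["git", "pull"], cwd=repo, check=True)
subprocess.run(["pip", "install", "-e", "."], cwd=f"{repo}/umamba", check=True)
```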