Vedo and torch.multiprocessing
kirilllzaitsev commented
Hi, using DDP in torch leads to the following error when calling plotter.show:
Traceback (most recent call last):
  File "/home/user/train.py", line 627, in <module>
    mp.spawn(
  File "/home/user/venv/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 239, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/user/venv/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 197, in start_processes
    while not context.join():
  File "/home/user/venv/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 140, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with signal SIGSEGV
/usr/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 2 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
Here is some sample code:
import vedo

# pts (a torch tensor), legends and idx are defined elsewhere in the training script
plotter = vedo.Plotter(
    N=1,
    offscreen=False,
)
pointcloud = vedo.Points(pts.detach().cpu().numpy())
plotter.show(
    pointcloud,
    legends[idx],
)
My vedo version is vedo==2023.4.6.
The plotter is visible only on the main process, and the data in use has been cloned/detached to prevent any shared access. Is this a known problem between vedo and torch.multiprocessing?
marcomusy commented
To be honest, I have no experience with torch.multiprocessing. I can only suggest upgrading vedo to the latest version, but I'm not sure that will cure the problem.
If I'm not mistaken, the upstream VTK needs to start the interactive window in the main thread.
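
One way to act on the main-thread remark, untested against the crash reported here, is to keep every vedo call in the parent process and have the spawned workers send back only plain point data through a queue. Below is a minimal sketch of that pattern using the standard multiprocessing module. The names worker, consume, render_with_vedo and main are hypothetical, the random points stand in for pts.detach().cpu().numpy(), and the vedo calls mirror the ones in the sample code above.

```python
import multiprocessing as mp
import random

_DONE = None  # sentinel a worker puts on the queue when it has no more data


def worker(rank, out_queue, n_batches=2, n_points=100):
    """Hypothetical stand-in for the per-rank training function."""
    rng = random.Random(rank)
    for step in range(n_batches):
        # In real code this would be pts.detach().cpu().numpy()
        pts = [[rng.random(), rng.random(), rng.random()] for _ in range(n_points)]
        out_queue.put((rank, step, pts))
    out_queue.put(_DONE)


def render_with_vedo(rank, step, pts):
    """Assumed rendering callback; meant to run only in the parent process."""
    import vedo  # imported lazily so that workers never load VTK

    plotter = vedo.Plotter(N=1, offscreen=False)
    plotter.show(vedo.Points(pts), f"rank {rank}, step {step}")
    plotter.close()


def consume(in_queue, n_workers, render):
    """Drain the queue in the calling process and render each batch there."""
    finished = 0
    shown = 0
    while finished < n_workers:
        item = in_queue.get()
        if item is _DONE:
            finished += 1
            continue
        render(*item)
        shown += 1
    return shown


def main(n_workers=2):
    """Spawn the workers, then plot in the parent's main thread."""
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(rank, q)) for rank in range(n_workers)]
    for p in procs:
        p.start()
    consume(q, n_workers, render_with_vedo)
    for p in procs:
        p.join()
```

Because the "spawn" start method re-imports the script in each child, main() should be called under an `if __name__ == "__main__":` guard. With torch, the same queue could presumably be passed to the worker function through the args of mp.spawn, but that is an assumption and not something verified in this thread.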