Python 3.8 support
emlys opened this issue · 20 comments
Observed with Python 3.8.0 but not 3.7.9: a minimal taskgraph results in a repeating RuntimeError. This is probably related to a change in the 3.8.0 beta 1 release: "bpo-33725: On macOS, the multiprocessing module now uses spawn start method by default."
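(For what it's worth, the new default is easy to confirm from the standard library; this check is just for illustration and isn't part of the reproduction:)

import multiprocessing

# Prints 'spawn' on macOS under Python 3.8+, 'fork' under 3.7.
print(multiprocessing.get_start_method())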
Minimal example:
import taskgraph
def func():
    return 1
graph = taskgraph.TaskGraph('/Users/emily/Documents/delineateit_workspace', 2)
task = graph.add_task(func, task_name='test')
graph.join()
Traceback (repeats forever):
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
Traceback (most recent call last):
File "", line 1, in
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/runpy.py", line 262, in run_path
return _run_module_code(code, init_globals, run_name,
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/runpy.py", line 95, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/emily/invest/src/natcap/invest/reproduce_bug.py", line 7, in
graph = taskgraph.TaskGraph('/Users/emily/Documents/delineateit_workspace', 2)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/site-packages/taskgraph/Task.py", line 360, in init
self._worker_pool = NonDaemonicPool(
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/site-packages/taskgraph/Task.py", line 79, in init
super(NonDaemonicPool, self).init(*args, **kwargs)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/pool.py", line 212, in init
self._repopulate_pool()
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/pool.py", line 303, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static
w.start()
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
return Popen(process_obj)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in init
super().init(process_obj)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/popen_fork.py", line 19, in init
self._launch(process_obj)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
prep_data = spawn.get_preparation_data(process_obj._name)
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "/opt/miniconda3/envs/bug3.8-3/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
Huh, interesting: I remember adding this a while back with spawn becoming the default in mind:

class NoDaemonContext(type(multiprocessing.get_context('spawn'))):
    """From https://stackoverflow.com/a/8963618/42897.

    "As the current implementation of multiprocessing [3.7+] has been
    extensively refactored to be based on contexts, we need to provide a
    NoDaemonContext class that has our NoDaemonProcess as attribute.
    [NonDaemonicPool] will then use that context instead of the default
    one." "spawn" is chosen as default since that is the default and only
    context option for Windows and is the default option for Mac OS as
    well since 3.8.
    """
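For reference, the complete pattern from that StackOverflow answer looks roughly like this; a sketch of the mechanism, not necessarily line-for-line what taskgraph's Task.py contains:

import multiprocessing
import multiprocessing.pool


class NoDaemonProcess(multiprocessing.Process):
    # Pool workers are daemonic by default, and daemonic processes
    # can't have children; overriding 'daemon' lifts that restriction.
    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass


class NoDaemonContext(type(multiprocessing.get_context('spawn'))):
    Process = NoDaemonProcess


class NonDaemonicPool(multiprocessing.pool.Pool):
    # A Pool whose workers may themselves start subprocesses.
    def __init__(self, *args, **kwargs):
        kwargs['context'] = NoDaemonContext()
        super(NonDaemonicPool, self).__init__(*args, **kwargs)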
Sure, happy to look into this after I've finished the latest couple of things that have come up with coastal blue carbon.
I was able to reproduce this issue on Python 3.8 with taskgraph 0.10.2, and it appears to have been fixed as of taskgraph 0.10.3. @emlys could you confirm that this works for you with any version of python 3.8 and taskgraph 0.10.3?
Since this issue sounds related to natcap/invest#420, I tried running a recent dev build of InVEST and can confirm that DelineateIt runs successfully on n_workers>0. I'll put in a patch over there to bump up the taskgraph requirement.
@phargogh I'm still having the same issue with taskgraph 0.10.3, python 3.8.0 and 3.8.7. Is it expected that we always have to call multiprocessing.freeze_support() when using taskgraph, as we do in the invest entrypoint?
@emlys multiprocessing.freeze_support() should only be needed if we're using python's multiprocessing within a frozen binary environment such as PyInstaller on Windows. It should have no effect outside of a frozen environment and on any other OS.
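For reference, the idiom looks like this; it's a no-op outside a frozen executable, so including it is harmless:

import multiprocessing

if __name__ == '__main__':
    # Only does anything when running as a frozen executable
    # (e.g. built with PyInstaller on Windows); otherwise a no-op.
    multiprocessing.freeze_support()
    # ... rest of the program, e.g. TaskGraph setup, goes here ...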
I'm still having the same issue with taskgraph 0.10.3, python 3.8.0 and 3.8.7.
@emlys out of curiosity, is your python coming from conda-forge or somewhere else?
The minimally-reproducible script I'm using to test this is:
cd taskgraph
git checkout 0.10.3
conda create -c conda-forge -p ./env-387 python=3.8.7 retrying psutil rstcheck
conda activate ./env-387
python setup.py install
# Then I run the test script in the original post, with success
@phargogh I followed those steps almost exactly (I had to use python 3.8.6; conda reported that python==3.8.7 isn't available on conda-forge) and I'm still getting the error.
Just summarizing what I've found so far:
- In the past, you could avoid this issue by using fork. fork works by copying the parent process to make the subprocess; spawn works by importing everything that was imported into the parent process, into the subprocess. Because of that, you get this recursive error unless you have the if __name__ == '__main__' guard.
- The fork start method is now considered unsafe on macOS and isn't available on Windows anyway.
- We should explicitly set the start method to spawn since it's the only one available on both unix and windows, and it's not the default on non-macOS unix (see the sketch after this list).
  - spawn has extra requirements to be used safely.
- I had thought the error message was saying you need to call freeze_support(), but actually the if __name__ == '__main__' guard is the important part.
  - This is clearly the solution for uses within our code.
  - It is not clear if this is a good solution for taskgraph as a package.
    - do we need to tell users that any code that uses taskgraph must be wrapped in an if __name__ == '__main__'?
    - maybe it doesn't matter because we are the primary (only?) users of taskgraph.
- I still don't know why James doesn't get this error with taskgraph 0.10.3.
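A sketch of both of those points together, using only the standard library: explicitly requesting the spawn context, and keeping all process creation behind the guard:

import multiprocessing


def func(x):
    return x * x


if __name__ == '__main__':
    # Request 'spawn' explicitly so behavior matches across platforms,
    # rather than relying on the per-OS default.
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool(2) as pool:
        print(pool.map(func, range(4)))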
In a conda environment with:
- python 3.8.6
- taskgraph 0.10.2
This script fails with the same _check_not_importing_main RuntimeError as described in the post:
import logging
logging.basicConfig(level=logging.DEBUG)
import taskgraph
def func():
    return 1
graph = taskgraph.TaskGraph('./test-workspace', 2)
task = graph.add_task(func, task_name='test')
graph.join()
Same python installation, same version of taskgraph, this works great:
import logging
logging.basicConfig(level=logging.DEBUG)
import taskgraph
def func():
    return 1

if __name__ == '__main__':
    graph = taskgraph.TaskGraph('./test-workspace', 2)
    task = graph.add_task(func, task_name='test')
    graph.join()
Taking a closer look at taskgraph's multiprocessing, it looks like spawn is already used on all operating systems. So it does seem strange that the version without the if __name__ == '__main__' guard is working at all on my mac.
maybe it doesn't matter because we are the primary (only?) users of taskgraph.
I agree with this for two reasons:
- We are indeed (probably) the only users of taskgraph
- If someone DOES get this error, the python devs have very kindly put the solution in the error text.
I'll keep poking at this a bit more to see if anything else comes up.
Would this also affect someone using the natcap.invest package? That could be slightly more of a problem
Ah! If I make sure to remove ./test-workspace before running the script, I see the exception on 0.10.3. If ./test-workspace exists, I don't see the exception.
EDIT: nope, now it's always failing.
So now it seems like the changes in 0.10.3 did fix natcap/invest#420, but it's likely unrelated to this error. (Which makes sense because the invest entrypoint does use if __name__ == '__main__', so this shouldn't come up.)
Yep, I agree. The hanging behavior seen there is unrelated to this error.
Would this also affect someone using the natcap.invest package? That could be slightly more of a problem
Yep! This issue will indeed affect someone using InVEST so long as they're using n_workers > 0 (which is thankfully not default behavior). If they want to use n_workers > 0, they'll need to do the if __name__ == '__main__' dance. But it is also a reasonable expectation that someone wanting to use multiprocessing will be able to understand and debug their system when things go wrong, and short of not using spawn there's nothing we can do about this anyways.
But it is also a reasonable expectation that someone wanting to use multiprocessing will be able to understand and debug their system when things go wrong, and short of not using spawn there's nothing we can do about this anyways.
I feel like this deserves a note in the API docs, just confirming that this is what we expect to happen. Also maybe in the taskgraph readme? The example code there should also get updated.
I'll add a note to the taskgraph README and submit a PR to you. Since natcap/invest#467 is still pending, I'll add a note to the API docs there.
Thanks! I had started on updating the taskgraph README example code but I ran into a new issue... #60
Sorry I missed this entire thing @phargogh and @emlys, is the "fix" here to guard your code with an if __name__ == '__main__'? I asked about this years ago on stackoverflow and got a :tumbleweed: https://stackoverflow.com/questions/50242773/can-i-guard-against-python-multiprocessing-bomb
Otherwise, someone using multiprocessing really needs to do an if __name__ == '__main__' or they're gonna hurt.
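For anyone finding this later, the "multiprocessing bomb" in question is roughly this shape; under spawn, each child re-imports the main module, hits the Pool creation again, and the _check_not_importing_main check seen in the traceback above raises the RuntimeError instead of letting processes multiply:

import multiprocessing


def work(x):
    return x * x


# BUG (deliberate): no __name__ == '__main__' guard. Under spawn,
# every child process re-executes this module top to bottom and
# tries to create its own Pool, triggering the RuntimeError.
pool = multiprocessing.Pool(2)
print(pool.map(work, range(4)))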
Yep! This ultimately comes down to documenting the if __name__ == '__main__' in the example.