
absl.flags fails with multiprocessing when using "spawn"

anthonybaxter opened this issue · 3 comments

There's something odd going on with absl.flags: it interacts very badly with multiprocessing when using 'spawn' as the start method. This is on macOS 11.4, using Homebrew's Python 3.9.6, although it also fails on the system Python 3 (3.8.2).

Given the following code (I'll also attach it),

import absl.flags
import absl.app
import multiprocessing
import time

absl.flags.DEFINE_integer("delay", 2, "sleep delay")
FLAGS = absl.flags.FLAGS

def worker(n):
    time.sleep(FLAGS.delay)  # flag access happens inside the child process


def main(argv):
    # "fork" works fine.
    multiprocessing.set_start_method("spawn")

    with multiprocessing.Pool(20) as pool:
        pool.map(worker, range(1000))


if __name__ == "__main__":
    absl.app.run(main)

it will fail (inconsistently) with

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/homebrew/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/Users/anthonybaxter/multifail.py", line 10, in worker
  File "/opt/homebrew/lib/python3.9/site-packages/absl/flags/_flagvalues.py", line 499, in __getattr__
    raise _exceptions.UnparsedFlagAccessError(error_message)
absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --delay before flags were parsed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/anthonybaxter/multifail.py", line 22, in <module>
  File "/opt/homebrew/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/opt/homebrew/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
  File "/Users/anthonybaxter/multifail.py", line 18, in main
    pool.map(worker, range(1000))
  File "/opt/homebrew/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/opt/homebrew/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --delay before flags were parsed.

yilei commented

This is expected: spawn starts a fresh Python interpreter for each child, so absl.flags is never parsed in the child processes (flags are parsed when absl.app.run is called, which only happens in the parent).
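
To see the same effect without absl, here is a minimal sketch using only the standard library. The names `STATE`, `mark_parsed`, `is_parsed`, and `check_workers` are hypothetical and only stand in for "flags have been parsed"; this is not how absl stores its state internally.

```python
import multiprocessing

# Module-level state, standing in for "flags have been parsed" (hypothetical).
STATE = {"parsed": False}


def mark_parsed():
    """Mutates module-level state in the current process only."""
    STATE["parsed"] = True


def is_parsed(_):
    """Runs in a worker and reports the state that worker sees."""
    return STATE["parsed"]


def check_workers(start_method, initializer=None):
    """Returns what each of 4 tasks observes under the given start method."""
    ctx = multiprocessing.get_context(start_method)
    with ctx.Pool(2, initializer=initializer) as pool:
        return pool.map(is_parsed, range(4))
```

Called from under an `if __name__ == "__main__":` guard after `mark_parsed()` has run in the parent, `check_workers("spawn")` should report `False` for every task, because each child re-imports the module and starts from the initial `STATE`. `check_workers("fork")`, where fork is available, should report `True`, since forked children inherit the parent's memory.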

A few ideas:

  1. Spawn the processes before calling app.run
  2. Instead of accessing the flags in child processes, pass the values to the function as arguments
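  3. Use Pool(initializer=) and have the initializer do the extra flag parsing:
    def parse_flags():
        absl.flags.FLAGS(sys.argv)  # needs `import sys`; parses flags in each child
    with multiprocessing.Pool(20, initializer=parse_flags) as pool:
        pool.map(worker, range(1000))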
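
Idea 2 could look roughly like the following. This is a sketch that reuses the `delay` flag from the report; binding the value with `functools.partial` and the two-argument `worker(delay, n)` signature are assumptions made for illustration, not something prescribed by absl.

```python
import functools
import multiprocessing
import time

import absl.app
import absl.flags

absl.flags.DEFINE_integer("delay", 2, "sleep delay")
FLAGS = absl.flags.FLAGS


def worker(delay, n):
    # `delay` arrives as a plain argument, so the child never touches FLAGS.
    time.sleep(delay)
    return n


def main(argv):
    del argv  # unused
    multiprocessing.set_start_method("spawn")
    # Read the flag in the parent, where absl.app.run has already parsed it,
    # and bind the plain integer value to the worker function.
    task = functools.partial(worker, FLAGS.delay)
    with multiprocessing.Pool(20) as pool:
        pool.map(task, range(1000))
```

Saved as a script, this would be started with the usual `if __name__ == "__main__": absl.app.run(main)` entry point. The partial object is pickled together with the integer it carries, so nothing in the child depends on parsed flag state.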

Does this help?

OK, that makes sense. Given 'spawn' is the default on macOS and Windows, is it worth a brief mention in the docs? It was somewhat unexpected.

yilei commented

Yeah, it might be worth adding a note about it somewhere. I'll keep this open.