uqfoundation/dill

Should recurse=True be the default, even though it breaks functionality?

Closed this issue · 4 comments

In general, it solves a lot of problems (avoids a NameError in looking up globals used inside a function) to have dill.settings['recurse'] = True Note that multiprocess and pathos seems to perform more robustly on Windows when dill.settings['recurse'] = True. See: uqfoundation/multiprocess#6.

The number of test cases it breaks seem to be few. See #105.

Should dill.settings['recurse']=True be promoted to the default -- even before the corner cases in #105 are addressed?

The answer is apparently "no". There are still issues in multiprocess on Windows when dill.settings['recurse'] = True.

Is it at least in enough of a state that we should have cloudpickle bring in dill and set recurse to True?

@rgbkrk: There are serialization failures on my test objects under stress in multiprocess (note the "dill-aware" version, not the std-lib version) either way… a few objects pass with recurse=True that fail with recurse=False and vice versa. There are also test failures in dill that don't fail in cloudpickle (mostly the 3rd-party objects specifically added to cloudpickle), and vice versa.

I don't see a generic "best" choice at the moment. dill with recurse=False has the least amount of failures on object coverage, but that doesn't necessarily mean the best coverage on usage patterns.

I would think that sort of monkeypatch this might go badly, but it might also work:

import dill as pickle
# pickle[`recurse`] = True
import cloudpickle as pickle

Essentially, setting dill as a base to get all of the dill objects registered,
then replacing them with any of the objects cloudpickle also handles.

The best option is probably a little sprint to get recurse=True working for the cases from recurse=False that it breaks -- and also pull the existing cloudpickle special cases into dill. Then we have unification, and in a single package that's actively maintained. I don't think it would take too much effort to do.

Note relevant issues: uqfoundation/multiprocess#65 uqfoundation/pathos#65. With multiprocessing moving from 'fork' to 'spawn' as the default context on MacOS (and related issue on Windows), a potentially easy solution is to move to recurse=True as the default. Not sure why this wasn't revisited. Will investigate further.