uqfoundation/dill

issue with unpickling weakrefs in nested objects


Hi all,

I am trying to implement classes for tree nodes that maintain a reference to their parent node. In a nutshell, whenever node A is assigned as an attribute of node B, A's parent reference is automatically set to B. I also provide subclasses of tuple, list, dict, and set that set the parent references of their members. This works nicely with the implementation below (only relevant methods shown):

from weakref import ref


class Node:
    def __init__(self):
        self.parent = None

    def __setattr__(self, name, value, /):
        if isinstance(value, Node) and value.parent:
            value.parent().remove(value)
        if hasattr(self, name) and isinstance(getattr(self, name), Node):
            getattr(self, name).parent = None
        if isinstance(value, Node):
            value.parent = ref(self)
        return object.__setattr__(self, name, value)

    def __delattr__(self, name, /):
        child = getattr(self, name)
        if isinstance(child, Node):
            child.parent = None
        object.__delattr__(self, name)

    def remove(self, child):
        for key, value in self.__dict__.items():
            if value is child:
                delattr(self, key)
                break

class SetNode(set, Node):
    def __init__(self, items = ()):
        super().__init__(items)
        Node.__init__(self)
        for item in items:
            if isinstance(item, Node):
                item.parent = ref(self)

    def add(self, value, /):
        super().add(value)
        if isinstance(value, Node):
            value.parent = ref(self)

My problem starts when I try to pickle instances of these classes using the dill library. I would expect the following code to pass, but it raises an AssertionError:

from io import BytesIO
from weakref import ref
import logging

import dill
import dill.detect

from minimp import Node, SetNode


logging.basicConfig(level=logging.DEBUG)


if __name__ == '__main__':
    state = Node()
    state.children = SetNode()  # state.children.parent is set to ref(state)
    child = Node()
    state.children.add(child)   # child.parent is set to ref(state.children)
    state.refered = ref(child)  # child.parent is unaffected when using weak references

    buf = BytesIO()
    dill.dump(state, buf)
    buf.seek(0)
    del state
    del child
    restored = dill.load(buf)

    assert restored.refered().parent() is restored.children

The assertion fails because restored.refered().parent is a dead reference, so calling it returns None.

After investigating the problem for a long time, here is what appears to happen: the Unpickler reconstructs the state instance "from the inside out". The first object to be reconstructed is therefore child.parent, a weakref to state.children. Because the target of this weakref does not exist yet, dill creates a new SetNode and assigns a weakref to it as the parent of the newly created child. However, this new SetNode is not memoized by the Unpickler, so when the Unpickler later has to reconstruct state.children, it has no record of the existing SetNode. Instead, a second instance is created and populated. Since the contents of the SetNode have already been memoized by the Unpickler, they are used unaltered, including the weakref to the temporary SetNode. Because nothing holds a strong reference to the SetNode that was created first, it is garbage collected and restored.refered().parent becomes a dead reference.
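
The final step, a weakref dying once its only target is collected, can be reproduced without dill. This is a minimal sketch, assuming CPython's reference counting. Temp and make_dangling_ref are hypothetical names that stand in for the temporary SetNode; they are not part of my code or of dill, and the sketch does not show what dill does internally:

```python
import gc
from weakref import ref


class Temp:
    """Hypothetical stand-in for the temporary SetNode."""


def make_dangling_ref():
    # Create an object whose only strong reference is the local name `temp`.
    temp = Temp()
    weak = ref(temp)
    assert weak() is temp  # the weakref is alive while `temp` exists

    # Drop the only strong reference; the weakref does not keep the object alive.
    del temp
    gc.collect()  # not needed on CPython, added for other interpreters
    return weak


dangling = make_dangling_ref()
print(dangling() is None)  # a dead weakref returns None when called
```

This matches what I see in restored.refered().parent: the reference object still exists, but calling it returns None.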

My questions:

  1. Is my interpretation of the error correct?
  2. Is this an error in dill or is it expected behaviour?
  3. How would I need to change my code to avoid this behaviour?