pytransitions/transitions

The Event and Machine have a cyclic dependency.

yw94 opened this issue · 3 comments

yw94 commented


Describe the bug
A cyclic dependency exists between the machine and its events, causing a memory overflow.

Minimal working example

from transitions import Machine


class Foo:

    def __init__(self):
        self.machine = Machine(
            model=self,
            states=["a", "b"],
            transitions=[{"trigger": "a_to_b", "source": "a", "dest": "b"}],
            initial="a"
        )

Expected behavior
Debugging shows that machine.events contains the event instances and that each event instance has a machine attribute pointing back to the machine, i.e. a cyclic dependency.
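The cycle can be seen directly; a minimal sketch (constructing a Machine without a separate model keeps it short, and the events and machine attributes are the ones described above):

from transitions import Machine

machine = Machine(
    states=["a", "b"],
    transitions=[{"trigger": "a_to_b", "source": "a", "dest": "b"}],
    initial="a"
)
event = machine.events["a_to_b"]   # Machine stores its Event objects keyed by trigger name
print(event.machine is machine)    # True: the Event holds a reference back to the Machine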


Hello @yw94,

it is correct that Machine keeps a collection of events and that each Event holds a reference back to the machine itself. The machine reference is later passed to the model, wrapped in an EventData object. Since Event and EventData only hold references and never create machines themselves, I don't see how this would lead to a memory overflow. As far as I know, Python's garbage collector is smart enough to recognise the reference cycle and collect machines and events once they are no longer reachable. I just ran a check and created roughly 25 million instances:

from transitions import Machine
import psutil
process = psutil.Process()


class Foo:

    def __init__(self):
        self.machine = Machine(
            model=self,
            states=["a", "b"],
            transitions=[{"trigger": "a_to_b", "source": "a", "dest": "b"}],
            initial="a"
        )


counter = 0

with open("memory.csv", "w") as f:
    while True:
        counter += 1
        foo = Foo()
        if counter % 50000 == 0:
            print(f"{counter},{process.memory_info().rss}", file=f)
            f.flush()

and the memory usage (in bytes) stayed roughly the same:

[Screenshot, 2023-11-06: memory usage (RSS) stays roughly flat as the number of created instances grows]
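That the cycle is collectable can also be checked without measuring RSS, for instance with a weak reference. A minimal sketch, reusing the Foo class from the snippet above and assuming nothing else holds a strong reference to the machine:

import gc
import weakref

foo = Foo()
machine_ref = weakref.ref(foo.machine)  # a weak reference does not keep the machine alive
del foo                                 # drop the only external strong reference
gc.collect()                            # run the cyclic garbage collector explicitly
print(machine_ref() is None)            # True: the Machine/Event cycle has been collected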

When I prevent garbage collection like this:

# ...
counter = 0
events = []
with open("memory.csv", "w") as f:
    while True:
        counter += 1
        foo = Foo()
        events.append(foo.machine.events["a_to_b"])  # <-- keep a strong reference to an event and thus to the machine
        if counter % 50000 == 0:
            print(f"{counter},{process.memory_info().rss}", file=f)
            f.flush()
# ...

the memory consumption increases rather fast:

[Screenshot, 2023-11-06: memory usage (RSS) grows steadily with the number of created instances]

At 1.4 million instances, the process already consumes almost 12 billion bytes (12 GB).
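If code like the loop above really needs to remember events, storing only weak references (or just the trigger names) avoids pinning whole machines in memory. A minimal sketch of a weak-reference variant, assuming Event instances accept weak references (they are plain Python objects); the rest of the script is unchanged:

import weakref

# ...
counter = 0
events = []
with open("memory.csv", "w") as f:
    while True:
        counter += 1
        foo = Foo()
        # a weakref does not keep the Event (and thus the Machine) alive
        events.append(weakref.ref(foo.machine.events["a_to_b"]))
        if counter % 50000 == 0:
            print(f"{counter},{process.memory_info().rss}", file=f)
            f.flush()
# ...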

@yw94: I will close this issue since I cannot verify a memory leak caused by transitions with the information I currently have. Feel free to comment anyway if your problem persists.