hifi/heisenbridge

Rooms unlinked after crash

CyberShadow opened this issue · 3 comments

An OOM event caused my kernel to kill the VM running Synapse and Heisenbridge. Upon boot, Heisenbridge partially forgot one of my networks:

  • The control room (named after the network) got unlinked. I see that Heisenbridge left the room.
  • The network itself remained in the networks list.
  • After creating a new control room with the open command, I saw that Heisenbridge had forgotten the network password. Setting it again with the password command allowed it to connect.
  • All rooms on that network got unlinked, and I had to recreate them. I now have two rooms for each of my channels and queries.

I am not sure whether the problem is with Heisenbridge. If it is truly fully stateless, then I guess it cannot be, and Synapse for some reason kicked Heisenbridge out of the rooms. (But then, why for exactly one network?) If it ever writes to disk, though, this problem could be caused by a write that was not atomic, and its IO code should be reviewed.
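To illustrate what "atomic" means here, the following is a minimal sketch of the usual write-to-temp-then-rename pattern. It is not taken from Heisenbridge's code; the function name `atomic_write` is hypothetical, and the sketch assumes a filesystem where renaming within one directory is atomic.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes) so that readers see
    either the complete old contents or the complete new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename does not cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
            tmp_file.flush()
            # Ask the OS to push the data to disk before the rename.
            os.fsync(tmp_file.fileno())
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern, a crash in the middle of a write leaves at worst a stray temporary file; the original file is never left truncated. For stronger durability on POSIX systems, the containing directory can also be fsynced after the rename, which this sketch omits.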

Hope this helps.

hifi commented

All the network config is stored in user account data on the Matrix homeserver, so if any state was lost, it was most likely lost by Synapse 🤔

CyberShadow commented

Thanks. I'm guessing that's not the same user account data that I can see by clicking "Explore account data" in Element developer tools? Because I don't see anything Heisenbridge-related there...

Edit: Oh, I guess it would be on Heisenbridge's user, not my user.
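For anyone wanting to inspect that data directly, here is a rough sketch of how a request for another user's account data could be built against the Matrix client-server API (`GET /_matrix/client/v3/user/{userId}/account_data/{type}`). The helper name, homeserver URL, user ID, token, and event type below are all placeholders; the event type Heisenbridge actually uses is not stated in this thread, and the access token has to belong to the user whose data is requested.

```python
from urllib.parse import quote
from urllib.request import Request


def account_data_request(homeserver, user_id, event_type, access_token):
    """Build (but do not send) a GET request for one account data event.

    All arguments are supplied by the caller; nothing here is specific
    to Heisenbridge.
    """
    path = "/_matrix/client/v3/user/{}/account_data/{}".format(
        quote(user_id, safe=""),     # percent-encode '@' and ':' in the user ID
        quote(event_type, safe=""),
    )
    return Request(
        homeserver.rstrip("/") + path,
        headers={"Authorization": "Bearer " + access_token},
    )


# Placeholder values for illustration only.
req = account_data_request(
    "https://matrix.example.org/",
    "@heisenbridge:example.org",
    "org.example.bridge_config",  # hypothetical event type
    "PLACEHOLDER_TOKEN",
)
```

Passing `req` to `urllib.request.urlopen` would send it; a 404 response would suggest that no account data of that type is stored for the user.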

I wanted to check the Synapse logs from the time of the event, but I lost them.

Closing for now, thanks; will reopen with more details if this reoccurs.