tango-controls/pytango

Event subscription lost when destroying different DeviceProxy instance

Closed this issue · 12 comments

Hi,
I have some issues trying to understand the behavior of subscribe_event.
Here is a simple piece of code that just wraps a DeviceProxy and subscribes to the state with change_event.
If I instantiate this object twice I don't receive any events. (this re-instantiation happens all the time when rerunning a script in ipython with the %run magic).
If I explicitly call del on the object before recreating it everything works as expected and I receive events.

import tango

def callback(event):
    print(event.event, event.attr_value.value)
    
class Device():
    def __init__(self, name):
        self.proxy = tango.DeviceProxy(name)
        self.event_id = self.proxy.subscribe_event('state', tango.EventType.CHANGE_EVENT, callback)
        
    def __del__(self):
        self.proxy.unsubscribe_event(self.event_id)
        print('TangoMotor __del__')
        
        
device = Device('motor/motctrl03/1')
# need to explicitly del device otherwise I receive no events
#del device
device = Device('motor/motctrl03/1')

If I don't call del explicitly __del__ runs after the recreation of the second instance.
To me it seems like these 2 instances somehow share the underlying event subscription and if the first instance gets deleted after the second instance is created I no longer receive any events? Does that make any sense?
I appreciate any advice.
Thanks,

running python 3.8 / cpptango 9.3.4-rc6 / pytango 9.3.2 for the client

Hi @weninc

Yes, that is a little surprising. I can reproduce it running python 3.7 / cpptango 9.3.4-rc6 / pytango 9.3.2.
I'm not sure if the problem is PyTango or cppTango, or just a "feature". I'll investigate some more.

A couple of scenarios are shown below, which provide some more clues.

In [1]: import tango 
   ...:  
   ...:  
   ...: def callback(event): 
   ...:     print(f"Got event: {event.event}, {event.attr_value.value}") 
   ...:  
   ...:  
   ...: class Device(): 
   ...:     def __init__(self, name, attr): 
   ...:         print(f"Device __init__ - attr {attr}") 
   ...:         self.proxy = tango.DeviceProxy(name) 
   ...:         self.event_id = self.proxy.subscribe_event( 
   ...:             attr, tango.EventType.CHANGE_EVENT, callback) 
   ...:         print(f"  subscribed to {attr}, ID: {self.event_id}") 
   ...:          
   ...:     def __del__(self): 
   ...:         print(f"__del__ start - unsubscribe ID {self.event_id}") 
   ...:         self.proxy.unsubscribe_event(self.event_id) 
   ...:         print(f"__del__ done.") 
   ...:                                                                                                                                                 

In [2]: dp = tango.DeviceProxy("sys/tg_test/1")                                                                                                         

In [3]: d1 = Device("sys/tg_test/1", "state")                                                                                                           
Device __init__ - attr state
Got event: change, RUNNING
  subscribed to state, ID: 1

In [4]: d2 = Device("sys/tg_test/1", "state")                                                                                                           
Device __init__ - attr state
Got event: change, RUNNING
  subscribed to state, ID: 2

In [5]: dp.SwitchStates()                                                                                                                               

Got event: change, FAULT
Got event: change, FAULT
In [6]: del d1                                                                                                                                          

In [7]: import gc                                                                                                                                       

In [8]: gc.collect()                                                                                                                                    
__del__ start - unsubscribe ID 1
__del__ done.
Out[8]: 1625

In [9]: dp.SwitchStates()                                                                                                                               

In [10]: dp.SwitchStates()                                                                                                                              

In [11]: del d2                                                                                                                                         
__del__ start - unsubscribe ID 2
Exception ignored in: <function Device.__del__ at 0x7f27673c4560>
Traceback (most recent call last):
  File "<ipython-input-1-55996759fad3>", line 18, in __del__
  File "/opt/project/tango/green.py", line 195, in greener
    return executor.run(fn, args, kwargs, wait=wait, timeout=timeout)
  File "/opt/project/tango/green.py", line 109, in run
    return fn(*args, **kwargs)
  File "/opt/project/tango/device_proxy.py", line 1317, in __DeviceProxy__unsubscribe_event
    self.__unsubscribe_event(event_id)
PyTango.EventSystemFailed: (DevError(desc = 'Failed to unsubscribe event, the event id specified does not correspond with any known one', origin = 'EventConsumer::get_event_system_for_event_id()', reason = 'API_EventNotFound', severity = tango._tango.ErrSeverity.ERR),)

The error message in the final exception is from cppTango, but I don't know if the PyTango client is incorrectly unsubscribing, or if it is in cppTango.
Sometimes I need to use the garbage collector to free the object, as above.

Changing the order of deletion, gives a different result. If we delete the newer object first, then the old one keeps its subscription.

In [12]: d1 = Device("sys/tg_test/1", "state")                                                                                                          
Device __init__ - attr state
Got event: change, FAULT
  subscribed to state, ID: 3

In [13]: d2 = Device("sys/tg_test/1", "state")                                                                                                          
Device __init__ - attr state
Got event: change, FAULT
  subscribed to state, ID: 4

In [14]: dp.SwitchStates()                                                                                                                              

Got event: change, RUNNING
Got event: change, RUNNING
In [15]: del d2                                                                                                                                         
__del__ start - unsubscribe ID 4
__del__ done.

In [16]: dp.SwitchStates()                                                                                                                              

Got event: change, FAULT
In [17]: dp.SwitchStates()                                                                                                                              

Got event: change, RUNNING
In [18]: del d1                                                                                                                                         
__del__ start - unsubscribe ID 3
__del__ done.

In [19]:             

In general, using __del__ with DeviceProxy is problematic for objects with cyclic references. We don't know when, or in which thread the deletion (and unsubscription) will be executed. See the very long thread here (if you have lots of spare time):

Great thanks for looking into it.
I have seen these problems with del as well and was reading the thread you mentioned (lots of interesting details). It makes it quite tricky to properly cleanup and unsubscribe from events.
I have tried to remove the unsubscibe_event call in del but that doesn't change the behavior I see.

I have tried to remove the unsubscibe_event call in del but that doesn't change the behavior I see.

OK. The cppTango DeviceProxy destructor does unsubscribe from all events as well. There is some more info in another PyTango issue here - although that's more about issues with multiple threads. ALBA were going to try and change some things in the PyTango DeviceProxy destruction.

@bourtemb or @mliszcz Could one of you please try this as a pure C++ example?

Scenario 1:

  • Create two DeviceProxy instances (proxy1, proxy2), subscribe to same attribute
  • Expect 2 callbacks.
  • Delete proxy1.
    • In PyTango we are seeing 0 callbacks after that.
    • In C++ (if we have a PyTango bug) then you will still get callbacks from proxy2.
  • Delete proxy2.
    • In PyTango we get an exception from unsubscribe about event ID not existing.
    • In C++ (if we have a PyTango bug), then no errors.

Scenario 2:

  • Create two DeviceProxy instances (proxy1, proxy2), subscribe to same attribute
  • Expect 2 callbacks.
  • Delete proxy2.
    • In PyTango we see callbacks from proxy1.
    • In C++ expect the same.

@ajoubertza , it looks like this is due to a bug in cpptango: tango-controls/cppTango#353
Maybe we could raise the priority of this cpptango issue which seems to have been forgotten since a while?

Thanks for confirming @bourtemb. Yes, please raise the priority on that cpptango issue.

Hi @ajoubertza, sorry it took so long, but I just tested the two scenarios you mentioned above. With pure-c++ client we have exactly the same results as you described (in both cases). This needs to be fixed in cppTango. In that other ticket I proposed possible solution.

@mliszcz No problem. Thanks for investigating in detail.

@weninc This issue is being fixed in cppTango - see tango-controls/cppTango#746. Note that it will only be available from the first 9.4.x cppTango release - no backport to 9.3.x.

Can we close this ticket now?

thanks, I am following the progress of tango-controls/cppTango#746.
Can we wait to close this issue until the fix gets merged?

Yes, we can wait until it merged.

fixed in cppTango 9.4.x