Unity-Technologies/ROS-TCP-Connector

ROS msgs not received in a topic if publisher starts after subscriber

felipebelocreatecrobotics opened this issue · 4 comments

Main bug description
If I subscribe to a topic before a publisher was registered at that topic, no ROS msgs are received by the subscriber callback once the publisher is registered.

To Reproduce
Steps to reproduce the behavior:

  1. Run 'rostopic echo SOME_TOPIC'
  2. Run an application (application A) with a publisher registered to a certain topic. E.g.:
using System.Collections;
using System.Collections.Generic;
using RosMessageTypes.Std;
using Unity.Robotics.ROSTCPConnector;
using UnityEngine;

public class SimpleSender : MonoBehaviour
{
    ROSConnection ros;

    public ROSConnection Ros
    {
        get => ros;
        set => ros = value;
    }

    public string value;
    
    public string Value
    {
        get => value;
        set => this.value = value;
    }

    public string topicName;
    public string TopicName
    {
        get => topicName;
        set => topicName = value;
    }
    
    public void Start()
    {
        // start the ROS connection
        Ros = ROSConnection.GetOrCreateInstance();
        Ros.RegisterPublisher<StringMsg>(TopicName);
    }

    public void Send(string value)
    {
        this.value = value;
        var message = ConvertToRosMessage(value);
        Ros.Publish(TopicName, message);
        Debug.Log("Value sent");
    }

    public StringMsg ConvertToRosMessage(string value)
    {
        var message = new StringMsg();
        message.data = value;
        return message;
    }

}
  3. Run another application (application B) with a subscriber registered to the same topic. E.g.:
using System.Collections;
using System.Collections.Generic;
using RosMessageTypes.Std;
using Unity.Robotics.ROSTCPConnector;
using UnityEngine;
using UnityEngine.Events;

[System.Serializable]
public class SimpleReceiverEvent : UnityEvent<string>
{
}

public class SimpleReceiver : MonoBehaviour
{
    ROSConnection ros;
    public SimpleReceiverEvent OnChange;

    public ROSConnection Ros
    {
        get => ros;
        set => ros = value;
    }
    
    public string topicName;
    public string TopicName
    {
        get => topicName;
        set => topicName = value;
    }
    
    private void Start()
    {
        Ros = ROSConnection.GetOrCreateInstance();
        Ros.Subscribe<StringMsg>(TopicName, MessageChange);
    }
    
    void MessageChange(StringMsg message)
    {
        Debug.Log("ROS msg received");
        var value = ConvertFromRosMessage(message);
        Receive(value);
    }
    public void Receive(string value)
    {
        OnChange.Invoke(value);
    }
    public string ConvertFromRosMessage(StringMsg message)
    {
        var value = message.data;
        return value;
    }
}
  4. When the method 'Send' from 'SimpleSender' in 'application A' is invoked, the message can be seen using 'rostopic echo' and is also properly received in application B (e.g., 'Debug.Log' is properly invoked).
  5. Now close the applications, leave 'rostopic echo' running, and run the applications in the opposite order: first run application B, then run application A.
  6. When the method 'Send' from 'SimpleSender' in 'application A' is invoked, application B does not receive the message, but 'rostopic echo' does, which shows that it was properly published.

After several tests we concluded that, in TCP Connector, if a publisher isn't registered before a subscriber then the subscriber does not receive the messages from the publisher.

Expected behavior
We expect publishers and subscribers registered to the same topic to communicate between themselves independently of the order in which they were created.

Instead, if I subscribe to a topic before a publisher was registered at that topic, no ROS msgs are received by the subscriber callback once the publisher is registered.

For the sake of simplicity let's call A the machine that is publishing and B the machine that is subscribing to the topic.

If I run 'rostopic echo NAME_OF_TOPIC', I can see that the messages are indeed being published by A. However, the 'subscriber' in B does not seem to be able to receive them if B is run before A.

I came up with a couple of "hacks" to handle the issue. If I set a timeout in B while listening for msgs on a subscriber and, when it expires, either disconnect from and reconnect to ROSConnection or unsubscribe from and resubscribe to the topic, then the topic is properly subscribed (in B) and receives messages (from A).

However, I've verified that these solutions have the side effect that publishers in B lose "connection" to their related subscribers on other machines (e.g., in A). I also verified that unsubscribing from and resubscribing to the topic takes a long time (on the order of seconds), which means that messages get lost during that time, and B may not receive a message from A in time, failing the timeout check again.
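
To make the unsubscribe/resubscribe hack concrete, here is a minimal sketch of what it could look like. The class name 'ResubscribingReceiver' and the 'timeoutSeconds' field are hypothetical names chosen for illustration, and the sketch assumes that 'ROSConnection.Unsubscribe(string)' is available in the Connector version in use and that subscriber callbacks run on Unity's main thread. It is not a fix, and it has the drawbacks described above.

```csharp
using RosMessageTypes.Std;
using Unity.Robotics.ROSTCPConnector;
using UnityEngine;

// Hypothetical helper, not part of ROS-TCP-Connector: drops and re-creates
// the subscription whenever no message has arrived within a timeout.
public class ResubscribingReceiver : MonoBehaviour
{
    public string topicName;
    public float timeoutSeconds = 5f; // assumed value; tune for your setup

    ROSConnection ros;
    float lastMessageTime;

    void Start()
    {
        ros = ROSConnection.GetOrCreateInstance();
        ros.Subscribe<StringMsg>(topicName, OnMessage);
        lastMessageTime = Time.time;
    }

    void Update()
    {
        // Still inside the timeout window: nothing to do.
        if (Time.time - lastMessageTime < timeoutSeconds)
            return;

        // No message within the timeout: unsubscribe and resubscribe.
        // Assumes ROSConnection.Unsubscribe(string) exists in this version.
        ros.Unsubscribe(topicName);
        ros.Subscribe<StringMsg>(topicName, OnMessage);
        lastMessageTime = Time.time; // restart the timeout window
    }

    void OnMessage(StringMsg message)
    {
        lastMessageTime = Time.time;
        Debug.Log("ROS msg received: " + message.data);
    }
}
```

Note that this sketch cannot tell an idle publisher from a broken subscription, so it also resubscribes when A simply has nothing to send, and messages published during the resubscribe window are lost.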

For completeness, I'm using the following to register a publisher:

Environment:

  • Unity Version: 2021
  • Unity machine OS + version: Ubuntu 18.04
  • ROS machine OS + version: same as Unity machine
  • ROS–Unity communication: ROS and ROS-TCP-Endpoint running natively on the machine
  • Branch or version: TCP Connector and Endpoint version 0.6.0
at669 commented

Hi, thanks for the thorough report! I've created a ticket for internal tracking.

I investigated this scenario on my end, and I haven't yet been able to reproduce the issue. I have set up a simple project with publisher and subscriber MonoBehaviours, UI buttons to register each, alongside a button for publishing a simple counter.

248.mp4

0:00-0:26: This should be the "A first, B second" scenario--the publisher is registered, then the subscriber, then publishing begins. Note the Console window in Unity printing the registrations, publishing and receives.

0:35-1:03: This should be the "B first, A second" scenario--the subscriber is registered, then the publisher, then publishing begins.

The rest of the video just ensures that a message can't be published before the publisher is registered, to round out all three buttons trying to be run first. I've attached the simple project here--can you go ahead and test this on your end to see if the problem persists?

connector-248.zip

[Ticket#: AIRO-1655]

If I understood correctly, you have prepared it all in one application (one scene) only, that even shares the ROS Connection.
I also started by doing the same, and in that case the order in which publishers and subscribers are registered does not matter.
The issue is when there are multiple ROS nodes running, which is the typical ROS use case. If you have 2 or more applications running then one can only subscribe to already published topics.
The "minimum" way of reproducing the error is to create two separate applications. One with a publisher (that can be triggered by a button for example) and one with a subscriber. If you run the subscriber application first and the publisher application after, then the subscriber application will not receive the ROS message.
You should be able to reproduce this scenario quickly using the code I provided.
Otherwise I can prepare 2 projects, a publisher and a subscriber, and send them here.

at669, to prove the point that the issue can be reproduced with 2 ROS nodes, I've built your application and run both the built application and another instance in the Unity editor. If you do so, you are able to reproduce the issue. Try using the built application as a publisher and the editor application as a subscriber. If the publisher and subscriber are not registered in the right order, you will see that the subscriber does not receive the expected messages and does not 'log' them.

at669 commented

Thanks for the clarification!

Using multiple Unity instances is a currently unsupported use case. Unity-Technologies/Unity-Robotics-Hub#214 is a similar situation which you may want to check out, which was resolved by running multiple server endpoints.