Lifespan not working with transiet_local subscriber
charlielito opened this issue · 7 comments
Bug report
When subscribing to a transient_local
published topic and using the Lifespan parameter, it seems that it doesn't have any effect on the subscriber's side. That means the callback of the subscriber is always called with the message ignoring the Lifespan parameter you use.
Required Info:
- Operating System: Ubuntu 22.04
- Installation type: binaries
- Version or commit hash: iron
- DDS implementation: default
- Client library (if applicable): rclpy
Steps to reproduce issue
To reproduce the issue we have 2 nodes: one that publishes to /some_topic
with transient_local
and infinite lifespan QoS and the other node that subscribes to that topic also with transient_local
QoS but with a lifespan of 2 seconds.
Run this publisher node in one Terminal
import rclpy
from rclpy.node import Node
from rclpy.qos import (
QoSDurabilityPolicy,
QoSProfile,
QoSReliabilityPolicy,
)
from std_msgs.msg import Bool
class PubNode(Node):
def __init__(self):
super().__init__("pub_node")
qos = QoSProfile(
depth=5,
durability=QoSDurabilityPolicy.TRANSIENT_LOCAL,
reliability=QoSReliabilityPolicy.RELIABLE,
)
self.pub = self.create_publisher(Bool, "/some_topic", qos)
self.pub.publish(Bool(data=False))
def main(args=None):
rclpy.init(args=args)
node = PubNode()
try:
rclpy.spin(node)
except KeyboardInterrupt:
pass
node.destroy_node()
rclpy.shutdown()
if __name__ == "__main__":
main()
Then run the subscriber node in another terminal
from rclpy.node import Node
from rclpy.qos import (
QoSDurabilityPolicy,
QoSProfile,
QoSReliabilityPolicy,
)
from std_msgs.msg import Bool
from rclpy.duration import Duration
import rclpy
class SubNode(Node):
def __init__(self):
super().__init__("sub_node")
qos = QoSProfile(
depth=5,
durability=QoSDurabilityPolicy.TRANSIENT_LOCAL,
reliability=QoSReliabilityPolicy.RELIABLE,
lifespan=Duration(seconds=2),
)
print(qos)
self.sub = self.create_subscription(
Bool,
"/some_topic",
self.my_callback,
qos,
)
def my_callback(self, msg: Bool):
self.get_logger().info("Received: {}".format(msg.data))
def main(args=None):
rclpy.init(args=args)
node = SubNode()
try:
rclpy.spin(node)
except KeyboardInterrupt:
pass
node.destroy_node()
rclpy.shutdown()
if __name__ == "__main__":
main()
Expected behavior
Because of the lifespan of 2 seconds, the callback shouldn't be triggered if the node is launched after 2 seconds of the publisher node
Actual behavior
The callback is always triggered
Additional information
In the subscriber node, I printed the QoS object to check the properties. I got this:
QoSProfile(history=HistoryPolicy.KEEP_LAST, depth=5, reliability=ReliabilityPolicy.RELIABLE, durability=DurabilityPolicy.TRANSIENT_LOCAL, lifespan=2000000000 nanoseconds, deadline=0 nanoseconds, liveliness=LivelinessPolicy.SYSTEM_DEFAULT, liveliness_lease_duration=0 nanoseconds, avoid_ros_namespace_conventions=False)
But when checking the info with ros2 topic info /some_topic --verbose
I got:
Node name: sub_node
Node namespace: /
Topic type: std_msgs/msg/Bool
Topic type hash: RIHS01_feb91e995ff9ebd09c0cb3d2aed18b11077585839fb5db80193b62d74528f6c9
Endpoint type: SUBSCRIPTION
GID: 01.10.a5.80.f1.17.ec.a8.06.91.a8.ae.00.00.14.04
QoS profile:
Reliability: RELIABLE
History (Depth): KEEP_LAST (5)
Durability: TRANSIENT_LOCAL
Lifespan: Infinite
Deadline: Infinite
Liveliness: AUTOMATIC
Liveliness lease duration: Infinite
which says the Lifespan is Infinite. Somehow the QoS is not being set correctly (?)
DDS implementation: default
That is supposed to mean you use rmw_fastrtps_cpp
with iron, can you confirm?
i think, some rmw impelementation cannot show the actual QoS setting via ros2 topic info -v
, this depends on rmw implementation. for example, rmw_fastrtps_cpp
cannot show the history and depth QoS configuration.
there's been discussion on this, see more details for ros2/rclpy#849 (comment)
this is the conversion function RTPS QoS to RMW QoS, describes above.
besides, here is what i see.
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run prover_rclpy rclpy_1163_pub
...
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run prover_rclpy rclpy_1163_sub
...
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 topic info -v /some_topic
Type: std_msgs/msg/Bool
Publisher count: 1
Node name: pub_node
Node namespace: /
Topic type: std_msgs/msg/Bool
Topic type hash: RIHS01_feb91e995ff9ebd09c0cb3d2aed18b11077585839fb5db80193b62d74528f6c9
Endpoint type: PUBLISHER
GID: 01.0f.60.db.44.f7.83.bd.00.00.00.00.00.00.13.03
QoS profile:
Reliability: RELIABLE
History (Depth): UNKNOWN
Durability: TRANSIENT_LOCAL
Lifespan: Infinite
Deadline: Infinite
Liveliness: AUTOMATIC
Liveliness lease duration: Infinite
Subscription count: 1
Node name: sub_node
Node namespace: /
Topic type: std_msgs/msg/Bool
Topic type hash: RIHS01_feb91e995ff9ebd09c0cb3d2aed18b11077585839fb5db80193b62d74528f6c9
Endpoint type: SUBSCRIPTION
GID: 01.0f.60.db.18.f8.bd.4f.00.00.00.00.00.00.13.04
QoS profile:
Reliability: RELIABLE
History (Depth): UNKNOWN
Durability: TRANSIENT_LOCAL
Lifespan: 2000000000 nanoseconds
Deadline: Infinite
Liveliness: AUTOMATIC
Liveliness lease duration: Infinite
History (Depth): UNKNOWN
is set toUNKOWN
as mentioned above.Lifespan: 2000000000 nanoseconds
is set to subscription.- and if subscription issues after 2 seconds, i do not see the latched message delivered to the subscription.
thanks,
@fujitatomoya my bad, actually, I was using Cyclone DDS export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
After changing it to rmw_fastrtps_cpp
I can confirm your results and the expected behaviour. In that case what should we do with this issue? Who can tackle this in cyclone DDS implementation?
I'm going to move this over to the rmw_cyclonedds_cpp
repository for now.
This is very closely related to eclipse-cyclonedds/cyclonedds#1835
This issue has an interesting difference with that one, though: the lifespan in DDS is spec'd to use the time of publication of the data (the "source timestamp") and so is probably never going to work in a useful way with transient-local data because the historical data would probably have expired long before it reached the reader. Therefore, simply adding a notion of lifespan to reader (as I noted as a possible way of addressing eclipse-cyclonedds/cyclonedds#1835) is not going to improve things here ...
So how badly needed is it? If it is not really important, then a documentation update is a reasonable approach, else we need to figure out how to deal with it without breaking the spec entirely.
IMO this should be fixed since the behavior should be the same independently of the DDS implementation.
IMO this should be fixed since the behavior should be the same independently of the DDS implementation.
Are you sure that's what you mean? Because that's equivalent to saying the combination is not supported because it is impossible without using vendor-specific extensions to the DDS specification.
I mean that it is very confusing that it behaves differently without any warning because the user would expect the same behavior. I think the feature is also a reasonable one since lifespan is an important part of the QoS specs