Lookup ROS msg definitions

Question

rhaschke opened this issue 6 years ago · 20 comments

As far as I can see, the message parser needs registration of messages with full-fledged definitions, including all definitions of sub messages - as they are provided by ros::message_traits::Definition<ros_type>::value(). So far, the use case of this lib seems to process arbitrary messages received externally together with this full-fledged message definition.
I intended to use the lib to parse serialized messages, where I only knew the ros type (in form of pkg_name/type_name) and I was disappointed that this doesn't work out of the box.

I think that by looking up msg definitions from the (ROS) file system, utilizing ros::package::getPath() from roslib, you could easily handle this use case as well. It would probably be a valuable extension to augment the existing API with a registerMessageDefinition(const ROSType &ros_type).

Answer 1 · 2019-01-18T09:56:51.000Z

Thinking more about this, I believe for consistency it's important to use full-fledged message definitions.
Otherwise, a node parsing a serialized message might use a different message definition than the one used for serialization.

Answer 2 · 2019-01-18T11:15:10.000Z

> So far, the use case of this lib seems to process arbitrary messages received externally together with this full-fledged message definition.

This is the use case for this library... https://github.com/facontidavide/PlotJuggler
PlotJuggler can read any ROS message, with zero information about the type at compilation time.

> I intended to use the lib to parse serialized messages, where I only knew the ros type (in form of pkg_name/type_name) and I was disappointed that this doesn't work out of the box.

It is impossible to deserialize an array of bytes without a schema.

In ROS there are only two sources of type-erased messages, topic_tools::ShapeShifter and rosbag::MessageInstance, and they both contain the MessageDefinition.

Therefore, "I was disappointed that this doesn't work out of the box." sounds a bit unfair...

> Thinking more about this, I believe for consistency it's important to use full-fledged message definitions.

As I said, it is not about consistency. It is impossible to do it otherwise. No deserialization library can exist unless you know how bytes are ordered in a serialized message, and only the message definition (AKA, a schema) can tell you that.
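To make the point about schemas concrete, here is a minimal sketch using only Python's struct module. The two format strings are hypothetical schemas invented for illustration; they are not real ROS message types and this is not how the library parses messages. It only shows that the same twelve bytes decode to different values depending on which schema is assumed.

```python
import struct

# Twelve raw bytes, as they might arrive from a socket or a bag file.
payload = struct.pack("<iif", 1, 2, 3.0)

# Two hypothetical schemas that both consume exactly twelve bytes.
SCHEMA_A = "<iif"  # int32, int32, float32 (the one used by the sender)
SCHEMA_B = "<fii"  # float32, int32, int32 (a wrong guess by the receiver)


def decode(schema: str, data: bytes) -> tuple:
    """Interpret `data` according to a struct format string (the 'schema')."""
    return struct.unpack(schema, data)


as_a = decode(SCHEMA_A, payload)  # the values the sender intended
as_b = decode(SCHEMA_B, payload)  # same bytes, different meaning, no error raised
```

Both calls succeed, which is the problem: nothing in the bytes themselves says which interpretation is the right one.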

Answer 3 · 2019-01-18T11:31:18.000Z

If you see any real use case where you get a serialized message (as an array of bytes), but you are unable to get the Message Definition, let me know.
I haven't found such a case yet.

Answer 4 · 2019-01-18T12:18:53.000Z

I think you misunderstood my use case: I do know the name of the ROS message type, but I don't have the full definition at hand. It would be possible to "manually" assemble the full definition from the .msg files in the ROS file system. However, as I said, the sender and receiver systems might use different versions of these files, so there could be an inconsistency between the serializer and your deserializer.
Hence, it's mandatory to also provide the full definition.
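For readers wondering what "manually" assembling the full definition would involve, here is a rough sketch. It is an assumption-laden illustration: MSG_FILES is a hypothetical in-memory stand-in for the .msg files on disk, the demo_msgs types are made up, and the separator layout (a line of 80 "=" characters followed by "MSG: pkg/Type") follows the concatenated-definition format as I understand it. It is not the library's API.

```python
import re

# Hypothetical stand-in for the contents of .msg files found on the file system.
MSG_FILES = {
    "demo_msgs/Point": "float64 x\nfloat64 y",
    "demo_msgs/Segment": "Point start\nPoint[] waypoints\nstring label",
}

BUILTINS = {
    "bool", "int8", "uint8", "int16", "uint16", "int32", "uint32", "int64",
    "uint64", "float32", "float64", "string", "time", "duration", "char", "byte",
}
SEPARATOR = "=" * 80


def field_types(text):
    """Yield the base type of every field line, skipping comments and constants."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" in line:
            continue
        yield re.sub(r"\[.*\]$", "", line.split()[0])  # drop array suffix


def dependencies(msg_type, files, seen=None):
    """Collect sub-message types depth-first, in order of first appearance."""
    seen = [] if seen is None else seen
    package = msg_type.split("/")[0]
    for t in field_types(files[msg_type]):
        if t in BUILTINS:
            continue
        if t == "Header":
            t = "std_msgs/Header"
        elif "/" not in t:
            t = f"{package}/{t}"  # unqualified names refer to the same package
        if t not in seen:
            seen.append(t)
            dependencies(t, files, seen)
    return seen


def full_definition(msg_type, files):
    """Concatenate the top-level text with the text of every dependency."""
    parts = [files[msg_type]]
    for dep in dependencies(msg_type, files):
        parts.append(f"{SEPARATOR}\nMSG: {dep}\n{files[dep]}")
    return "\n".join(parts)


assembled = full_definition("demo_msgs/Segment", MSG_FILES)
```

Note that the result depends entirely on whatever files the local machine has installed, which is exactly the consistency concern raised above.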

Answer 5 · 2019-03-26T17:31:32.000Z

Here's the use-case:
https://github.com/AIS-Bonn/nimbro_network/blob/7e1de20c8fb33fb1bf22ab9a8c4e93d098b9c4a3/nimbro_topic_transport/src/udp/udp_receiver.cpp#L186

The nimbro_network library transmits ROS messages over a non-ROS link, and the receiving end needs to verify that the message definitions on the receiver match the message definitions on the sender (by comparing MD5 sums). Thus there is a need to know the MD5/message definition of a topic for which you know only the ROS message name, exactly as @rhaschke requested. As of now, the library makes a system call to rosmsg show, which is far from optimal.

Answer 6 · 2019-03-26T18:28:42.000Z

@peci1, I think you explicitly need to communicate the message definition as well. Looking them up from the file system is, of course, possible, but might yield different results on the sender and receiver system.

Answer 7 · 2019-03-26T19:14:48.000Z

But looking up on the filesystem is intended, IMO. The package needs to check whether the message definitions on the sender and receiver computers match, because if they don't, there is a problem. On the other hand, the definitions themselves can be quite large, so transmitting only the MD5 makes sense.

Answer 8 · 2019-03-26T20:16:06.000Z

If you argue for transmitting MD5 between sender and receiver and the receiver just validating the MD5 from its filesystem lookup, I fully agree: this would be a valuable extension of the library.
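The check being discussed could look roughly like the sketch below. Everything here is hypothetical naming (definition_digest, verify_definition, DefinitionMismatch), and the digest is a simplified stand-in: it hashes the raw definition text, whereas the real ROS MD5 is computed from a normalized form of the definition, so the values will not match those reported by ROS tools.

```python
import hashlib


class DefinitionMismatch(RuntimeError):
    """Raised when sender and receiver disagree about a message definition."""


def definition_digest(definition: str) -> str:
    # Simplified stand-in for the ROS MD5 sum: hashes the raw text as-is.
    return hashlib.md5(definition.encode("utf-8")).hexdigest()


def verify_definition(msg_type: str, sender_md5: str, local_definition: str) -> str:
    """Compare the sender's digest with one computed from the local definition."""
    local_md5 = definition_digest(local_definition)
    if local_md5 != sender_md5:
        raise DefinitionMismatch(
            f"{msg_type}: sender has {sender_md5}, receiver has {local_md5}"
        )
    return local_md5


# The sender transmits only the digest; the receiver looks up its own definition.
sender_md5 = definition_digest("int32 x\nint32 y")
checked = verify_definition("demo_msgs/Pair", sender_md5, "int32 x\nint32 y")
```

With this arrangement a mismatch fails loudly before any payload is deserialized, instead of producing garbage values later.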
But you didn't mention that you have the MD5 available. In my use case, I hadn't.

Answer 9 · 2019-03-26T20:41:40.000Z

Yes, having the MD5 is an advantage, but anyways - the receiver side has to query the message type/MD5 regardless of the MD5 it got from the sender (to verify they're the same).

Answer 10 · 2019-03-26T20:43:40.000Z

The MD5 from the sender is required to be able to compare with the MD5 computed on the receiver side.

Answer 11 · 2019-03-26T20:45:58.000Z

So in your case you'd just deserialize using the locally found definitions and hope nothing changed since the time of serialization, right?

Answer 12 · 2019-03-26T20:47:18.000Z

This was my naive, initial thinking, yes. But this will fail silently (or dramatically) if the definitions are different on both sides.

Answer 13 · 2019-03-26T21:44:56.000Z

Let's try to bring a little more perspective to this problem.

We have a sender and a receiver (or server / client if you prefer), that communicate over UDP.
What I suggest, based on projects I did in the past, is:

The server side has the CORRECT definition of the message, since it is the one that serializes the information.
When a client subscribes to receive one or more messages, the server replies with the message definition on the server side.
Each of these message definitions is associated with a unique identifier which, to be consistent and practical, can be the MD5Sum.
The client side can, therefore, register the [MD5sum + definition] pair using my library.
Finally, the server can send one or more serialized messages that the client will be able to deserialize. Each message has a fixed header with the unique identifier, followed by the serialized data.
Notice that the client will only use message definitions on the server side, not its own side.

Done!
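A minimal sketch of the steps above, under stated assumptions: the 16-byte identifier is a plain MD5 of the definition text (not the real ROS MD5 sum), ClientRegistry is a hypothetical stand-in for registering the pair with the parsing library, and the payload is packed with struct only to have some bytes to send.

```python
import hashlib
import struct

HEADER_SIZE = 16  # length of a raw MD5 digest in bytes


def identifier_for(definition: str) -> bytes:
    """Assumed identifier: MD5 of the definition text as sent by the server."""
    return hashlib.md5(definition.encode("utf-8")).digest()


def make_frame(identifier: bytes, payload: bytes) -> bytes:
    """Fixed header (identifier) followed by the serialized data."""
    return identifier + payload


class ClientRegistry:
    """Stores [identifier + definition] pairs received from the server."""

    def __init__(self):
        self._definitions = {}

    def register(self, identifier: bytes, definition: str) -> None:
        self._definitions[identifier] = definition

    def split(self, frame: bytes):
        """Return (definition, payload); raises KeyError for unknown identifiers."""
        identifier, payload = frame[:HEADER_SIZE], frame[HEADER_SIZE:]
        return self._definitions[identifier], payload


# Server side: owns the definition and serializes the data.
server_definition = "int32 x\nint32 y"
server_id = identifier_for(server_definition)
frame = make_frame(server_id, struct.pack("<ii", 3, 4))

# Client side: registers what the server sent on subscription, then decodes.
client = ClientRegistry()
client.register(server_id, server_definition)
definition, payload = client.split(frame)
```

The client never consults its own file system here; it only ever uses the definition the server handed over.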

Answer 14 · 2019-03-26T22:03:41.000Z

By the way, this discussion about the consistency of the Message Definitions and MD5Sum seems to be kind of ... obvious.

What I mean is that the very existence of your library nimbro_network is based on the fact that the two sides (sender and receiver) agree on the same message definition.

Otherwise, not only will my library fail, but so will any other serialization/deserialization mechanism provided by ROS itself.

Furthermore, I strongly disagree with the statement made by @peci1

> As of now, the library makes a system call to rosmsg show, which is far from optimal.

This makes no sense to me (and I am very obsessed with performance), since this call is done only ONCE at the beginning, not every single time a message is sent.

Answer 15 · 2019-03-26T22:54:36.000Z

Davide, I think you're missing a point in your answers: I do not want to blindly accept the message definitions from the sender/server on client side - because there is also a running ROS system on the client which would protest if you tried to use message definitions other than those installed on the client. What I need is to detect the situation when sender and receiver definitions differ, and fail immediately in such case.

This still seems to be impossible with ros_type_introspection, and yet there are at least two people (me and @rhaschke ) who think it'd be worth having. On the other hand, it seems to me the implementation would get quite complicated, as there is no way in roscpp to actually read the MD5/definition values generated to the header files. And the values are generated by python scripts, so it seems the only reliable way would be to use PythonLibs to call genmsg... Anybody correct me if I'm wrong...

Answer 16 · 2019-03-26T22:55:25.000Z

And the "far from optimal" was not meant performance-wise, but generally, calling system() is something I do only as the very last resort...

Answer 17 · 2019-09-23T12:49:26.000Z

The ros_babel_fish package uses embedded python to get the message definition: https://github.com/StefanFabian/ros_babel_fish/blob/5b009431983b751038d1de124b2f798b72d82792/ros_babel_fish/src/generation/providers/embedded_python_description_provider.cpp . I'm not sure though if it is better than calling rosmsg via a system call...

Answer 18 · 2019-09-23T13:56:15.000Z

CC @StefanFabian

I agree with you, we can just make a system call from C++. It is done only once and it is going to be reasonably fast.

Answer 19 · 2019-09-23T14:58:09.000Z

> The ros_babel_fish package uses embedded python to get the message definition

That's not entirely true. It has the option to do that, but the default is the integrated description provider which does not use or depend on python at all. The python version was just my first implementation where I didn't want to reimplement the whole thing in C++ but it ended up not working in some use cases so I implemented a native C++ solution (Edit: For future readers, I don't mean not working because of a differing implementation. What I mean is that it doesn't work in some cases because running any embedded python at all doesn't work in some cases. The python runtime simply can't be initialized.).

I agree with @facontidavide that the performance of a one-time event doesn't matter (within reason).
However, the usage of system() calls is strongly discouraged in the C++ community for various reasons other than the horrible performance.
Here's one example of the numerous articles on why you shouldn't use it.

Answer 20 · 2019-09-23T15:32:51.000Z

Moreover, if you'd like to get definitions of more messages (say, hundreds), and your ROS_PACKAGE_PATH is long (which it usually is if you build using catkin_tools), then just the initialization of the RosPack object done in the system call gets a little expensive, and since you immediately exit the program, the cached paths remain unused and have to be recomputed on every system call.
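To illustrate the caching argument, here is a toy model. PackageIndex is a hypothetical class that only counts how often the (simulated) package path crawl happens; the paths are made up and nothing here calls rospack or rosmsg. A fresh process per lookup rebuilds the index every time, while an in-process lookup builds it once and reuses it.

```python
class PackageIndex:
    """Toy model of a package path cache; `crawls` counts simulated crawls."""

    crawls = 0

    def __init__(self, package_paths):
        PackageIndex.crawls += 1  # stands in for walking ROS_PACKAGE_PATH
        self._paths = dict(package_paths)

    def path_of(self, package):
        return self._paths[package]


# Hypothetical package locations, for illustration only.
PACKAGE_PATHS = {
    "pkg_a": "/opt/demo/share/pkg_a",
    "pkg_b": "/opt/demo/share/pkg_b",
}


def lookup_via_fresh_process(package):
    """Models one system call: the index is built, used once, then discarded."""
    return PackageIndex(PACKAGE_PATHS).path_of(package)


def lookup_all_in_process(packages):
    """Models an in-process lookup: one index shared by every query."""
    index = PackageIndex(PACKAGE_PATHS)
    return [index.path_of(p) for p in packages]
```

For hundreds of message types the first approach repeats the crawl hundreds of times, which is the cost described above.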