sequenceplanner/r2r

Use-after-free when failing to `rcl_return_loaned_message_from_subscription`

Closed this issue · 4 comments

The error path when rcl_return_loaned_message_from_subscription fails on native subscriptions suffers from a use-after-free problem:

https://github.com/sequenceplanner/r2r/blob/master/r2r/src/subscribers.rs#L107 drops the handle before the error handling starts. This means that accessing it later (https://github.com/sequenceplanner/r2r/blob/master/r2r/src/subscribers.rs#L113) is incorrect.

I believe the problem can be fixed by moving the drop call after the error handling block, or alternatively leaving out the "drop" and destroying the Box when the function returns.


The issue is a bit tricky to reproduce; I've run across it only when using cyclonedds+iceoryx as a middleware. Use the following pair of senders/receivers:
bug_reproducer.tar.gz

Run them with the CYCLONEDDS_URI environment variable pointed to the provided cyclonedds.xml file (see also https://cyclonedds.io/docs/cyclonedds/latest/config/index.html). The receiver should run into the problem within a few seconds.

An easier way to reproduce might be to temporarily pass a nullptr into rcl_return_loaned_message_from_subscription (instead of the right handle) and trigger the error path that way.

Hi,

Sorry for the delay. Yes it does look like the call to drop should be removed. But perhaps the bigger issue is that we panic in the error handling. Looks like we didn't expect returning the loaned message to ever fail. I am not familiar with the loan message api. What would be an appropriate course of action if returning the loan fails? Try to return again until successful or just leave it be?

That is a good question. I think a panic is not the worst reaction here, there is really no conceivable reason why returning a message should ever fail unless there is some form of corruption. Take a quick look at the errors specified by rcl:

  • RCL_RET_INVALID_ARGUMENT, RCL_RET_SUBSCRIPTION_INVALID are both about passing an invalid parameter. That can't happen unless there is a bug in r2r, so a panic seems the right choice there.
  • RCL_RET_UNSUPPORTED would be a strange error indeed, as we should have already gotten this when loaning the message. So I'd argue this is also a bug in r2r if it happens, and panicking is the right call

In our case, the way we triggered it is to have some sort of memory corruption on the iceoryx level. So a panic was absolutely appropriate there as well.

So overall, I'd recommend to keep it a panic and touch it again if somebody complains and points out a valid scenario where returning a loaned message can legitimately fail.

edit: by the way, if that argument makes sense to you I'm happy to submit a PR with my proposed approach.

I agree with your comments. I realize that I mistakenly pushed some code I added to test out your scenario. Happy to accept a PR in any case.

Fixed by #97.