orocos-toolchain/rtt

Massive deadlocking when disconnecting connections in parallel

doudou opened this issue · 2 comments

The refactored channel implementation that just landed in master makes very liberal use of locking, which leads to deadlocks when doing parallel disconnections. I'm chasing those and will propose a PR for review; I'm opening this issue to track the problem.

Generally speaking, one major issue I see (apart from the deadlocking) is that remote disconnection calls are done while holding a lock. My current strategy is to split the "remove channel from channel list(s)" step from the "destroy the channel" step.
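
To illustrate the split, here is a minimal sketch of the general pattern: unlink the channel from the list while holding the lock, then release the lock before doing the potentially slow (or remote) disconnection. This is not RTT code. `Channel`, `ChannelList`, `disconnect()` and `parallel_remove_demo()` are hypothetical names made up for the example, and the real channel and connection classes are assumed to be more involved.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a channel; not an RTT class.
struct Channel {
    int id;
    bool disconnected;
    explicit Channel(int channel_id) : id(channel_id), disconnected(false) {}
    // Stands in for a disconnection call that may be slow or remote.
    void disconnect() { disconnected = true; }
};

// Hypothetical owner of the channel list; not RTT's connection manager.
class ChannelList {
public:
    void add(std::shared_ptr<Channel> channel) {
        std::lock_guard<std::mutex> guard(mutex_);
        channels_.push_back(std::move(channel));
    }

    // Returns false if no channel with this id is in the list.
    bool remove(int id) {
        std::shared_ptr<Channel> victim;
        {
            // Step 1: only the list manipulation happens under the lock.
            std::lock_guard<std::mutex> guard(mutex_);
            auto it = std::find_if(
                channels_.begin(), channels_.end(),
                [id](const std::shared_ptr<Channel>& c) { return c->id == id; });
            if (it == channels_.end()) return false;
            victim = std::move(*it);
            channels_.erase(it);
        }
        // Step 2: the lock is released, so a slow disconnect cannot block
        // (or deadlock with) other threads working on the list.
        victim->disconnect();
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return channels_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::shared_ptr<Channel>> channels_;
};

// Removes eight channels from four threads in parallel and returns how
// many channels are left in the list afterwards.
std::size_t parallel_remove_demo() {
    ChannelList list;
    for (int i = 0; i < 8; ++i) list.add(std::make_shared<Channel>(i));

    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([&list, t] {
            list.remove(2 * t);
            list.remove(2 * t + 1);
        });
    }
    for (std::thread& w : workers) w.join();
    return list.size();
}
```

Because a channel is unlinked before it is disconnected, a second thread asking to remove the same channel simply does not find it anymore, and no lock is held across the disconnection itself.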

Thanks for reporting. Could you already share an example or unit test to reproduce the problem?

There is a set of new test cases in ports_test.cpp, introduced in f1404ff, that should have covered issues with parallel port connection, disconnection, reads and writes. But indeed, they only spawn a single thread to add and remove an input port from the connection and another for an output port, never two input or two output ports concurrently.

What you suggest sounds a bit like (partially) reverting #283, a patch that was only added recently. Without it, the test cases in ports_test.cpp mentioned above did not check whether the port connections were actually successful, and often they were not.

See #302