eclipse-cyclonedds/cyclonedds-cxx

Random crash when create DataReader/DataWriter...

Closed this issue · 1 comments

Problem descriptor

I found a random crash when create a DataWriter or a DataReader, it crush in:

/* src/ddscxx/src/org/eclipse/cyclonedds/core/ListenerDispatcher.cpp : 191 */
    if (la->reset_on_invoke && la->cpp_ref->obtain_callback_lock()){  // crash because la->cpp_ref is deleted.
        ...
    }
    ...

Minimal Reproducible Example:

//Publisher
#include "dds/dds.hpp"
#include "HelloWorldData.hpp"

using namespace org::eclipse::cyclonedds;

int main() {
    dds::domain::DomainParticipant participant(13);
    dds::topic::Topic<HelloWorldData::Msg> topic(participant, "topic");
    auto p_qos = participant.default_publisher_qos();
    dds::pub::Publisher publisher(participant, p_qos);
    auto qos = publisher.default_datawriter_qos();
    class Listener : public dds::pub::NoOpDataWriterListener<HelloWorldData::Msg> {
    public:
        void on_publication_matched(dds::pub::DataWriter<HelloWorldData::Msg> & dataWriter,
            const dds::core::status::PublicationMatchedStatus &) override {
            std::cout << dataWriter->get_ddsc_entity() << " matched." << std::endl;
        }
    };
    auto listener = new Listener();
    while(true) {
        dds::pub::DataWriter<HelloWorldData::Msg> writer(publisher, topic, qos, listener, dds::core::status::StatusMask::all());
    }
}
// Subscriber
#include "dds/dds.hpp"
#include "HelloWorldData.hpp"

using namespace org::eclipse::cyclonedds;

int main() {
    dds::domain::DomainParticipant participant(13);
    dds::topic::Topic<HelloWorldData::Msg> topic(participant, "topic");
    auto s_qos = participant.default_subscriber_qos();
    dds::sub::Subscriber subscriber(participant, s_qos);
    auto qos = subscriber.default_datareader_qos();
    class Listener : public dds::sub::NoOpDataReaderListener<HelloWorldData::Msg> {
    public:
        void on_subscription_matched(dds::sub::DataReader<HelloWorldData::Msg> &dataReader,
            const dds::core::status::SubscriptionMatchedStatus &) override {
            std::cout << dataReader->get_ddsc_entity() << " matched." << std::endl;
        }
    };
    auto listener = new Listener();
    while(true) {
        dds::sub::DataReader<HelloWorldData::Msg> reader(subscriber, topic, qos, listener, dds::core::status::StatusMask::all());
    }
}

Problem location

I found the problem occurs here(eg. with DataWriter, also in DataReader):

 /* src/ddscxx/include/dds/pub/DataWriter.hpp : 170 */
DataWriter(const dds::pub::Publisher& pub,
                  const ::dds::topic::Topic<T>& topic,
                  const dds::pub::qos::DataWriterQos& qos,
                  dds::pub::DataWriterListener<T>* listener = NULL,
                  const dds::core::status::StatusMask& mask = ::dds::core::status::StatusMask::none()); //This index with (1)
 /* src/ddscxx/include/dds/pub/detail/DataWriter.hpp : 62 */
DataWriter(const dds::pub::Publisher& pub,
               const ::dds::topic::Topic<T>& topic,
               const dds::pub::qos::DataWriterQos& qos); (2)
    /* src/ddscxx/include/dds/pub/detail/DataWriterImpl.hpp : 392 */
    this->listener_set(nullptr, dds::core::status::StatusMask::all(), false);
    dds_entity_t ddsc_writer = dds_create_writer (ddsc_pub, ddsc_topic, ddsc_qos, this->listener_callbacks); //This index with (3)
    dds_delete_qos(ddsc_qos);
    ...
    /* src/ddscxx/src/org/eclipse/cyclonedds/core/EntityDelegate.cpp : 226 */
org::eclipse::cyclonedds::core::EntityDelegate::listener_set(void *_listener, 
                                                                                            const dds::core::status::StatusMask& mask, 
                                                                                            bool reset_on_invoke) { // This index with (4)
    ....
    if (this->enabled_) // This index with (5)
    {
        dds_return_t ret;
        ret = dds_set_listener(this->ddsc_entity, callbacks);
        ISOCPP_DDSC_RESULT_CHECK_AND_THROW(ret, "Setting listener failed.");
    }
    // Delete previous ddsc listener callbacks object
    if (this->listener_callbacks != NULL)
    {
        void *prev_arg;
        dds_lget_data_available_arg(this->listener_callbacks, nullptr, &prev_arg, nullptr);
        dds_delete_listener(this->listener_callbacks);
        delete reinterpret_cast<org::eclipse::cyclonedds::core::ListenerArg *>(prev_arg); //This index with (6)
    }
    ...

I. When call function (1), it will call (2) before (1).
II. When call (1), it will call (3) to set a default dds_listener_t to dds_entity_t.
III. After this, the real listener will be generate in (4).
IV. But at (5), this->enabled_ is false, so it will not set real listener to dds_entity_t.
V. Then, current dds_listener_t which binding on dds_entity_t will be delete by (6).
VI. After all of above, (1) will call enable() to set real listener to dds_entity_t.

So when an event comes between V and VI, it will crashed reason of (6), arg is released.

Solution

I. Make this->listener_callbacks to nullptr.
II. After (3), set this->enable_ = true

I don't know why there is a this->listener_callbacks, so I dont know how to repair it will make a lowest effect.

Thank you! That analysis is super 😀