ClusterLabs/libqb

RB: Call to sem_destroy_fn() without checking ref_count

wmdlr opened this issue · 3 comments

Call to sem_destroy_fn() inside qb_rb_close() should be permitted iff previous call to qb_atomic_int_dec_and_test() returns 1. Otherwise sem_destroy_fn() destroys the semaphore even if the ringbuffer was opened without QB_RB_FLAG_CREATE flag.
However it seems to be more conceptually correct to test QB_RB_FLAG_CREATE in rb->flags.
Thus qb_rb_close() will look like

void qb_rb_close(qb_ringbuffer_t * rb)
{
    if (rb == NULL) {
        return;
    }
    qb_util_log(LOG_DEBUG,
                 "Destroying ringbuffer: %s",
                 rb->shared_hdr->hdr_path);

    (void)qb_atomic_int_dec_and_test(&rb->shared_hdr->ref_count);
    /*
     * proposed changes
    */
    if (rb->flags & QB_RB_FLAG_CREATE)
        (void)rb->sem_destroy_fn(rb);
    /*
     * end of changes
    */
    unlink(rb->shared_hdr->data_path);
    unlink(rb->shared_hdr->hdr_path);
    munmap(rb->shared_data, (rb->shared_hdr->size * sizeof(uint32_t)) << 1);
    munmap(rb->shared_hdr, sizeof(struct qb_ringbuffer_shared_s));
    free(rb);
}

I'll have a look at it. One concern I have is if the process that created the ringbuffer crashes then
we will leak semaphores.

Certainly. A developer should be worry about it. Ringbuffers are asymmetric. I mean that only one side creates them with QB_RB_FLAG_CREATE. And only that side should destroy semaphores.
Let's have a look at qb_rb_open(). If an error occures the function executes the following code:

cleanup_data:
    close(fd_data);
    if (flags & QB_RB_FLAG_CREATE) {
        unlink(rb->shared_hdr->data_path);
    }

cleanup_hdr:
    if (fd_hdr >= 0) {
        close(fd_hdr);
    }
    if (rb && (flags & QB_RB_FLAG_CREATE)) {
        unlink(rb->shared_hdr->hdr_path);
        (void)rb->sem_destroy_fn(rb);
        (void)rb->spin_destroy_fn(rb);
    }
    if (rb && (rb->shared_hdr != MAP_FAILED && rb->shared_hdr != NULL)) {
        munmap(rb->shared_hdr, sizeof(struct qb_ringbuffer_shared_s));
    }
    free(rb);
    errno = -error;
    return NULL;

Thereby only the side with QB_RB_FLAG_CREATE destroys semaphores and unlinks shared files.

I don't have a problem applying this patch (please send a patch to the mailing list).

When I get a moment I want to look at using inotify on the mmap'ed file so in the non-creator side we
can see if the file has been closed (and compare with the refcount) - then cleanup. I want to catch the case
where the creator crashes.

But, yes, you patch does make the behavior more consistent.

In the future if you have a patch please just post to the mailing list. There are others that
might want to be involved in the discussion and then use this to report the bug (I don't know
how to tell github to send bug notifications to the mailing list).

Thanks
Angus