pln-fing-udelar/fast-krippendorff

Unclear if multi-label annotations are supported

Alex-Bujorianu opened this issue · 2 comments

It is unclear if this library supports multi-labelled annotations correctly. I am getting a really low score of 0.079 for my alpha statistic even though the agreement seems decent enough (I would guess a Hamming Score of around 2/3). Does this library consider only perfect matches, or does it consider partially correct matches?

Some more context: I am comparing two annotators who have each labelled 50 patients. Each patient must receive at least one label and at most 5 (typically 3). There are 11 recommendations to choose from. If one annotator labels a patient [9, 5, 6] and the other [9, 5, 1], then the Hamming Score would be 2/3 ≈ 0.67.
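To make the arithmetic in this example concrete, here is a minimal sketch of three ways two label lists can be scored. The function names are hypothetical and not part of any library. The 2/3 figure matches a position-wise comparison or a Dice-style overlap. The name "Hamming Score" is also sometimes used for intersection over union, which gives 0.5 for the same pair, so it is worth stating which definition is meant.

```python
def positional_match(a, b):
    """Fraction of positions holding the same label (order-sensitive)."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def dice_score(a, b):
    """Twice the number of shared labels over the total number of labels."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard_score(a, b):
    """Shared labels over all distinct labels used by either annotator."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


first, second = [9, 5, 6], [9, 5, 1]
print(positional_match(first, second))  # 2 of 3 positions agree
print(dice_score(first, second))        # 2 * 2 / (3 + 3)
print(jaccard_score(first, second))     # 2 / 4 = 0.5
```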

So I suppose my question is: is this an implementation issue, or a conceptual one (i.e. is Krippendorff’s alpha appropriate for this problem?)

I don't know if Krippendorff's alpha is appropriate for that problem. You have to check.

But from what I understand, it seems like it is. This library may support multi-label data if you encode it correctly, though I haven't tested this use case, so it's not guaranteed. It may also be inefficient because of what we mention in the README file:

> The implementation is fast as it doesn't do a nested loop for the coders. However, `V` should be small, since a `VxV` matrix it's used.

The implementation iterates over every value in the domain. You will probably need to provide that domain yourself, because it sounds like not all of the possible values will appear in your annotations:

> If `reliability_data` is provided, then the default value is the ordered list of unique rates that appear.

So you need to provide all the potential label combinations in `value_domain`. That sounds like a lot of values, so this implementation may not be appropriate.
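For a sense of scale, the following sketch enumerates every possible label set for the setup described in the question (1 to 5 labels out of 11). It assumes the recommendations are numbered 1 to 11, which the thread does not state, and it treats each label set as one value of the domain. Whether the library handles such a domain in practice is untested here.

```python
from itertools import combinations

# Assumption: the 11 recommendations are numbered 1..11.
labels = range(1, 12)

# Every unordered label set containing between 1 and 5 labels.
value_domain = [
    frozenset(combo)
    for size in range(1, 6)
    for combo in combinations(labels, size)
]

n_values = len(value_domain)
print(n_values)       # 1023 possible label sets
print(n_values ** 2)  # 1046529 entries in a VxV matrix
```

With `V` at 1023, a `VxV` matrix has about a million entries, which is the size concern raised above.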

Also, it sounds like you want to use a custom distance function (based on the "Hamming Score"). You can provide one through `level_of_measurement`:

> level_of_measurement : string or callable
> Steven's level of measurement of the variable.
> It must be one of "nominal", "ordinal", "interval", "ratio", or a callable.
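To illustrate what a set-based distance does to the statistic, here is a standalone sketch of Krippendorff's alpha with a pluggable difference function, computed as 1 - D_o / D_e. It is not this library's implementation and does not use its API. The names `jaccard_distance` and `alpha_with_distance` are hypothetical, and using the Jaccard distance directly as the difference function is an assumption. The exact signature the library expects for a callable `level_of_measurement` should be checked in its docstring or source.

```python
from itertools import permutations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|: 0 for identical label sets, 1 for disjoint ones."""
    a, b = frozenset(a), frozenset(b)
    return 1.0 - len(a & b) / len(a | b)


def alpha_with_distance(units, distance):
    """Krippendorff's alpha for units that each hold one value per annotator.

    `units` is a list of tuples; each tuple holds the values assigned to one
    unit (here, one label set per annotator). Units with fewer than two
    values are not pairable and are skipped.
    """
    units = [u for u in units if len(u) >= 2]
    values = [v for u in units for v in u]
    n = len(values)
    # Observed disagreement: ordered pairs within each unit.
    d_o = sum(
        sum(distance(x, y) for x, y in permutations(u, 2)) / (len(u) - 1)
        for u in units
    ) / n
    # Expected disagreement: ordered pairs across all pairable values.
    d_e = sum(distance(x, y) for x, y in permutations(values, 2)) / (n * (n - 1))
    return 1.0 - d_o / d_e


# Two annotators, three patients; partial overlap is penalized only in part.
patients = [({9, 5, 6}, {9, 5, 1}), ({2, 3}, {2, 3}), ({7}, {4, 7})]
print(alpha_with_distance(patients, jaccard_distance))
```

With a nominal (exact match) distance, [9, 5, 6] versus [9, 5, 1] counts as a full disagreement, which is one plausible reason an alpha computed on whole label sets comes out low even when most labels overlap.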

I'm closing this due to inactivity. If you still have issues, please feel free to comment.