Calculate Krippendorff's Alpha on any DataFrame

Primary language: Python. License: MIT.

SimpleDorff - Calculate Krippendorff's Alpha on a DataFrame

Krippendorff's Alpha is a commonly used inter-annotator reliability metric, but it is awkward to calculate directly from a pandas DataFrame. This package makes it easy.

Made with ❤️ by LightTag - The Text Annotation Tool For Teams. We use this in production to give our customers a single number to understand the quality of their labeled data. Read the blog post here

Problem It Solves

Most ways of calculating Krippendorff's Alpha assume the data is formatted in a way that rarely appears in the wild. We wanted a package that could read a DataFrame in the formats we see in real life and give us the Alpha in one line.
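To make the format gap concrete, here is a minimal sketch using plain pandas, not simpledorff itself. Textbook presentations usually lay reliability data out as a matrix with one row per item and one column per annotator, while annotation tools tend to export one row per individual annotation. The column names match the usage example in this README; the rows themselves are made up for illustration.

```python
import pandas as pd

# Hypothetical long-format export: one row per (document, annotator) pair.
long_df = pd.DataFrame({
    "document_id": [1, 1, 1, 2, 2, 2],
    "annotator_id": ["A", "B", "C", "A", "B", "C"],
    "annotation": [1.0, 1.0, None, 2.0, 2.0, 3.0],
})

# The matrix layout: one row per document, one column per annotator.
# Missing annotations show up as NaN cells.
matrix = long_df.pivot(index="document_id",
                       columns="annotator_id",
                       values="annotation")
print(matrix)
```

The long format is what simpledorff is meant to consume directly, so this reshaping step should not be needed when using the package.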

Installing

pip install simpledorff

Usage

import simpledorff
import pandas as pd
Data = pd.read_csv('./examples/from_paper.csv')  # Load your DataFrame
Data.head()
   Unnamed: 0  document_id  annotator_id  annotation
0           0            1             A         1.0
1           1            1             B         1.0
2           2            1             D         1.0
3           3            1             C         NaN
4           4            2             A         2.0
simpledorff.calculate_krippendorffs_alpha_for_df(Data,experiment_col='document_id',
                                                 annotator_col='annotator_id',
                                                 class_col='annotation')
0.743421052631579
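
For readers curious about what a number like this measures, below is a rough sketch of Krippendorff's Alpha for nominal (unordered) labels, computed from a long-format DataFrame via the standard coincidence-matrix formulation. It is an illustration only and is not taken from simpledorff's source. The function name nominal_alpha and the toy data are hypothetical, and the sketch assumes at least two distinct labels occur (otherwise the expected disagreement is zero and the division fails).

```python
from collections import Counter
from itertools import permutations

import pandas as pd


def nominal_alpha(df, unit_col, class_col):
    """Sketch of Krippendorff's Alpha for nominal labels (hypothetical helper)."""
    df = df.dropna(subset=[class_col])  # missing annotations are ignored

    # Coincidence counts: every ordered pair of labels given to the same
    # unit by different annotations, weighted by 1 / (m - 1), where m is
    # the number of labels that unit received.
    coincidences = Counter()
    for _, labels in df.groupby(unit_col)[class_col]:
        m = len(labels)
        if m < 2:
            continue  # a unit with a single label carries no agreement information
        for a, b in permutations(labels.tolist(), 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and the overall number of pairable values.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed vs. chance-expected disagreement (both scaled by n).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b) / (n - 1)
    return 1.0 - observed / expected


# Hypothetical toy data: two annotators, two documents.
agree = pd.DataFrame({
    "document_id": [1, 1, 2, 2],
    "annotator_id": ["A", "B", "A", "B"],
    "annotation": [1.0, 1.0, 2.0, 2.0],
})
mixed = agree.assign(annotation=[1.0, 1.0, 1.0, 2.0])

alpha_agree = nominal_alpha(agree, "document_id", "annotation")
alpha_mixed = nominal_alpha(mixed, "document_id", "annotation")
print(alpha_agree, alpha_mixed)
```

An Alpha of 1 means perfect agreement, while a value near 0 means the annotators agree no more often than chance would predict. In the toy data, the fully agreeing frame gives 1 and the frame with one disagreement gives 0.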