Confidence Intervals for Difference of Binomial Proportions

Computation of confidence intervals for binomial proportions and for difference of binomial proportions.

🚀 NEW 🚀 Streamlit support! See here for an app deployed on Streamlit Community Cloud.

Installation

Run

python -m pip install diff-binom-confint

or install the latest version from GitHub using

python -m pip install git+https://github.com/DeepPSP/DBCI.git

or git clone this repository and install locally via

cd DBCI
python -m pip install .

`Numba` accelerated version

Install using

python -m pip install "diff-binom-confint[acc]"

Usage examples

from diff_binom_confint import compute_difference_confidence_interval

n_positive, n_total = 84, 101
ref_positive, ref_total = 89, 105

confint = compute_difference_confidence_interval(
    n_positive,
    n_total,
    ref_positive,
    ref_total,
    conf_level=0.95,
    method="wilson",
)
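For readers who want to see what a Wilson-based interval for a difference looks like, the following is a minimal standalone sketch of the Wilson score interval and Newcombe's hybrid score interval built from it. The function names `wilson_interval` and `newcombe_wilson_difference` are hypothetical and are not part of this package; the package's own `"wilson"` method may differ in details, so treat this as an illustration rather than a reference implementation.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def wilson_interval(
    n_positive: int, n_total: int, conf_level: float = 0.95
) -> Tuple[float, float]:
    """Wilson score interval for a single binomial proportion (illustrative)."""
    # Two-sided standard normal quantile, e.g. about 1.96 for conf_level=0.95
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p = n_positive / n_total
    denom = 1 + z**2 / n_total
    center = (p + z**2 / (2 * n_total)) / denom
    half_width = z * sqrt(p * (1 - p) / n_total + z**2 / (4 * n_total**2)) / denom
    return center - half_width, center + half_width


def newcombe_wilson_difference(
    n_positive: int,
    n_total: int,
    ref_positive: int,
    ref_total: int,
    conf_level: float = 0.95,
) -> Tuple[float, float]:
    """Newcombe's hybrid score interval for p1 - p2 (illustrative)."""
    p1 = n_positive / n_total
    p2 = ref_positive / ref_total
    l1, u1 = wilson_interval(n_positive, n_total, conf_level)
    l2, u2 = wilson_interval(ref_positive, ref_total, conf_level)
    delta = p1 - p2
    # Combine the one-sided distances of the two Wilson intervals
    lower = delta - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = delta + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper


if __name__ == "__main__":
    # Same counts as in the usage example above
    print(newcombe_wilson_difference(84, 101, 89, 105, conf_level=0.95))
```

The sketch uses only the Python standard library, so it can be run without installing anything and used as a rough sanity check on results.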

Implemented methods

Confidence intervals for binomial proportions


Method (type)	Implemented
wilson	✔️
wilson-cc	✔️
wald	✔️
wald-cc	✔️
agresti-coull	✔️
jeffreys	✔️
clopper-pearson	✔️
arcsine	✔️
logit	✔️
pratt	✔️
witting	✔️
mid-p	✔️
lik	✔️
blaker	✔️
modified-wilson	✔️
modified-jeffreys	✔️

Confidence intervals for difference of binomial proportions


Method (type)	Implemented
wilson	✔️
wilson-cc	✔️
wald	✔️
wald-cc	✔️
haldane	✔️
jeffreys-perks	✔️
mee	✔️
miettinen-nurminen	✔️
true-profile	✔️
hauck-anderson	✔️
agresti-caffo	✔️
carlin-louis	✔️
brown-li	✔️
brown-li-jeffrey	✔️
miettinen-nurminen-brown-li	✔️
exact	❌
mid-p	❌
santner-snell	❌
chan-zhang	❌
agresti-min	❌
wang	❌
pradhan-banerjee	❌

Creating report

One can use the make_risk_report function to create a report of the confidence intervals for difference of binomial proportions.

from diff_binom_confint import make_risk_report

# df_train and df_test are pandas.DataFrame objects providing the data
table = make_risk_report((df_train, df_test), target="binary_target")
# or, if df_data is a pandas.DataFrame containing both training and testing data
table = make_risk_report(df_data, target="binary_target")

For more details, see the corresponding documentation.

References

NOTE

Reference 1 contains errors in its descriptions of the Wilson CC, Mee, and Miettinen-Nurminen methods. The correct computation of Wilson CC is given in Reference 5. The correct computations of Mee and Miettinen-Nurminen are given in the code blocks in Reference 1.

Test data

Test data are

taken (with slight modifications, e.g. to the upper_bound of the miettinen-nurminen-brown-li method in the edge case file) from Reference 1, for automatically testing the correctness of the implemented algorithms.

generated using DescTools.StatsAndCIs via

library("DescTools")
library("data.table")

results = data.table()
for (m in c("wilson", "wald", "waldcc", "agresti-coull", "jeffreys",
                "modified wilson", "wilsoncc", "modified jeffreys",
                "clopper-pearson", "arcsine", "logit", "witting", "pratt",
                "midp", "lik", "blaker")){
    ci = BinomCI(84,101,method = m)
    new_row = data.table("method" = m, "ratio"=ci[1], "lower_bound" = ci[2], "upper_bound" = ci[3])
    results = rbindlist(list(results, new_row))
}
fwrite(results, "./test/test-data/example-84-101.csv")  # with manual slight adjustment of method names

taken from Reference 7 (Table II).

The filenames have the following patterns:

# for computing confidence interval for difference of binomial proportions
"example-(?P<n_positive>[\\d]+)-(?P<n_total>[\\d]+)-vs-(?P<ref_positive>[\\d]+)-(?P<ref_total>[\\d]+)\\.csv"

# for computing confidence interval for binomial proportions
"example-(?P<n_positive>[\\d]+)-(?P<n_total>[\\d]+)\\.csv"

Note that the out-of-range values (e.g. > 1) are left as empty values in the .csv files.
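As an illustration of how the filename patterns and the empty-value convention above could be handled, here is a small sketch using Python's standard `re` module. The helper names `parse_test_filename` and `parse_bound` are hypothetical and are not taken from this repository's test code.

```python
import re
from typing import Dict, Optional

# Same patterns as above, written as raw strings
DIFF_PATTERN = re.compile(
    r"example-(?P<n_positive>\d+)-(?P<n_total>\d+)"
    r"-vs-(?P<ref_positive>\d+)-(?P<ref_total>\d+)\.csv"
)
SINGLE_PATTERN = re.compile(
    r"example-(?P<n_positive>\d+)-(?P<n_total>\d+)\.csv"
)


def parse_test_filename(filename: str) -> Optional[Dict[str, int]]:
    """Extract the counts encoded in a test-data filename, or None if no match."""
    for pattern in (DIFF_PATTERN, SINGLE_PATTERN):
        match = pattern.fullmatch(filename)
        if match is not None:
            return {key: int(value) for key, value in match.groupdict().items()}
    return None


def parse_bound(cell: str) -> Optional[float]:
    """Convert a CSV cell to float; empty cells (out-of-range values) become None."""
    cell = cell.strip()
    return float(cell) if cell else None


if __name__ == "__main__":
    print(parse_test_filename("example-84-101-vs-89-105.csv"))
    print(parse_test_filename("example-84-101.csv"))
    print(parse_bound(""))
```

Using `fullmatch` keeps a difference-of-proportions filename from being misread as a single-proportion one.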

Known Issues

Edge cases are handled incorrectly by the true-profile method.
