sherbold/autorank

create_report produces incorrect description for approach="bayesian"

Closed this issue · 4 comments

The problem happens for the following example code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rc('text', usetex=False)
from autorank import autorank, plot_stats, create_report, latex_table


pd.set_option('display.max_columns', 7)


raw = np.array([[0.61874876, 0.61219062],
                [0.89017217, 0.90443957],
                [0.62806089, 0.63185734],
                [0.96929193, 0.97255931],
                [0.87340513, 0.95460121],
                [0.84087749, 0.94438674],
                [0.9863088 , 0.98558508],
                [0.94314842, 0.64510605],
                [0.9862604 , 0.99173966]])
data = pd.DataFrame()
data['pop_0'] = raw[:, 0]
data['pop_1'] = raw[:, 1]

result = autorank(data, alpha=0.05, verbose=False, approach="bayesian", rope=1.0, rope_mode="absolute")
create_report(result)

It always outputs:

We found significant and practically relevant differences between the populations pop_1 (MD=0.944+-0.190, MAD=0.061) and pop_0 (MD=0.890+-0.184, MAD=0.117).

even though the ROPE is 1.0, which is larger than any difference between the two populations in this data.
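As a quick sanity check on that claim, the paired differences in the example data can be compared against the ROPE directly. This is only a sketch using NumPy on the same `raw` array as above; the names `diffs` and `rope` are introduced here for illustration and are not part of autorank.

```python
import numpy as np

# Same data as in the example above
raw = np.array([[0.61874876, 0.61219062],
                [0.89017217, 0.90443957],
                [0.62806089, 0.63185734],
                [0.96929193, 0.97255931],
                [0.87340513, 0.95460121],
                [0.84087749, 0.94438674],
                [0.9863088 , 0.98558508],
                [0.94314842, 0.64510605],
                [0.9862604 , 0.99173966]])

rope = 1.0  # absolute ROPE passed to autorank in the example

# Absolute paired differences between pop_0 and pop_1
diffs = np.abs(raw[:, 0] - raw[:, 1])

# True if every paired difference lies inside the ROPE
all_within_rope = bool((diffs < rope).all())
print(diffs.max(), all_within_rope)
```

Since all values lie between 0 and 1, no paired difference can reach an absolute ROPE of 1.0, so a report of practically relevant differences is unexpected.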

The reason is that the following conditions expect the set of decisions to contain only 'inconclusive' or only 'equal':

if {'inconclusive'} == set(result.rankdf['decision']):
    print("We failed to find any conclusive evidence for differences between the populations "
          "%s." % create_population_string(result.rankdf.index, with_stats=True))
elif {'equal'} == set(result.rankdf['decision']):
    print(
        "All populations are equal, i.e., the are no significant and practically relevant differences "
        "between the populations %s." % create_population_string(result.rankdf.index,
                                                                  with_stats=True))

but the actual result is

>>> set(result.rankdf['decision'])
{'NA', 'equal'}
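The mismatch can be reproduced without autorank. The sketch below uses a hypothetical `decisions` series standing in for `result.rankdf['decision']`, and shows one possible way to make the comparison ignore the 'NA' entry. It is an illustration only and not necessarily how the library fixed the issue.

```python
import pandas as pd

# Hypothetical stand-in for result.rankdf['decision'] in the failing case
decisions = pd.Series(['NA', 'equal'], index=['pop_1', 'pop_0'])

# Exact set equality fails because of the extra 'NA' entry
exact_match = {'equal'} == set(decisions)

# One possible workaround: drop the 'NA' placeholder before comparing
effective = set(decisions) - {'NA'}
all_equal = effective == {'equal'}
all_inconclusive = effective == {'inconclusive'}

print(exact_match, all_equal, all_inconclusive)
```

With the 'NA' entry removed, the 'equal' branch would be taken instead of falling through to the "significant and practically relevant differences" message.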

Thanks for the report. I will look into this later this week and publish a bug fix release 1.1.1.

The bug should be fixed, and release 1.1.1 is published.

That was very quick. Thank you for the excellent tool!

Your report contained a test case, the exact location, and reason for the bug. You cannot make it much simpler for me 👍