Not enough unique values to generate halfviolin plot?
Closed this issue · 3 comments
Hi, @josesho,
I got the following error when there are only two unique values in one of the comparison groups:
"""
Traceback (most recent call last):
  File "/home/tongli/miniconda3/envs/maars/lib/python3.7/site-packages/dabest/_classes.py", line 1295, in plot
    out = EffectSizeDataFramePlotter(self, **all_kwargs)
  File "/home/tongli/miniconda3/envs/maars/lib/python3.7/site-packages/dabest/plotter.py", line 488, in EffectSizeDataFramePlotter
    halfviolin(v, fill_color=fc, alpha=halfviolin_alpha)
  File "/home/tongli/miniconda3/envs/maars/lib/python3.7/site-packages/dabest/plot_tools.py", line 17, in halfviolin
    V = b.get_paths()[0].vertices
IndexError: list index out of range
"""
It seems that DABEST needs at least three points to generate a half-violin plot. Is that so?
Concretely, I have several groups to be compared. Most of them have more than 3 unique values.
However, one group has a very flat distribution of data points: a lot of 0s and 1s, and nothing else.
Frankly, I'm not sure what is happening...
Do you think this is where the error comes from?
Best wishes
Tong
It is likely that the resulting bootstrap curve is essentially empty. Without access to your data (more specifically, dummy data with the same structure and Ns as your real data), I can't say much more.
First, I can confirm that this has nothing to do with the number of unique values.
Second, I obtained this weird plot. Do you have any idea how it was generated?
Third, I exported the dataset that gave the error. However, I was unable to reproduce the plot above... Here is the code I used:
import pandas as pd
import dabest
import matplotlib.pyplot as plt
df = pd.read_pickle("dummy_clean.pkl")
two_groups_unpaired = dabest.load(df, idx=list(df.columns), resamples=10000, ci=95)
two_groups_unpaired.cohens_d.plot()
plt.show()
PS: it's the mad2-DMSO group that triggers the plotting error.
It seems that I have too many NaNs in the dataset, which in turn triggers this error (during resampling? It does not happen every time. I'm purely guessing...)
/lib/python3.7/site-packages/dabest/_stats_tools/effsize.py:226: RuntimeWarning: invalid value encountered in double_scalars
return M / divisor
/lib/python3.7/site-packages/dabest/_stats_tools/confint_2group_diff.py:211: RuntimeWarning: invalid value encountered in less
prop_less_than_es = sum(B < effsize) / len(B)
Frankly, I have no idea what is happening...
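One plausible mechanism for the warnings above, not confirmed anywhere in this thread, is that some bootstrap resamples of a sparse 0/1 group end up with zero variance, which leaves Cohen's d undefined. The sketch below illustrates that idea using only the standard library. It is not DABEST's implementation: the functions `cohens_d` and `bootstrap_ds` and the toy data are hypothetical, chosen to mimic a group with very few non-NaN values.

```python
import math
import random


def cohens_d(control, test):
    """Cohen's d with a pooled standard deviation (hypothetical helper)."""
    n1, n2 = len(control), len(test)
    m1, m2 = sum(control) / n1, sum(test) / n2
    # Sample variances (divide by n - 1).
    v1 = sum((x - m1) ** 2 for x in control) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in test) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    if pooled_sd == 0:
        # Both groups are constant, so d is undefined; report NaN
        # instead of letting Python raise ZeroDivisionError.
        return float("nan")
    return (m2 - m1) / pooled_sd


def bootstrap_ds(control, test, resamples=2000, seed=0):
    """Resample each group with replacement and recompute d each time."""
    rng = random.Random(seed)
    results = []
    for _ in range(resamples):
        c = rng.choices(control, k=len(control))
        t = rng.choices(test, k=len(test))
        results.append(cohens_d(c, t))
    return results


# Toy data (assumption): what is left of two 0/1 groups after dropping NaNs.
control = [0, 0, 0, 1]
test = [0, 0, 1]

ds = bootstrap_ds(control, test)
n_nan = sum(math.isnan(d) for d in ds)
print(f"{n_nan} of {len(ds)} bootstrap effect sizes are NaN")
```

If a meaningful share of the bootstrap values is NaN, any downstream comparison or density estimate on them becomes unreliable, which would be consistent with the intermittent behaviour described above.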
In my hands (with dabest==0.2.8 and pandas==0.25.3), the code and the pickled dataset do produce a plot. I would suggest making sure your Python virtual environment meets the above requirements.
Feel free to reopen the issue if you still have problems.
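The version check suggested above can be scripted. This is a minimal sketch: the helper `check_versions` is hypothetical, and it assumes Python 3.8+ for `importlib.metadata` (on Python 3.7, the `importlib_metadata` backport provides the same functions).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported to work in this thread.
EXPECTED = {"dabest": "0.2.8", "pandas": "0.25.3"}


def check_versions(expected):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted, installed == wanted)
    return report


for name, (installed, wanted, ok) in check_versions(EXPECTED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: installed={installed}, expected={wanted} [{status}]")
```

Run it inside the same virtual environment that produces the error, so that it reports the packages actually being imported.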