[BUG] - parameters alpha, beta, gamma returned in Fit_Gamma_3P are not the same of Gamma_Distribution?
Closed this issue · 1 comments
Describe the bug
Parameters alpha
, beta
, gamma
returned by Fit_Gamma_3P
are not the same of Gamma_Distribution
?
Are maybe alpha
and beta
swapped?
To Reproduce
# imports
import numpy as np
from reliability.Fitters import Fit_Everything
from reliability.Distributions import Gamma_Distribution
import matplotlib.pyplot as plt
# dummy dataset
data = np.array([ 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1.5, 1.5,
1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5,
1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5,
1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5,
1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5,
1.5, 1.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5,
2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5,
2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5,
2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5,
2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5,
2.5, 2.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5,
3.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5,
4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5,
4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5,
4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5,
4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5,
4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5,
4.5, 4.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5,
5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5,
5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5,
5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5,
5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 6.5, 6.5, 6.5,
6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5,
6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 7.5, 7.5, 7.5, 7.5, 7.5,
7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5,
7.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5,
8.5, 8.5, 8.5, 8.5, 8.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5,
9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 10.5, 10.5, 10.5, 10.5,
10.5, 10.5, 10.5, 10.5, 11.5, 11.5, 11.5, 11.5, 11.5, 12.5, 12.5,
12.5, 12.5, 12.5, 13.5, 13.5, 13.5, 13.5, 13.5, 13.5, 13.5, 13.5,
13.5, 13.5, 14.5, 14.5, 14.5, 14.5, 14.5, 14.5, 14.5, 15.5, 15.5,
15.5, 15.5, 15.5, 15.5, 15.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5])
# plot dataset histogram
plt.hist(data, bins=17);
# Fitting with Fit_Everything gives Gamma_3P as best distribution
results = Fit_Everything(data)
# get alpha, beta, gamma parameters from Gamma_3P
results.results.iloc[0]
Distribution Gamma_3P
Alpha 7.65975
Beta 0.570525
Gamma 0.4999
Log-likelihood -1107.83
AICc 2221.71
BIC 2234.14
AD 46.2631
optimizer TNC
# create and plot distribution from estimated parameters
gamma_3P = Gamma_Distribution(
alpha=results.results.iloc[0].Alpha,
beta=results.results.iloc[0].Beta,
gamma=results.results.iloc[0].Gamma,
)
gamma_3P.PDF(xmax=18)
plt.hist(data, bins=18, density=True);
# swapping alpha and beta...
gamma_3P = Gamma_Distribution(
alpha=results.results.iloc[0].Beta,
beta=results.results.iloc[0].Alpha,
gamma=results.results.iloc[0].Gamma,
)
gamma_3P.PDF(xmax=18)
plt.hist(data, bins=18, density=True);
I'm not sure why you think the parameters are switched. The histogram plot with fitted PDF just peaks really high. If you let reliability plot it using results.best_distribution.PDF()
then the ylim will be set automatically.
Just because the PDF of the Gamma 3P doesn't follow the histogram does not mean that the fit is incorrect. Mathematically is is better than all the others so it should be chosen, but it is only the best of the 12 distributions in this package. It may just be that none of the available distributions are suited to the data. "But Gamma_3P is the best fitting distribution" I hear you say. That is true from a log likelihood perspective, but that doesn't mean that there isn't some other distribution out there that's better than Gamma_3P.
Out of interest, look at how well the Weibull Mixture models the histogram:
The dataset is interesting because of the non-zero start point, the few large peaks in the middle and the small peak in the tail.
Not many software packages implement Gamma 3P so I understand benchmarking it against other software is difficult. If you want to verify the accuracy of the Gamma_3P fit, then try drawing samples from a known distribution and refitting it like is done here.
Lastly, before I close this issue for not being a bug, I would like to point out that there is a much better way to access the parameters than using iloc and rebuilding the distribution. Do this instead:
results = Fit_Everything(data, show_histogram_plot=False,show_probability_plot=False,show_best_distribution_probability_plot=False,show_PP_plot=False)
results.best_distribution.PDF()
print("Alpha =",results.best_distribution.alpha)
The best fitting distribution is already built for you and you can extract parameters by name. Have a read of the API reference for Fit_Everything.
If you'd like to know more, please feel free to email alpha.reliability@gmail.com.