scikit-hep/hepstats

Tolerance for zfit model in compute_sweights

srishtibhasin opened this issue · 8 comments

Hi,

I have come across a strange issue when trying to use compute_sweights with a model from zfit.

This is a snippet of the error I see:

name        value    at limit
------  ---------  ----------
N_s82       619.8       False
N_b82         561       False
lamb82  -0.002628       False
Converged:  True
Valid:  True
Traceback (most recent call last):
  File "GBReweighting.py", line 282, in <module>
    folder, foldernum, mc_tos, mc_tis, data_tos, data_tis, sWeights_tos, sWeights_tis = get_data()
  File "GBReweighting.py", line 67, in get_data
    sWeights_tos = get_sWeights(run_period, 'tos', 'ref', data = data_tos, refit = True, apply_NN = False)
  File "/afs/cern.ch/work/s/sbhasin/Analysis/b2oc-AmAn_B02D0D0Kpi_Run12/utils/zfit_sWeights.py", line 136, in get_sWeights
    sweights = re_fit(tuple_path, global_vars, foldernum, trig, sig_or_ref, data, apply_NN)
  File "/afs/cern.ch/work/s/sbhasin/Analysis/b2oc-AmAn_B02D0D0Kpi_Run12/utils/zfit_sWeights.py", line 117, in re_fit
    weights = compute_sweights(model, mass)
  File "/afs/cern.ch/user/s/sbhasin/public/b02d0d0kpi_AmAn_virtualenv/lib/python3.6/site-packages/hepstats/splot/sweights.py", line 107, in compute_sweights
    "The model needs to fitted to input data in order to comput the sWeights."
hepstats.splot.exceptions.ModelNotFittedToData: The model needs to fitted to input data in order to comput the sWeights.

Clearly the model was fitted to the data, as the results are printed above, and the error only comes up for certain data and not for others (different run periods and trigger requirements).

I tried recreating the part of the hepstats code that raises the error, and I get, e.g.:

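import numpy as np

# `model` is the fitted extended zfit model, `mass` is the fitted data
models = model.get_models()
# pdf of each component at each event, shape (n_events, n_components)
p = np.vstack([m.pdf(mass) for m in models]).T
# extended (yield-weighted) total pdf at each event
Nx = model.ext_pdf(mass)
pN = p / Nx[:, None]
print(np.array(pN).sum(axis=0))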

which returns
[1.00044341 1.00166534]

So it looks like I'm just over the tolerance in this case. Do you know why this would be happening?
FYI, the model is a double Crystal Ball plus an exponential.

Thanks in advance!
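
For context on why these sums are expected to be close to 1: for an extended maximum-likelihood fit, setting the derivative of the log-likelihood with respect to each yield N_k to zero gives sum_i p_k(x_i) / sum_j N_j p_j(x_i) = 1, so any deviation reflects how close the minimiser got to the exact optimum. The toy below is only a sketch of that statement, not what zfit or hepstats do internally: the data, the fixed shapes, and the helper names (`component_pdfs`, `sweight_sums`, `fit_yields`) are all made up for illustration, and the yields are fitted with a simple EM-style fixed-point iteration rather than a gradient-based minimiser.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data on [0, 10]: a narrow Gaussian peak on top of a flat background.
x = np.concatenate([rng.normal(5.0, 0.3, 600), rng.uniform(0.0, 10.0, 500)])


def component_pdfs(x):
    """Return an (n_events, 2) array of fixed-shape pdf values.

    The Gaussian is not renormalised to [0, 10]; the truncation is
    negligible because the edges are more than 16 sigma from the mean.
    """
    sig = np.exp(-0.5 * ((x - 5.0) / 0.3) ** 2) / (0.3 * np.sqrt(2.0 * np.pi))
    bkg = np.full_like(x, 1.0 / 10.0)
    return np.column_stack([sig, bkg])


def sweight_sums(yields, p):
    """Compute sum_i p_k(x_i) / sum_j N_j p_j(x_i) for each component k."""
    total = p @ yields  # extended total pdf at each event
    return (p / total[:, None]).sum(axis=0)


def fit_yields(p, start, n_iter):
    """EM-style fixed-point iteration: N_k <- N_k * (sum for component k)."""
    yields = np.asarray(start, dtype=float)
    for _ in range(n_iter):
        yields = yields * sweight_sums(yields, p)
    return yields


p = component_pdfs(x)
start = [100.0, 1000.0]  # deliberately poor starting yields

sums_start = sweight_sums(np.asarray(start), p)
sums_early = sweight_sums(fit_yields(p, start, 3), p)
sums_converged = sweight_sums(fit_yields(p, start, 500), p)

print("start:    ", sums_start)
print("3 steps:  ", sums_early)
print("500 steps:", sums_converged)
```

The sums start far from 1 and approach 1 as the iteration converges, which suggests that small deviations like the ones above come from the minimiser stopping within its tolerance rather than from an unfitted model.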

Hi,
Yes indeed there is an absolute tolerance of 1e-3. Do you know by any chance the tolerance used in the minimiser (if you changed the default tolerance)?

Can you also give some information about the fit: does the result look good? How many events are fitted? ...

Hi,

I didn't change any of the defaults, and I'm using the Extended Unbinned NLL. Yes, all the fits look good and converge, and the results are pretty stable, etc. The datasets range from around 400 to 1500 events, and the ones that are above the tolerance are at either end of the range (i.e. it is not just the smaller datasets failing).

@marinang I think we've changed the minimizer tolerance to something larger, I guess 1e-4 indeed, to match RooFit. We could use something like 5e-3? Or do you think this is not sensible enough anymore? I mean, it's an EDM, which may well be underestimated.

@srishtibhasin roughly what numbers do you see? Do they deviate by more than 0.005 from 1?

The biggest difference I see is in my example above, 0.0017, so no bigger than 0.005
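
To make the comparison concrete, here is a small sketch that applies an absolute-tolerance check to the sums reported earlier in this thread. The helper `within_tolerance` is a hypothetical name used for illustration and is not taken from the hepstats source.

```python
import numpy as np

# Per-component sums reported earlier in this thread.
sums = np.array([1.00044341, 1.00166534])


def within_tolerance(sums, atol):
    """Return True if every per-component sum is within `atol` of 1."""
    return bool(np.all(np.abs(sums - 1.0) <= atol))


print(within_tolerance(sums, atol=1e-3))  # False: 0.00166534 exceeds 1e-3
print(within_tolerance(sums, atol=5e-3))  # True: both deviations are below 5e-3
```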

I guess it's safe to increase it (or make it a global flag?) since it is a safety check. E.g., if someone fits with a smaller tolerance, we don't want them to be unable to use it, right?

OK, let's increase the tolerance to 0.005, and I will replace the error with a warning message.
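
A minimal sketch of what a warning-based check could look like, assuming the 0.005 tolerance agreed on above. The function name `check_sweight_sums`, the message text, and the use of `RuntimeWarning` are assumptions for illustration; the actual hepstats change may differ.

```python
import warnings

import numpy as np


def check_sweight_sums(sums, atol=5e-3):
    """Warn, rather than raise, if any per-component sum deviates from 1.

    Returns True when all sums are within `atol` of 1, False otherwise.
    """
    sums = np.asarray(sums, dtype=float)
    max_dev = float(np.max(np.abs(sums - 1.0)))
    ok = max_dev <= atol
    if not ok:
        warnings.warn(
            f"Per-component sums deviate from 1 by up to {max_dev:.2g} "
            f"(tolerance {atol:.2g}); the model may not be fitted to this data.",
            RuntimeWarning,
        )
    return ok


# The values from this thread pass silently with the larger tolerance.
print(check_sweight_sums([1.00044341, 1.00166534]))
```

With a warning, a caller whose fit is slightly outside the tolerance still gets weights back and can decide whether the deviation matters.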

Hey @srishtibhasin, this issue is fixed in the newly released version of hepstats :).

@marinang great, thanks very much for the quick response!