scikit-hep/iminuit

Step size of fixed parameters changes after v2.12.1 due to heuristic being applied


Hi, the step size guess heuristic from #759 is also being applied to fixed parameters. As a result, the errors of fixed parameters change during MIGRAD. Previously, the step size was carried through unchanged. This allowed fixed parameters to be filtered out quickly when looking at best-fit values and errors, without having to refer back to a mask specifying which parameters are fixed.

Example setup:

from iminuit import Minuit


def f(x):
    return x[0] ** 2 + 0.5 * x[1] ** 2


m = Minuit(f, [1.0, 1.0])
m.fixed = [True, False]  # hold x0 constant
m.errors = [0.0, 0.1]  # zero step size for the fixed parameter
m.migrad()
print(m.errors.to_dict())

with iminuit==v2.12.1:

{'x0': 0.0, 'x1': 1.4142135622742678}

with iminuit==v2.12.2 (same behavior with v2.12.3.beta2):

[...]/iminuit/util.py:143: UserWarning: Assigned errors must be positive. Non-positive values are replaced by a heuristic.
  warnings.warn(
{'x0': 0.1, 'x1': 1.4142135622742678}

Assigning zero to errors is invalid, but in the past this was not caught. So in my view this is not a bug but a feature: the code now gives you a warning message.

How is this causing problems and why do you set errors to zero in the first place?

Setting Minuit.errors shouldn't be necessary: C++ Minuit2 largely ignores the initially set values, so the heuristic guess made by iminuit should be fine.

What pyhf and cabinetry have done so far is to set Minuit.errors to 0.0 for parameters that were known to be fixed in whatever fit was going to be run next. The .migrad call did not touch .errors. When subsequently listing all post-fit .values and .errors, the parameters that were held fixed would show up with errors of zero. That felt like an intuitive choice to me, as I would get zero standard deviation for a parameter when sampling a model in which I hold that parameter constant.

It is straightforward to work around this in code calling iminuit by replacing the .errors of fixed parameters with the desired value after a fit (sketched below). In some sense we were already working around the behavior before these changes, as not setting .errors initially would have resulted in a non-zero post-fit .errors value for fixed parameters.
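For illustration, a minimal sketch of that post-fit replacement, reusing the example from above; the dict comprehension is just one possible way to restore the zero-error convention:

from iminuit import Minuit


def f(x):
    return x[0] ** 2 + 0.5 * x[1] ** 2


m = Minuit(f, [1.0, 1.0])
m.fixed = [True, False]
m.migrad()

# replace the heuristic step sizes of fixed parameters with 0.0 post-fit
errors = {p: 0.0 if m.fixed[p] else m.errors[p] for p in m.parameters}
print(errors)  # {'x0': 0.0, 'x1': ...}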

The underlying issue in my view is that .errors is dual-use: it holds both step sizes and parameter errors. That leads to the values having a different meaning post-fit depending on whether or not a given parameter was held constant. Users of iminuit can of course take this into account if they are aware of this behavior, and libraries using iminuit can choose to provide versions of .errors with e.g. values set to 0.0 again after a fit.

Would a "correction" of the final .errors for fixed parameters be something to consider for iminuit, or is this out of scope given that in principle the information can be reconstructed by looking at .fixed?

@alexander-held Thank you for the additional discussion.

The underlying issue in my view is that .errors is dual-use: it holds both step sizes and parameter errors. That leads to the values having a different meaning post-fit depending on whether or not a given parameter was held constant. Users of iminuit can of course take this into account if they are aware of this behavior, and libraries using iminuit can choose to provide versions of .errors with e.g. values set to 0.0 again after a fit.

I fully agree with that analysis, but consistently using Minuit.fixed to fix parameters should be the "one obvious way" to do it (quoting the Zen of Python). Fixing parameters by setting their errors to zero seems intuitive at first, but I think we cannot rely on the C++ Minuit2 code to interpret a step size of zero as a fixed parameter, even if that has worked in the past (?). It is also an issue when you want to release the parameter again later, because then you also need to reset the error/initial step size to a non-zero value. In complicated fits, one often fixes parameters temporarily. Example of a signal + background fit (a sketch follows the list):

  1. Mask out the signal region and fix the signal parameters. Fit background parameters.
  2. Fix background parameters and release signal parameters. Unmask the signal region. Fit again.
  3. Release all parameters and fit again.
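A minimal sketch of this staged-fit pattern, with a hypothetical stand-in cost function (the actual masking and unmasking of the signal region is elided here):

from iminuit import Minuit


def cost(sig, bkg):
    # hypothetical placeholder for the masked/unmasked fits
    return (sig - 1.0) ** 2 + (bkg - 2.0) ** 2


m = Minuit(cost, sig=0.0, bkg=0.0)

# 1. fix the signal parameter, fit the background parameter
m.fixed["sig"] = True
m.migrad()

# 2. fix background, release signal, fit again
m.fixed["bkg"] = True
m.fixed["sig"] = False
m.migrad()

# 3. release all parameters and fit again
m.fixed = False
m.migrad()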

If iminuit is to support fixing parameters by setting errors to zero, we have to agree on a reasonable behavior for when the user runs migrad or another minimizer with parameters that have a zero step size but are not fixed. In that case, the step size should be set to the heuristic, I think.

I completely agree with using .fixed as the single way of setting parameters constant; I don't mean to propose looking up step sizes and fixing parameters whenever the step size is zero. I have previously also used .fixed (as in the small example above).

The only thing that has changed in my workflow is that previously I set step sizes to zero before fitting (in addition to specifying what I'd like to keep fixed via .fixed), and .errors was still zero for these fixed parameters after the fit. This is no longer the case, so now I instead read .errors into an array and override the values corresponding to fixed parameters with 0.0. I find this slightly less intuitive than before: previously I could fully configure Minuit and return it to the user, who would then automatically get these 0.0 values for fixed parameters when inspecting .errors. Now the user has to remember that there might be fixed parameters and interpret .errors accordingly.

I see the complication of repeated fits with new configurations where a zero step size is no longer appropriate. I agree that it makes sense to use the heuristic whenever a non-fixed parameter is encountered with zero step size.

What I would like to suggest considering is not applying the heuristic to parameters that are configured as fixed. The step size for those parameters does not matter for the fit, while in scenarios with multiple fits the heuristic could be applied before the next minimization whenever the .fixed configuration has changed.

Ah, but then the real issue is that m.errors should return 0 for fixed parameters? I agree that this would be nicer and more intuitive, but as you said it is not possible, since m.errors has this double use as actual errors and initial step sizes. I think I cannot resolve this awkwardness, which originates from how C++ Minuit2 works.

For the covariance matrix, where no such restrictions apply, the elements that correspond to fixed parameters are set to zero. So perhaps you could just change your workflow to use Minuit.covariance instead of Minuit.errors. That way, you also get the correlations. To get the errors, do np.sqrt(np.diag(m.covariance)), as in the sketch below.
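A short example of that readout, assuming a completed m.migrad() call as in the example above:

import numpy as np

# rows and columns of fixed parameters are zero in the covariance matrix,
# so the diagonal directly yields 0.0 errors for them
errors = np.sqrt(np.diag(m.covariance))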

What I would like to suggest considering is not applying the heuristic to parameters that are configured as fixed. The step size for those parameters does not matter for the fit, while in scenarios with multiple fits the heuristic could be applied before the next minimization whenever the .fixed configuration has changed.

That would be awkward in other ways. In OOP, we try to design interfaces so that objects always contain a valid state. A valid state for a step size is a positive value, not a zero value. That zero or negative values were accepted in the past was an error in the API.

I will try to implement #780 to make it a bit easier.

Ah, but then the real issue is that m.errors should return 0 for fixed parameters?

That would be ideal I think, but as you describe it seems not possible with the double use of m.errors. Just using the covariance matrix instead seems like a clean way to work around this, thanks for the suggestion!

That would be awkward in other ways. In OOP, we try to design interfaces so that objects always contain a valid state. A valid state for a step size is a positive value, not a zero value.

In my view, the zero step size is only invalid in a configuration where the corresponding parameters are not also fixed. If such parameters were allowed to float again in subsequent fits, their step size could in principle be set to a nonzero value again.

Ultimately it does not make a big difference whether a user sets the step size to zero before minimization (which worked in the past) or sets the corresponding m.errors entries to zero after minimization (which has worked in the past and still does now). #780 looks useful in that regard: I tried the m.errors[m.fixed] = 0 pattern described there at first, and then instead ended up using np.where(m.fixed, 0.0, m.errors), which works but is perhaps less readable (both shown below).
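For reference, the two patterns side by side; the in-place variant relies on masked assignment to the errors view, as discussed in #780:

import numpy as np

# in-place: overwrite the entries of fixed parameters on the view
m.errors[m.fixed] = 0

# copy: build a plain array with zeros in place of fixed-parameter errors
errors = np.where(m.fixed, 0.0, m.errors)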

I originally opened this issue to understand whether this effect on fixed parameters was intentional, so for me everything here is now resolved. Thank you!

In my view, the zero step size is only invalid in a configuration where the corresponding parameters are not also fixed. If such parameters were allowed to float again in subsequent fits, their step size could in principle be set to a nonzero value again.

You are right. I am taking the liberty here to make the API a bit simpler by requiring that errors/step sizes must always be positive, whether the parameter is fixed or not. I think that is easier to understand.

np.where(m.fixed, 0.0, m.errors)

I think that's a nice pattern, actually. I don't like np.where very much, but this is a good use case.

Thank you, too, for the discussion, I am closing this.