Gradient-descent-based fitting for non-Gaussian distributions
It would be useful to have a parameter inference function that does not assume Gaussian errors. Common examples are the binomial and Poisson distributions.
I'd like to start implementing this soon. However, I think it's worthwhile discussing how we want to do this. At first thought, the following issues come to mind:
- Should this be integrated with FitBase? If so, it might be good to write a second class with a compatible interface; we could then wrap the two so that the user-facing end is a simple argument (a rough sketch follows this list).
  a. The FitBase interface will probably need to be adapted to allow this.
- Do we want something that is specific to binomial distributions, or something more general?
- We will want some kind of parameter confidence interval. However, the covariance matrix is presumably no longer valid in this setting?
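To make the first point concrete, here is a rough sketch of what a likelihood-based sibling class might look like. The class and method names are illustrative assumptions on my part, not the actual FitBase interface:

```python
import numpy as np
from scipy.optimize import minimize


class BinomialLikelihoodFit:
    """Hypothetical FitBase-style fitter that maximises a binomial
    likelihood instead of minimising a Gaussian chi-squared."""

    def model(self, x, params):
        # Subclasses supply the model: success probability p(x; params).
        raise NotImplementedError

    def negative_log_likelihood(self, params, x, k, n):
        # k successes out of n trials at each x; the params-independent
        # binomial coefficient term is dropped, as it doesn't affect the fit.
        p = np.clip(self.model(x, params), 1e-9, 1 - 1e-9)
        return -np.sum(k * np.log(p) + (n - k) * np.log(1 - p))

    def fit(self, x, k, n, initial_params):
        result = minimize(self.negative_log_likelihood, initial_params,
                          args=(x, k, n), method="Nelder-Mead")
        return result.x
```

Wrapping this and the existing Gaussian fitter behind a single keyword argument would then keep the user-facing interface unchanged.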
Implementing a Bayesian calculation of P(Model|Data) would allow for a general maximum-likelihood calculation with a known confidence interval. I'd suggest defaulting to a flat prior, which the user may override.
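A minimal sketch of the single-parameter case, evaluating the posterior on a grid (function name and defaults are illustrative): with a flat prior the posterior is simply proportional to the binomial likelihood, and a credible interval falls out of the posterior CDF.

```python
import numpy as np
from scipy.stats import binom


def credible_interval(k, n, level=0.68, log_prior=None):
    """Central credible interval for a binomial success probability,
    from a grid posterior over p. With log_prior=None the prior is
    flat, so the posterior is proportional to the likelihood; a
    user-supplied log_prior(p) overrides this."""
    p_grid = np.linspace(1e-6, 1 - 1e-6, 10_000)
    log_post = binom.logpmf(k, n, p_grid)
    if log_prior is not None:
        log_post += log_prior(p_grid)
    post = np.exp(log_post - log_post.max())
    cdf = np.cumsum(post)
    cdf /= cdf[-1]
    lo = p_grid[np.searchsorted(cdf, (1 - level) / 2)]
    hi = p_grid[np.searchsorted(cdf, (1 + level) / 2)]
    return lo, hi
```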
After discussing with @hartytp, I think there are really 3 issues:
- Local fit: likelihood optimisation of binomially (or otherwise) distributed data.
- Global fit: likelihood optimisation of binomially (or otherwise) distributed data (see the sketch after this list).
- Calculating the confidence interval for a single binomial data point (for plotting purposes).
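For the global case, one option (an assumption on my part, not a settled choice) is to hand the same binomial negative log-likelihood to a global optimiser such as scipy's `differential_evolution`:

```python
import numpy as np
from scipy.optimize import differential_evolution


def fit_binomial_global(model, x, k, n, bounds):
    """Global maximum-likelihood fit of binomially distributed data.

    model(x, params) -> success probabilities in (0, 1); bounds is a
    list of (low, high) tuples, one per parameter. differential_evolution
    is just one choice of global optimiser here."""
    def nll(params):
        p = np.clip(model(x, params), 1e-9, 1 - 1e-9)
        return -np.sum(k * np.log(p) + (n - k) * np.log(1 - p))

    result = differential_evolution(nll, bounds)
    return result.x, result.fun
```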
> 3. Calculating the confidence interval for a single binomial data point (for plotting purposes).
This post seems to suggest the right thing.
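In case the link goes stale: one standard choice (which may or may not be what that post describes) is the exact Clopper-Pearson interval, computable from beta-distribution quantiles:

```python
from scipy.stats import beta


def clopper_pearson(k, n, alpha=0.32):
    """Exact (Clopper-Pearson) confidence interval for a binomial
    proportion: k successes out of n trials, 1 - alpha coverage.
    alpha=0.32 gives roughly a one-sigma-equivalent interval."""
    lo = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lo, hi
```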
> Local minimisation of binomially (or otherwise) distributed data.
> Global minimisation of binomially (or otherwise) distributed data.
Regarding these, as mentioned before, the only case where the distribution really ends up mattering is for acquisition-time-limited problems like state tomography. For that, established codes for MLE/Bayesian estimation already exist (oitg.circuits has some MLE code; I've recently done Bayesian estimates for the remote ion-ion data using Tomographer).

For other online calibration problems, it's typically easier to just acquire a bit more data. If one is data-rate-bound, then adaptive/online Bayesian methods (choosing e.g. Ramsey delay times based on prior data), where 1/N scaling in the error can often be achieved, are the way to go.
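To illustrate the adaptive idea (a toy sketch, not any of the codes mentioned above): maintain a grid posterior over the detuning and pick each Ramsey delay from the current posterior width.

```python
import numpy as np

rng = np.random.default_rng(0)


def ramsey_p(f, t):
    """Probability of the bright outcome after Ramsey delay t at detuning f."""
    return 0.5 * (1 + np.cos(2 * np.pi * f * t))


def adaptive_ramsey(true_f, n_shots=100, f_max=1e3):
    """Toy adaptive Bayesian frequency estimation on a grid posterior.

    Each Ramsey delay is chosen from the current posterior width (a
    common heuristic), which is what gives the near-1/N error scaling
    mentioned above. Returns (posterior mean, posterior std)."""
    f_grid = np.linspace(0.0, f_max, 2000)
    log_post = np.zeros_like(f_grid)  # flat prior over the detuning

    def moments():
        post = np.exp(log_post - log_post.max())
        post /= post.sum()
        mean = np.sum(post * f_grid)
        std = np.sqrt(np.sum(post * (f_grid - mean) ** 2))
        return mean, std

    for _ in range(n_shots):
        _, std = moments()
        t = 1.0 / (4.0 * max(std, 1e-9))  # heuristic: probe at current precision
        outcome = rng.random() < ramsey_p(true_f, t)  # simulate one shot
        p = np.clip(ramsey_p(f_grid, t), 1e-12, 1 - 1e-12)
        log_post += np.log(p if outcome else 1.0 - p)  # Bayesian update

    return moments()
```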