Predict distributions over point estimates
harell opened this issue · 1 comments
harell commented
Outline
- Why should I care
- Working with distribution objects
- Generating a distribution from a point estimate
- Generating a distribution from a point estimate and standard deviation
- Generating a distribution from bootstrap sampling
- Conclusion
Motivation
- Applications, quantile point estimates, scaling
- Keeping important information in a succinct form. For example, the normal distribution requires only two parameters, the mean and s.d.
- Imposing business rules
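
To make the "two parameters" point concrete, here is a minimal sketch in Python using the standard library's `statistics.NormalDist`. The language choice, the car-weight numbers, and the unit conversion are assumptions made for illustration only; they are not taken from the thread.

```python
from statistics import NormalDist

# Hypothetical prediction: car weight of 3.2 (in 1000 lbs) with s.d. 0.4.
weight = NormalDist(mu=3.2, sigma=0.4)

# Quantile point estimates all come from the same two parameters.
median = weight.inv_cdf(0.5)
lower, upper = weight.inv_cdf(0.05), weight.inv_cdf(0.95)

# Scaling: converting 1000 lbs to kg rescales both the mean and the s.d.
weight_kg = weight * 453.592
```

Storing the mean and s.d. is enough to recover any quantile later, which is what makes the distribution a succinct summary.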
Working with distribution objects
- What is a distribution object?
- What operations can we perform on a distribution object?
- How can we include distribution objects in our current workflow?
- Distribution objects can be stored in a single column of a data.frame
- purrr/dplyr operations on a column (e.g. taking the mean as a point estimate)
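
The workflow idea above can be sketched as follows. This is an illustrative Python analogue, not the purrr/dplyr code the outline refers to: the table is a plain list of rows, and the car names and `mpg` predictions are hypothetical.

```python
from statistics import NormalDist

# Hypothetical table: one row per car, with the prediction stored as a
# distribution object in a single "mpg" column.
rows = [
    {"car": "A", "mpg": NormalDist(21.0, 2.0)},
    {"car": "B", "mpg": NormalDist(30.5, 3.5)},
]

# Column-wise operations, analogous to mapping a function over a list column.
point_estimates = [row["mpg"].mean for row in rows]
prob_above_25 = [1 - row["mpg"].cdf(25.0) for row in rows]
```

Collapsing the column to a point estimate is a single mapping step, while the distribution column still supports other questions, such as the probability of exceeding a threshold.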
Generating a distribution object
- Calculating a distribution empirically
- Incorporating prior knowledge, e.g. cars cannot have negative weight, or the number of gears is a positive integer
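
One possible way to combine the two ideas above is a bootstrap followed by a constraint step. The sketch below is written under stated assumptions: the sample values are made up, the helper names `bootstrap_means` and `clamp_non_negative` are hypothetical, and clamping at zero is only one simple way to encode "weight cannot be negative".

```python
import random
from statistics import mean, quantiles

def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Empirical distribution of the mean via resampling with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

def clamp_non_negative(draws):
    """Prior knowledge as a rule: replace negative draws with zero."""
    return [max(0.0, d) for d in draws]

# Hypothetical observed car weights (in 1000 lbs).
weights = [2.6, 2.9, 3.2, 3.4, 3.4, 3.6, 5.3]
draws = clamp_non_negative(bootstrap_means(weights))

# Summaries derived from the empirical distribution.
point_estimate = mean(draws)
cuts = quantiles(draws, n=20)    # 19 cut points: 5%, 10%, ..., 95%
interval = (cuts[0], cuts[-1])   # 5% and 95% quantiles
```

The list of draws is itself the distribution object here; the point estimate and the interval are derived from it only when needed.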
Conclusion
- Committing to point estimates early in a project is premature. Moving from a distribution to a point estimate is one operation away, whereas moving in the opposite direction requires substantial changes to the project structure.
- By default, most learning algorithms (with the exception of fable) return point estimates. Changing that default is an opt-in action.