ArtPoon/kamphir

Use prior distributions in calculating kernel score

Closed this issue · 2 comments

Prior distributions are specified as strings in JSON that reference scipy.stats objects.
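A minimal sketch of how such a JSON specification could be turned into scipy.stats objects. The JSON layout (`dist`, `args`, `kwargs`), the parameter names `N` and `p`, and the helpers `load_priors` and `log_prior` are all assumptions made for illustration, not the format kamphir actually uses. How the log prior is combined with the kernel score is left out here.

```python
import json
from scipy import stats

# Hypothetical JSON layout: parameter name -> distribution name plus arguments.
PRIORS_JSON = """
{
  "N": {"dist": "lognorm", "args": [0.5], "kwargs": {"scale": 10000}},
  "p": {"dist": "beta", "args": [2, 5], "kwargs": {}}
}
"""

def load_priors(text):
    """Turn the JSON spec into a dict of frozen scipy.stats distributions."""
    spec = json.loads(text)
    priors = {}
    for name, entry in spec.items():
        dist = getattr(stats, entry["dist"])  # e.g. stats.lognorm
        priors[name] = dist(*entry.get("args", []), **entry.get("kwargs", {}))
    return priors

def log_prior(priors, params):
    """Sum of log prior densities over the given parameter values."""
    return sum(priors[name].logpdf(value) for name, value in params.items())

priors = load_priors(PRIORS_JSON)
print(log_prior(priors, {"N": 12000, "p": 0.3}))
```

Looking the distribution up by name with `getattr` keeps the JSON free of Python code, at the cost of accepting any attribute name in `scipy.stats`; a whitelist of allowed distribution names would be a reasonable safeguard.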

scipy.stats distributions are bloody confusing and the documentation is difficult to comprehend.

  • lognorm(shape, scale=x). For some reason I have had no success in specifying shape with a keyword (scipy names this parameter s, not shape, so s=0.5 should be accepted). Based on this StackOverflow post, this first positional argument seems to correspond to sdlog, and scale corresponds to exp(meanlog), which is the median. Thus, setting the shape to a very small value (0.0001) gives you a tight distribution around scale. Setting lognorm(0.5, scale=10000) gives an IQR of about 7000 to 14,000, and a 95% CI of about 4000 to 27,000. lognorm(1.0, scale=0.01) gives an IQR of about 0.005 to 0.02 and a 95% CI of about 0.001 to 0.07. Note that loc shifts the distribution, as it does for any other continuous distribution.
  • norm(loc, scale). This has been more straightforward. loc is simply the offset that determines the mean and scale is the standard deviation.
  • uniform(loc, scale). This is where the reuse of keyword arguments gets a bit obscure. loc is the lower limit and scale is the width of the interval. Thus, uniform(10, 20) samples points uniformly from the interval [10, 30].
  • beta(alpha, beta). alpha and beta are positional arguments (scipy names them a and b). loc and scale can be used to shift and rescale the distribution, respectively.
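The parameterizations above can be checked directly with a short script. This is only a sanity check using frozen scipy.stats distributions; the specific values for norm and beta are arbitrary choices for illustration.

```python
from scipy import stats

# lognorm: the first argument (named s) is sdlog; scale is exp(meanlog), the median.
ln = stats.lognorm(0.5, scale=10000)
ln_kw = stats.lognorm(s=0.5, scale=10000)  # same distribution, shape given by keyword
print("lognorm median:", ln.median(), ln_kw.median())
print("lognorm IQR:", ln.ppf([0.25, 0.75]))
print("lognorm 95% interval:", ln.interval(0.95))

# norm: loc is the mean, scale is the standard deviation.
n = stats.norm(loc=5, scale=2)
print("norm mean, sd:", n.mean(), n.std())

# uniform: the support is [loc, loc + scale], so this covers [10, 30].
u = stats.uniform(10, 20)
print("uniform limits:", u.ppf(0), u.ppf(1))

# beta: shape parameters a and b; loc and scale move the support off [0, 1].
b = stats.beta(2, 5, loc=1, scale=4)
print("beta limits:", b.ppf(0), b.ppf(1))
```

Printing quantiles with `ppf` is a quick way to confirm that a prior covers the intended range before using it.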