brian-j-smith/Mamba.jl

Sanity check for a beginner

aus70 opened this issue · 10 comments

aus70 commented

I recently started using Mamba and I'm also a newbie in the Bayesian area. I'd really appreciate if an expert could check the code below and let me know if there are issues in my reasoning and/or in the code itself. The goal is to model a noisy linear system and use the model to extrapolate the system's output. I assume the model behaves consistently linearly in any range of the independent variable. Thanks!

using Mamba

a = linspace(0, 10, 100) # Independent variable
b = 3 .* a .+ randn(length(a)) # Dependent variable: The modeled system is linear + some noise
c = linspace(100, 200, 100) # I need to predict (extrapolate) the system's output in this range 

## Data
data = Dict{Symbol, Any}(
    :x => a,
    :y => b,
    :z => c
)
data[:N] = length(data[:y])
data[:M] = length(data[:z])

## Model Specification
model = Model(

  y = Stochastic(1,
    (x, beta, N) ->
        UnivariateDistribution[
            begin
                Normal(beta * x[i], 1)
            end
            for i in 1:N
        ],
    false
  ),

  beta = Stochastic(
    () -> Uniform(1, 5), # ~~not~~ very informative
    true
  ),

  test = Logical(1, # here I compute the extrapolated data points
    (z, beta, M) -> 
        [
            begin
                rand(Normal(beta * z[i], 1))
            end
            for i in 1:M
        ],
    true
  ),

)

## Initial Values
inits = [
    Dict{Symbol, Any}(:y => data[:y], :beta => 3.0) # pretty good guess
]

## Sampling Scheme
scheme = [Slice(:beta, 1.0, Univariate)]
setsamplers!(model, scheme)

## Compute just a few samples
sim = mcmc(model, data, inits, 2500, burnin=250, thin=2, chains=1)
describe(sim)

## Let's see what the system does when the input is between 110 and 120
p = sim[:, vcat(false, 110 .<= c .<= 120), 1] # select all prediction for 110 <= c <= 120
mean(p.value) # that's a value I might want to base a decision on

The code looks okay to me. I did not run it, but since you didn't mention any errors I assume it ran just fine. I would point out that your prior on beta is quite informative. Also its always a good idea to run a few independent chains (from different starting points) and to check convergence diagnostics to ensure you are obtaining enough samples to do inference.

aus70 commented

Thanks! The code runs well, and I'm aware that I should be running multiple chains, but I'm more interested in checking if what I'm doing makes sense at all. About the beta: I tried to strikethrough the not in the code... with limited success.

In this specific case, the Mamba function predict() would provide the same sort of prediction I obtain with the logical test variable, but limited to the in-sample data-points: is my understanding correct?

Yes, I believe that is correct.

aus70 commented

Great, thanks! I'm now probably abusing your patience, but I have a question regarding the samplers: is there a heuristic for setting the scale factor of, for example, the slice sampler? Is visual inspection of the values chart (as shown in Roberts & Rosenthal, Optimal Scaling for Various Metropolis–Hastings Algorithms, p. 353) a good practice?

For the Slice sampler, in general, increasing the scale parameter will increase the runtime of the sampler but it will allow the sampler to explore more of the posterior. This is just an issue when your posterior is quite diffuse over your parameter space (or at worst if your posterior is highly multi-modal) in which case the Slice sampler could easily get stuck in one area of high density and not explore others.

In your simple example, the posterior will be Gaussian and uni-modal so the scale parameter probably won't make too much of an impact.

aus70 commented

So, if I got it right, manual tuning the scale factor by visual inspection of the sample chart is actually a viable approach?

It's a good start, but it might be difficult to identify in those hard cases (multi-modal).

aus70 commented

If the posterior is multi-modal, does it make sense to start with a large scale-factor and decreasing it until it affects the shape of the posterior, so to trade-off speed and exploration?

There's no automatic slice width adaptation built into Mamba currently, so you would have to do something like that manually. But anyways that would not solve the problem.

Look at the picture on the right:

the black vertical is the current state, and the red horizontal represents the slice parameter. The next state is chosen from inside of this horizontal bracket. You can see that as this horizontal gets smaller it will not be inside of the part of the density where the 2nd mode is at.

Specifically, take a look at fig 15 of Neal's original Slice paper. He says retrospective tuning is only possible for univariate and unimodal slice sampling.

aus70 commented

Thank you for your help and patience!