
Cookbook — Bayesian Modelling with PyMC3 | Eigenfoo

utterances-bot opened this issue · 5 comments


This is a compilation of notes, tips, tricks and recipes for Bayesian modelling that I've collected from everywhere: papers, documentation, and peppering my more experienced colleagues with questions.

https://eigenfoo.xyz/bayesian-modelling-cookbook/

Hi George, what are your thoughts about dynamic Bayesian networks, if that is something you have explored before?

I haven't personally implemented a dynamic Bayesian network before, but I've definitely seen them in use and in production. Check out this PyMC3 example notebook for a simple example, Thomas Wiecki's blog post for a really cool example, and the PyMC3 docs to see how PyMC3 supports temporal modeling through random walk processes.

As far as I can see, there are two ways to model sequential/time series data: 1) you could use a non-stationary model, such as a random walk, or 2) if you know what kind of time-dependent behavior to expect from your data, you could build that into your model. E.g. if you expect your data to get more spread out as time passes, you can model the variance as a linear function of time. Of course, it's a modelling decision, so it depends on your situation!
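To make that concrete, here is a minimal PyMC3 sketch (the data, variable names, and priors are all hypothetical, not from the original post) that combines both ideas: a Gaussian random walk for the latent level, and observation noise that grows linearly with time:

import numpy as np
import pymc3 as pm

# Hypothetical data: 100 observations whose spread grows over time
t = np.linspace(0, 1, 100)
y = np.random.normal(loc=0.0, scale=0.5 + 1.5 * t)

with pm.Model() as model:
    # 1) Non-stationary latent level, modelled as a random walk
    level = pm.GaussianRandomWalk('level', sd=0.1, shape=len(t))

    # 2) Spread modelled as a linear function of time
    sd_intercept = pm.HalfNormal('sd_intercept', sd=1.0)
    sd_slope = pm.HalfNormal('sd_slope', sd=1.0)
    sd_t = sd_intercept + sd_slope * t

    pm.Normal('y', mu=level, sd=sd_t, observed=y)
    trace = pm.sample()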

NOTE (01-08-19): I've migrated from Disqus to utterances to provide comments, so I've lost almost all of the comments on my blog posts. I just thought this comment was worth saving (which was in response to the quoted question), so I copied and pasted it back in.

brews commented

Hey George. Love the post. Very helpful.

For your 'centered parameterization for one portion and a non-centered parameterization for the other portion' idiom: should mu_x_sd be x_sd where you define x_raw's sd?

Good catch @brews! I've fixed the typo in #36: thanks so much for pointing that out!
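For anyone reading along, here is a minimal sketch of how the corrected idiom might look (only the variable names come from the thread; the hyperprior choices and shape are made up for illustration). The group mean mu_x stays centered, while x is non-centered in its location, with x_raw scaled by x_sd rather than mu_x_sd:

import pymc3 as pm

with pm.Model() as model:
    # Centered portion: the group-level mean and its scale
    mu_x_sd = pm.HalfNormal('mu_x_sd', sd=1.0)
    mu_x = pm.Normal('mu_x', mu=0.0, sd=mu_x_sd)

    # Non-centered portion: offsets from the group mean.
    # The fix: x_raw's sd is x_sd, not mu_x_sd.
    x_sd = pm.HalfNormal('x_sd', sd=1.0)
    x_raw = pm.Normal('x_raw', mu=0.0, sd=x_sd, shape=8)
    x = pm.Deterministic('x', mu_x + x_raw)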

Great post.

It should be:

# Display the number and percentage of divergent samples
diverging = trace['diverging']
n_diverging = diverging.nonzero()[0].size
print('Number of divergent samples: {}'.format(n_diverging))
# Divergences are flagged per sample, so normalize by the total draw count
diverging_perc = n_diverging / diverging.size * 100
print('Percentage of divergent samples: {:.1f}%'.format(diverging_perc))

Also, what level of diverging_perc is acceptable?

Thanks for the catch @jonathanng! I fixed the typo in #39.

As to an acceptable diverging_perc, it should be exactly 0, at least for some suitably high acceptance probability. The real worry is when increasing the acceptance probability does not eliminate all divergences. To quote this excellent PyStan workflow tutorial:

Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives. ... In order to argue that divergences are only false positives, the divergences have to be completely eliminated for some [acceptance probability] sufficiently close to 1.
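In PyMC3 that re-run looks something like the following (0.99 is an arbitrary choice; the default target acceptance probability is 0.8):

import pymc3 as pm

with model:  # the model whose trace showed divergences
    # A higher target acceptance probability forces smaller step sizes.
    # Real pathologies will keep diverging; false positives should vanish.
    trace = pm.sample(draws=2000, tune=2000, target_accept=0.99)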