TobitSUN

Bayesian inference for the standard tobit model: an i.i.d. sampler from the exact SUN posterior and approximate inference


Bayesian Conjugacy for Tobit Regression via Unified Skew-Normal Random Variables: Exact and Approximate Inference

This repository is associated with the article Bayesian conjugacy in probit, tobit, multinomial probit and extensions: A review and new results. The key contribution of the paper is outlined below.

In this article we review, unify and extend recent advances in Bayesian inference and computation for a core class of statistical models relying on partially or fully discretized latent Gaussian utilities (e.g., probit regression, multinomial probit, dynamic multivariate probit, probit Gaussian processes, skewed Gaussian processes, tobit models, and several extensions of these constructions to multivariate, skewed, non-linear and dynamic contexts). To this end, we prove that the likelihoods induced by these formulations share a common analytical structure implying conjugacy with a broad class of distributions, namely the unified skew-normals (SUN), which generalize Gaussians to skewed contexts. This result unifies and extends recent conjugacy properties for specific models within the class analyzed, and opens avenues for improved posterior inference, under a broader class of formulations and priors, via novel closed-form expressions, i.i.d. samplers from the exact SUN posteriors, and more accurate and scalable approximations from variational Bayes (VB) and expectation propagation (EP).

Consistent with Section 5 of the article, this repository provides code and tutorials to implement the inference methods discussed in the review in the specific case of tobit regression. The complete tutorial can be found in the file ApplicationTutorial.md, where we also provide details on how to generate the synthetic data analyzed.
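The exact data-generating mechanism used in the analyses is documented in ApplicationTutorial.md. As a quick orientation, a standard tobit model generates a latent Gaussian response and censors it at zero; a minimal sketch follows, where all dimensions and parameter values are illustrative placeholders rather than the settings of the tutorial:

```r
# Minimal sketch of a standard tobit data-generating process
# (dimensions and coefficients below are placeholders, not the
# settings used in ApplicationTutorial.md)
set.seed(123)

n <- 100                                             # observations (placeholder)
p <- 5                                               # covariates (placeholder)
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # design matrix with intercept
beta <- rnorm(p)                                     # true regression coefficients

y_star <- X %*% beta + rnorm(n)  # latent Gaussian utilities
y <- pmax(y_star, 0)             # tobit censoring at zero: y = max(y*, 0)
```

Observations with y_star below zero are recorded as exact zeros, which is precisely the partial discretization of latent Gaussian utilities that places the tobit likelihood within the SUN-conjugate class.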

  • Sampling-based methods: The first part of the analysis compares the performance of an i.i.d. sampler from the exact SUN posterior (i.i.d.) with that of a routinely implemented, state-of-the-art MCMC competitor, namely the No-U-Turn Sampler (NUTS) variant of Hamiltonian Monte Carlo.
  • Deterministic approximations: The second part focuses instead on comparing the different deterministic approximations of the exact posterior discussed in the article: mean-field variational Bayes (MF-VB), partially factorized mean-field variational Bayes (PFM-VB), and expectation propagation (EP). We use the output of the i.i.d. sampler as ground truth to empirically validate the accuracy of these approximate methods. Notably, the EP routine in this repository leverages a novel efficient implementation proposed in the article, which crucially improves scalability to high-dimensional scenarios relative to alternative implementations in the literature.
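As background on why exact i.i.d. sampling is feasible, the SUN admits an additive stochastic representation combining a multivariate Gaussian with a multivariate truncated normal. The sketch below illustrates this representation in isolation; it is not the code in functionsTobit.R, it assumes the mvtnorm and TruncatedNormal packages are available, and every SUN parameter (Omega_bar, Delta, Gamma, xi, omega, gam) is an arbitrary placeholder rather than a posterior quantity computed in the paper:

```r
# Sketch of i.i.d. sampling from a SUN via its additive representation:
# Z = xi + omega * (U0 + Delta %*% solve(Gamma) %*% U1), with U0 Gaussian
# and U1 truncated normal. All parameter values are placeholders.
library(mvtnorm)          # rmvnorm: multivariate normal draws
library(TruncatedNormal)  # rtmvnorm: multivariate truncated normal draws

set.seed(1)
p <- 2  # parameter dimension (placeholder)
h <- 3  # truncation dimension (placeholder)

# Build a valid joint (p + h) x (p + h) correlation matrix and partition it,
# so that (Omega_bar, Delta, Gamma) are mutually compatible
A <- matrix(rnorm((p + h)^2), p + h)
R <- cov2cor(crossprod(A) + diag(p + h))
Omega_bar <- R[1:p, 1:p, drop = FALSE]
Delta     <- R[1:p, (p + 1):(p + h), drop = FALSE]
Gamma     <- R[(p + 1):(p + h), (p + 1):(p + h), drop = FALSE]

xi    <- rep(0, p)  # location (placeholder)
omega <- rep(1, p)  # scales (placeholder)
gam   <- rep(0, h)  # truncation shift (placeholder)

n_samp <- 1000
# Truncated component: N_h(0, Gamma) constrained to exceed -gam
U1 <- rtmvnorm(n_samp, mu = rep(0, h), sigma = Gamma,
               lb = -gam, ub = rep(Inf, h))
# Independent Gaussian component with the Schur-complement covariance
V  <- Omega_bar - Delta %*% solve(Gamma, t(Delta))
U0 <- rmvnorm(n_samp, sigma = V)
# Combine the two components row-wise into n_samp i.i.d. SUN draws
Z  <- matrix(xi, n_samp, p, byrow = TRUE) +
      sweep(U0 + U1 %*% solve(Gamma, t(Delta)), 2, omega, "*")
```

Because both components are drawn exactly (the truncated normal via minimax-tilting accept-reject), the resulting draws are independent, in contrast to the autocorrelated chains produced by MCMC schemes such as NUTS.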

The functions implementing the above methods can be found in the R source file functionsTobit.R, and a tutorial explaining their usage in detail is available in the file functionsTutorial.md. We refer to Section 4 of the article for details on all the computational methods implemented here.

All the analyses were performed on a MacBook Pro (macOS Big Sur 11.6.8, 2.7 GHz Intel Core i7, 16 GB RAM) running R version 4.1.0.

IMPORTANT: Although a seed is set at the beginning of each routine, the outputs reported in ApplicationTutorial.md may vary slightly depending on the versions of the R packages used, since internal details of certain functions can change across package releases. These minor variations are negligible and do not affect the conclusions.