gaow/mvarbvs

mash, mvash, varbvs, BSLMM and, m2ash

Opened this issue · 10 comments

gaow commented

What we have so far

  • varbvs (due to @pcarbo) implements Carbonetto and Stephens 2012
    • sparse regression via spike-slab prior, assuming small proportion of variants affecting the phenotype
    • variational inference
  • BSLMM due to Zhou, Carbonetto and Stephens 2013 is a hybrid of LMM and varbvs that adaptively determines sparsity in data
    • MCMC inference (implemented in GEMMA)
    • @pcarbo has implemented a variational inference version of this model (code not currently in github repo)
  • ash, Stephens 2016, proposed a more general unimodal prior -- one implementation is a K mixture of gaussians.
  • mash, due to Urbut, Wang and Stephens (in preparation), is a multivariate version of ash (summary statistics betahat is a matrix, not a vector).
  • mvash, due to @NKweiwang, is an R implementation that extends varbvs to allow for K mixture components (beyond the "spike"), and corresponds to the ash prior.

What we are planning on

  • @pcarbo will implement ash prior in varbvs. This will involve incorporating Wei's code into existing varbvs package, adding some bells & whistles, and aligning the code with the existing interface. Although not directly useful for this project, it will allow Gao to error-check his implementation against the special case of a single phenotype.
  • @pcarbo One possibility is to extend this model slightly to allow for the case when all variables have an effect. This would be a generalization of the BSLMM model. Again, it is not directly useful for this project, but seems natural to do this.
  • @gaow needs to handle the multiple phenotype case for multiple SNPs, which is a multivariate, multiple regression problem using ash prior and VB for inference. Calling it m2ash for now, so m2ash = mvash + mash (short for multi-SNP, multivariate phenotype model with adaptive shrinkage priors -Peter). Prior from Flutre et al 2013 can also be included in this framework. This will be the most general form.
  • For m2ash we may want to implement a MCMC version for (1) verification of the variational approximation, (2) to infer more fine-grained association signals for individual genes, and (3) have a generally useful MCMC algorithm that could be used to implement inference for a variety of models (e.g., BSLMM) and priors (e.g., different pi's for different SNPs based on pathway annotations), at least in "toy" examples in which the data set is small.
  • Note that we are assuming the available of individual-level ("raw") data, not summary statistics (e.g., unlike Xiang's RSS work).
gaow commented

One question is that do we need one single most general implementation, or we should still have multiple versions of essentially the same thing. I think it still worth having both. For instance mvash would become a special case for both varbvs and m2ash and we can check the correctness of implementation.

Yes, I think it is useful to have both m2ash and mvash.

BTW, I think it is useful to have this summary to remind ourselves how the existing models all relate to each other.

By the way, it's not clear to me that this approach, as it is described, could capture the Flutre et al (2013) model as a special case because of the introduction of a "common" effect ($\bar{b}$) for all tissues (see eqs. 8 & 9 on that paper) such that b ~ N(bbar, phi^2). I don't think it would be difficult to extend the m2ash model (+ variational inference) to allow for this.

gaow commented

If we let diag(U_k) = phi^2 + omega^2 and off-diagonal of U_k be omega^2, then we should have something similar to Flutre 2013, right? Though the weights are different because we'll learn pi_{k,l} (K x L parameters) not pi_k and pi_k separately (K + L) parameters. But K x L parameters should make it more flexible.

Similar, but not the same; their model includes an additional regression coefficient that captures the "mean effect" across different tissues/conditions. Whether this is important or not is to be discussed I suppose (or perhaps you have already discussed this point?).

If you integrate out the shared mean effect you get the multivariate normal
that Gao describes. I think this may be briefly mentioned in wen and
Stephens.
Matthew

On Aug 8, 2016 23:58, "Peter Carbonetto" notifications@github.com wrote:

Similar, but not the same; their model includes an additional regression
coefficient that captures the "mean effect" across different
tissues/conditions. Whether this is important or not is to be discussed I
suppose (or perhaps you have already discussed this point?).


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#2 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABt4xaUdBSa9-CciIjxcZ-lJNRV5NpqLks5qd7R3gaJpZM4JfPO0
.

There are multiple possible ways to implement variational inferences for
any given model, so more details required to confirm that what Peter
intends to implement is what Wei implemented. But I'm keen to see that
happen.

I'm not sure whether it helps to say
That Wei implanted what I think is referred to as a "variational em"
algorithm?

Matthew

On Aug 9, 2016 19:59, "Matthew Stephens" stephens999@gmail.com wrote:

If you integrate out the shared mean effect you get the multivariate
normal that Gao describes. I think this may be briefly mentioned in wen and
Stephens.
Matthew

On Aug 8, 2016 23:58, "Peter Carbonetto" notifications@github.com wrote:

Similar, but not the same; their model includes an additional regression
coefficient that captures the "mean effect" across different
tissues/conditions. Whether this is important or not is to be discussed I
suppose (or perhaps you have already discussed this point?).


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#2 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABt4xaUdBSa9-CciIjxcZ-lJNRV5NpqLks5qd7R3gaJpZM4JfPO0
.

Thanks, Matthew. That is helpful to know, especially as we consider the variational approximation. (Integrating over the shared mean effect would introduce correlations between all the beta's, so it wouldn't be a good idea to assume they are conditionally independent in the variational approximation.)

I use "variational EM" to mean an EM algorithm in which the usual E-step is replaced by an approximate E-step. I've developed similar options for the varbvs package since it is sometimes useful to be able to fit the hyperparameters to the data instead of averaging over them.

Another connected piece of work:
Michael is working on an R package (bmass) implementing my PloS ONE 2012 methods for multivariate association analysis. I have talked with him about how this is essentially a special case of the more general mash framework, but he decided that he prefers to implement this special case, rather than working it into the general framework, as he understands it well, and he fears moving to the general case would slow things down too much for his graduation timeline. (My PloS ONE paper used a prior on the residual variance-covariance matrix, and integrated it out, whereas in mash we have switched to using Wakefield-like BF approximations, but when n is large the two are very similar, and the Wakefield approach works more generally).

To be clearer about what I mean by the general mash framework, I mean the model
beta_j \sim \sum_l pi_l N_R(0,Sigma_l)
betahat_j | beta_j \sim N_R(beta_j, V_j)
where N_R denotes the R-variate normal.
That is, the beta come from a mixture of multivariate normals and the betahats are
noisy multivariate observations of the betas with known variance-covariance matrices V_j.

equation

\beta_i &\sim \sum_j \pi_j N_R(0,\Sigma_j) \\
\hat{\beta}_i \,|\, \beta_i &\sim N_R(\beta_i, V_i)
gaow commented

Sure. We should connect this work into the generalized implementation using this slightly different procedure. Now that we build this multivariate, multiple regression framework (working on it now) the inference procedure will be VB. For multivariate only (multiple Y, single X as in mash / bmass) the same VB procedure should also work as a special case but we will not be able to take advantage of convex optimization. Of course we can make that a special implementation for this special case.