gbradburd/conStruct

High Rhat and ESS issues for K>1 with spatial analyses

Closed this issue · 4 comments

Hi there,

I'm encountering the same issues described in threads #25 & #31. Namely, when I run spatial models with K=1 (iter = 5000, chains = 3), the models work well and the diagnostic plots look good, with no warnings.

When I increase K to 2, even with iterations increased to 10,000 and chains to 5, I get these warnings:

Warning messages:
1: There were 1230 divergent transitions after warmup...
2: Examine the pairs() plot to diagnose sampling problems
3: The largest R-hat is 1.69, indicating chains have not mixed...
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable...
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable...
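
For anyone reading along, the R-hat in warning 3 compares between-chain variance to within-chain variance for a parameter; values well above 1 mean the chains disagree. Here is a minimal sketch of the classic Gelman-Rubin version in Python. It is an illustration only, not what Stan computes internally (Stan reports a rank-normalized split R-hat), and the function name `rhat` and the simulated chains are made up for the example.

```python
import random
from statistics import mean, variance


def rhat(chains):
    """Classic Gelman-Rubin potential scale reduction factor.

    `chains` is a list of equal-length lists of post-warmup draws
    for a single scalar parameter (hypothetical input format).
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances
    w = mean(variance(c) for c in chains)
    # B: between-chain variance, scaled by chain length
    b = n * variance(chain_means)
    # Pooled estimate of the posterior variance
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5


rng = random.Random(1)
# Three chains sampling the same distribution: R-hat should be close to 1
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# One chain centered somewhere else: R-hat should be well above 1
stuck = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 0.0, 3.0)]

print(round(rhat(mixed), 2))
print(round(rhat(stuck), 2))
```

A single chain sitting in a different place is enough to inflate the statistic, which is why it can flag chains that look fine individually.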

In thread #31, Gideon wrote, "warnings - as long as the trace plots look ok, and you're getting consistent results across independent runs, I wouldn't worry that much about the warnings. Good mixing makes things more efficient, but your results aren't necessarily suspect if the mixing is poor. It's hard for me to eyeball the traceplots you sent, but it looks like things are broadly consistent (similar log posterior probabilities and parameter estimates), so I wouldn't worry too much about inefficient mixing in any given run."

To my eye, it looks like my trace plots are ok and I'm getting consistent results across independent runs. Can I confirm that these models are trustworthy despite the warnings?

sp_K2_iter10000_chains5_trace.plots.chain_3.pdf
sp_K2_iter10000_chains5_trace.plots.chain_4.pdf
sp_K2_iter10000_chains5_trace.plots.chain_5.pdf
sp_K2_iter10000_chains5_trace.plots.chain_1.pdf
sp_K2_iter10000_chains5_trace.plots.chain_2.pdf

(also attaching one chain's model fit and one layer cov curve plot because they're pretty similar among chains)
sp_K2_iter10000_chains5_model.fit.CIs.chain_1.pdf
sp_K2_iter10000_chains5_layer.cov.curves.chain_1.pdf

Lastly, I'm running a non-spatial analysis with K=2 and am expecting more warnings. If the model output is similar to that described above (i.e., trace plots are ok, consistent results across independent runs) despite more of the same warnings, can I trust those results as well?

Thanks so much!
Sophie

Hi Sophie,

Yeah, those trace plots look reasonable to me, and it's nice that they all appear to be converging on the same stationary distribution (after accounting by eye for label-switching, which you can do more formally with the match.layers.x.runs function). The caveat is that it's tough for me to see what's going on with the admixture proportions. And yes, if you see similar results for the non-spatial analysis, I think you can trust those as well (to the extent you trust the output of any model!).
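
To make the label-switching point concrete: two runs can agree on the answer but number the layers differently, so "layer 1" in one run is "layer 2" in another. Below is a rough conceptual sketch in Python of lining the layers up before comparing runs. It is not the implementation of match.layers.x.runs; the function `match_layers`, the brute-force permutation search, and the toy admixture matrices are all assumptions for illustration.

```python
from itertools import permutations


def match_layers(reference, run):
    """Reorder the layers (columns) of `run` to best match `reference`.

    Both arguments are lists of rows, one row per sample, each row a
    list of K admixture proportions. Every permutation of the K layers
    is tried and the one with the smallest squared difference is kept,
    which is only practical for small K.
    """
    k = len(reference[0])

    def cost(perm):
        # Sum of squared differences after relabeling run's layers
        return sum(
            (ref_row[j] - row[perm[j]]) ** 2
            for ref_row, row in zip(reference, run)
            for j in range(k)
        )

    best = min(permutations(range(k)), key=cost)
    aligned = [[row[i] for i in best] for row in run]
    return aligned, best


# Toy example: run_b is roughly run_a with the two layers swapped
run_a = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
run_b = [[0.12, 0.88], [0.79, 0.21], [0.48, 0.52]]

aligned, order = match_layers(run_a, run_b)
print(order)
print(aligned)
```

Once the layers are relabeled like this, the admixture proportions and layer parameters from independent runs can be compared directly.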

Hi Gideon,

Thanks so much for your reply. Re: the admixture proportions: my dataset has no missing data (every individual had to have data for a locus for that locus to be processed). Does that address your caveat? Or is there another plot I could share that would clarify this?

No, I mean it was just difficult to see what the trace plots for each individual's estimated admixture proportions looked like, because they appeared as a solid block of red or blue lines.

Got it. Thanks again!