gbradburd/conStruct

rdist rounding errors

joannarifkin opened this issue · 6 comments

Hi Gideon,

Less an issue than a heads-up: rdist.earth sometimes misbehaves because of rounding error, e.g. returning non-zero values on the diagonal of a distance matrix (described here: https://stackoverflow.com/questions/49245720/r-distance-matrix-with-non-zero-values-on-diagonal-rdist-earth). I got around it by using geosphere instead (https://stackoverflow.com/questions/45784094/geosphere-dplyr-create-matrix-of-distance-between-coordinates). Flagging the problem in case someone else encounters it.

Cheers,

Joanna
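
For anyone who lands here with the same problem, below is a minimal base-R sketch of one way to build a geographic distance matrix that avoids the non-zero diagonal. It is not the fix used in conStruct or the code from the linked answers: `haversine_matrix` and `radius_km` are hypothetical names chosen for illustration, and the sketch assumes coordinates are given as (longitude, latitude) in decimal degrees and that a spherical Earth is accurate enough.

```r
# Pairwise great-circle distances (km) using the haversine formula.
# coords: two-column matrix, column 1 = longitude, column 2 = latitude (degrees).
# radius_km: assumed mean Earth radius; adjust if you need other units.
haversine_matrix <- function(coords, radius_km = 6371) {
  lon <- coords[, 1] * pi / 180
  lat <- coords[, 2] * pi / 180
  dlat <- outer(lat, lat, "-")   # lat[i] - lat[j]
  dlon <- outer(lon, lon, "-")   # lon[i] - lon[j]
  a <- sin(dlat / 2)^2 + outer(cos(lat), cos(lat)) * sin(dlon / 2)^2
  a <- pmin(pmax(a, 0), 1)       # guard against values just outside [0, 1]
  d <- 2 * radius_km * asin(sqrt(a))
  diag(d) <- 0                   # a sample is at distance zero from itself
  d
}

# Toy coordinates: rows 1 and 3 are the same location.
coords <- rbind(c(-122.42, 37.77),
                c(-73.99, 40.73),
                c(-122.42, 37.77))
geo_dist <- haversine_matrix(coords)
```

Whatever function you use, it is worth checking `all(diag(geo_dist) == 0)` and `isSymmetric(geo_dist)` before handing the matrix to an analysis, and confirming which units the function returns.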

Thanks for the heads-up, Joanna! I hadn't seen that issue previously 😬. I don't think I use rdist or rdist.earth anywhere under conStruct's hood, but I definitely use them in the examples/vignettes. Is that where you encountered it?

Ah ok good. I'll fix it in the vignettes.

Generally, w/r/t whether you should combine by location or keep individuals separate, the answer depends on whether you're interested in the specific ancestries of different individuals (e.g., recent migrants, F1 hybrids, etc.) at a sampling location. In this case specifically, I'd say that 400-500 samples is a lot, and that conStruct will likely run very slowly for you. If you have the computational resources available, I'd recommend setting up a few independent runs (ideally without a runtime limit, like on a local machine or something) using individuals or locations as your samples, and then forgetting about them for a while. In the meantime, you can run some analyses on a smaller dataset that collapses samples by proximity (or randomly subsamples) and so will finish faster.
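
As a rough illustration of the "collapse by location" option, here is a base-R sketch. Everything in it is an assumption for illustration rather than part of conStruct: `collapse_by_location` and `digits` are hypothetical names, `freqs` is assumed to be a samples-by-loci matrix of allele frequencies with `NA` for missing data, `coords` is assumed to be a matching (longitude, latitude) matrix, and "same location" is taken to mean identical coordinates after rounding.

```r
# Pool samples that share a (rounded) coordinate into one row per location.
# freqs:  samples x loci matrix of allele frequencies (NA = missing).
# coords: samples x 2 matrix of (longitude, latitude).
# digits: rounding used to decide that two samples share a location.
collapse_by_location <- function(freqs, coords, digits = 4) {
  key <- paste(round(coords[, 1], digits), round(coords[, 2], digits), sep = "_")
  groups <- split(seq_len(nrow(freqs)), key)
  # Unweighted mean across samples at each location, ignoring missing values.
  # A locus missing in every sample at a location comes out as NaN.
  pooled_freqs <- do.call(rbind, lapply(groups, function(idx) {
    colMeans(freqs[idx, , drop = FALSE], na.rm = TRUE)
  }))
  pooled_coords <- do.call(rbind, lapply(groups, function(idx) {
    colMeans(coords[idx, , drop = FALSE])
  }))
  list(freqs = pooled_freqs, coords = pooled_coords)
}

# Toy data: three individuals, two loci; the first two share a location.
freqs <- rbind(c(0, 1), c(1, 1), c(0.5, 0))
coords <- rbind(c(1, 1), c(1, 1), c(2, 2))
pooled <- collapse_by_location(freqs, coords)
```

For the random-subsampling option, something like `keep <- sample(nrow(freqs), 100)` followed by `freqs[keep, ]` and `coords[keep, ]` does the job; the subsample size of 100 is arbitrary.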

OK, this is fixed in commit f00db2d, which is now up on CRAN (or should be shortly). Thanks for bringing this to my attention, Joanna!