genet.dist not working for populations with large difference in sample size
georgeomics opened this issue · 3 comments
I am attempting to calculate Fst using genet.dist for 6 populations with the following sample sizes:
1 2 3 4 5 6
97 133 219 16 16 53
My code looks like the following:
d1 <- subset(data, population %in% c(1,2))
dg1 <- df2genind(d1, ploidy=2, ncode=1, pop=d1$population)
calcFst <- genet.dist(dg1, method = "WC84")
This works as long as neither of the two populations is 4 or 5. If population 4 or 5 is included, I receive the following error:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows: 113, 57
In addition: Warning message:
In matrix(unlist(e), ncol = x@ploidy[1], byrow = TRUE) :
data length [113] is not a sub-multiple or multiple of the number of rows [57]
However, the code still works as intended when the two populations being compared are 4 and 5 (i.e., c(4,5)). One obvious factor to me is the difference in sample sizes. What could be the source of the error?
Hi,
This looks like an issue with hierfstat::genet.dist rather than adegenet. You might consider reposting there. In any case, without an example data set, it is difficult to answer your question. I am also wondering why you are subsetting your data, since hierfstat::genet.dist will produce estimates of genetic distances for all pairs of populations.
We're randomizing population assignments between pairs of regions, hence the subsetting. I have resolved the issue, though. It turned out to be due to the population column being included in the data passed to df2genind, which only caused a failure for populations with a relatively "small" number of individuals. I updated the code to remove the population column:
d1 <- subset(data, population %in% c(1,2))
dg1 <- df2genind(d1[,-1], ploidy=2, ncode=1, pop=d1$population)
calcFst <- genet.dist(dg1, method = "WC84")
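The fix above drops the population column by position (d1[,-1]), which assumes it is always the first column. A minimal base-R sketch of the same step that selects the genotype columns by name instead is shown here. The toy data frame and the locus names loc1 and loc2 are hypothetical and only stand in for the real data; the column name population is taken from the code above.

```r
# Toy data (hypothetical): one grouping column plus two loci coded with
# one character per allele, matching ncode=1 and ploidy=2 in the post.
data <- data.frame(
  population = c(1, 1, 2, 2, 4, 4),
  loc1 = c("11", "12", "22", "12", "11", "22"),
  loc2 = c("12", "12", "11", "22", "12", "11"),
  stringsAsFactors = FALSE
)

# Keep only the two populations being compared.
d1 <- subset(data, population %in% c(1, 4))

# Select genotype columns by name rather than by position, so the grouping
# column cannot be passed along as a locus even if the column order changes.
geno <- d1[, setdiff(names(d1), "population"), drop = FALSE]
pop  <- factor(d1$population)

# Sanity checks before handing the data to df2genind.
stopifnot(!("population" %in% names(geno)))
stopifnot(nrow(geno) == length(pop))
```

With this in place, geno and pop could be passed to df2genind in the same way as d1[,-1] and d1$population in the code above.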
Still not sure why the previous code runs fine with all other populations (and produces similar results), but not specifically for those populations with 16 individuals. But it is running as intended now across all population comparisons.