Error in seq.default with missing_visualization
BrennaF opened this issue · 6 comments
Hi Thierry,
I'm running missing_visualization on these data sets:
dat1
/// GENIND OBJECT /////////
// 264 individuals; 12,091 loci; 24,182 alleles; size: 30 Mb
// Basic content
@tab: 264 x 24182 matrix of allele counts
@loc.n.all: number of alleles per locus (range: 2-2)
@loc.fac: locus factor for the 24182 columns of @tab
@all.names: list of allele names for each locus
@ploidy: ploidy of each individual (range: 2-2)
@type: codom
@call: .local(x = x, i = i, j = j, loc = ..1, drop = drop)
// Optional content
@pop: population of each individual (group size range: 21-31)
head(strata)
INDIVIDUALS STRATA library
1 GM1 GM Pw1
2 GM26 GM Pw1
3 GM40 GM Pw1
4 GM31 GM Pw1
5 GM15 GM Pw1
6 GM24 GM Pw1
My strata file has multiple "STRATA" (populations) and libraries (>10 for each).
Here is my call & output:
miss.dat1 <- missing_visualization(dat1, strata=strata)
#######################################################################
#################### grur::missing_visualization ######################
#######################################################################
Folder created:
missing_visualization_20180424@1454
Importing data
Alleles names for each markers will be converted to factors and padded with 0
Scanning for monomorphic markers...
Number of markers before = 12091
Number of monomorphic markers removed = 0
Tidy genomic data:
Number of markers: 12091
Number of chromosome/contig/scaffold: no chromosome info
Number of individuals: 264
Number of populations: 1
Informations:
Number of populations: 1
Number of individuals: 264
Number of ind/pop:
NA
Number of duplicate id: 0
Number of SNPs: 12091
Proportion of missing genotypes (overall): 0.298188
Identity-by-missingness (IBM) analysis using
Principal Coordinate Analysis (PCoA)...
Generating Identity by missingness plot
Error in seq.default(h[1], h[2], length.out = n) :
'to' must be a finite number
In addition: There were 42 warnings (use warnings() to see them)
Any ideas what might be causing the seq.default error?
Thanks!
Brenna
Hi Brenna,
- Are you sure you have more than 1 value in the STRATA column? The output says Number of populations: 1.
- The STRATA column might be filled with GM all the way, or have just 1 individual with a different value; that would explain the error.
- I'll raise an error earlier in the script when this is detected.
- The underlying problem is that I always envisioned the function for population genomics and never intended it to work with just 1 large grouping. I could add that if it's of interest.
Also, if you want to visualize missingness of the library column, use strata.select = c("POP_ID", "library") as an argument in the function. This will run the function on both columns.
Best
Thierry
Hi Thierry!
Yes, that is a little odd. I definitely have multiple groups in both columns of my strata file:
> table(strata$STRATA)
BC FT GM IM PM SM TT UH UL WT
24 25 31 30 25 23 21 30 26 29
> table(strata$library)
Pc4 Pw1 Pw10 Pw2 Pw20 Pw5 Pw6 Pw7 Pw8 Pw9
2 31 30 31 23 30 29 27 32 29
My input genind file (dat1) also has populations defined:
> table(dat1@pop)
BC FT GM IM PM SM TT UH UL WT
24 25 31 30 25 23 21 30 26 29
The strata file has $STRATA and $library as factors. Is that a problem? Not sure why grur would be reading all of that in as one population.
Brenna
Can you send me the data (.RData) by email ?
Hi Brenna, ok now I see... it's not the same individuals in your strata and data.
The next radiator and grur releases will generate an error when this is found.
Re-open the issue if, after fixing this, there's still something wrong with missing_visualization.
Best
Thierry
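Until a release with that check is available, a guard along these lines can be run before calling missing_visualization. This is only a minimal sketch in base R: check_strata_ids is a hypothetical helper, not part of grur or radiator, and the toy IDs stand in for rownames(dat1@tab) and strata$INDIVIDUALS.

```r
# Hypothetical helper (NOT a grur/radiator function): stop early when the
# individual IDs in the data and in the strata do not match.
check_strata_ids <- function(data_ids, strata_ids) {
  # Coerce to character so factor IDs are compared by label, not by level code
  data_ids <- as.character(data_ids)
  strata_ids <- as.character(strata_ids)

  missing_in_strata <- setdiff(data_ids, strata_ids)
  missing_in_data <- setdiff(strata_ids, data_ids)

  if (length(missing_in_strata) > 0 || length(missing_in_data) > 0) {
    stop(
      length(missing_in_strata),
      " individual(s) in the data are absent from the strata; ",
      length(missing_in_data),
      " individual(s) in the strata are absent from the data.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Toy IDs (assumed for illustration only)
data_ids <- c("GM1", "GM26", "GM40")
good_strata <- data.frame(INDIVIDUALS = c("GM1", "GM26", "GM40"), STRATA = "GM")
bad_strata <- data.frame(INDIVIDUALS = c("GM_1", "GM_26", "GM_40"), STRATA = "GM")

# Passes silently when every ID is found on both sides
ok <- check_strata_ids(data_ids, good_strata$INDIVIDUALS)

# Fails with an informative message when the IDs differ
res <- tryCatch(
  check_strata_ids(data_ids, bad_strata$INDIVIDUALS),
  error = function(e) conditionMessage(e)
)
print(res)
```

With a real genind object, the first argument would presumably be rownames(dat1@tab) and the second strata$INDIVIDUALS.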
Thanks Thierry - so the rownames in the dat@tab need to have a corresponding column in the strata file? I didn't realize that! I just reran it with matching individual names and it worked. Sorry to bother you with that, but glad it was an easy fix!
The strata object/file requires a minimum of 2 columns; check the function doc with ??grur::missing_visualization. You had a column named INDIVIDUALS in your strata object, only it did not contain the same individuals as dat@tab, so there was no way to match the STRATA and library columns with the data.
Cheers
Thierry
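For readers who hit the same problem, the matching described above can be illustrated with base R's match(). This is a sketch under stated assumptions, not a description of how grur does the matching internally: tab is a toy matrix standing in for dat@tab, and the strata values are invented for the example.

```r
# Toy stand-in for dat@tab: rows are individuals, columns are allele counts
tab <- matrix(
  0L, nrow = 3, ncol = 2,
  dimnames = list(c("GM1", "BC2", "FT3"), c("loc1.A", "loc1.B"))
)

# Toy strata with the same individuals in a different row order
strata <- data.frame(
  INDIVIDUALS = c("FT3", "GM1", "BC2"),
  STRATA = factor(c("FT", "GM", "BC")),
  library = factor(c("Pw2", "Pw1", "Pw1"))
)

# Position of each data row inside the strata table
idx <- match(rownames(tab), as.character(strata$INDIVIDUALS))

# Reorder the strata so that row i describes individual i of the data
aligned <- strata[idx, ]
print(aligned)

# When the IDs do not correspond, every lookup is NA, so no STRATA or
# library value can be attached to any individual
idx_bad <- match(c("ind1", "ind2", "ind3"), as.character(strata$INDIVIDUALS))
print(idx_bad)
```

If any element of idx is NA, the corresponding individual has no strata entry, which is the situation to fix before running the analysis.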