hanchenphd/GMMAT

Error in eval(parse(text = x, keep.source = FALSE)[[1L]]) : object 'SNP__' not found

Hi There,

Thanks again for this great tool, and apologies for a second question. I am now trying to run the Wald test on a subset of the imputed dosages in order to get ORs. My command is:

res <- glmm.wald(formula, data = pheno, kins = rel, id = "IID", random.slope = NULL, groups = NULL,
                 family = binomial(link = "logit"), infile = infile,  snps=snps, is.dosage = TRUE, verbose = TRUE,
                 missing.method="impute2mean", snp.col=1, infile.ncol.print=1, 
                 infile.ncol.skip=1, infile.sep='\t', infile.nrow.skip=1, infile.header.print="SNP")

and the first 5 rows/columns of the input data look like

ID	SID1	SID2	SID3	SID4
1:1917446[b37]G,A	0	0	0	0.85
1:1919318[b37]C,A	0	0	0	0.85
1:3498779[b37]C,T	0	0	0	0
1:3509544[b37]G,T	0	0	0	0
1:20479748[b37]C,T	0	0	0	0

snps is a vector like

c("1:1917446[b37]G,A", "1:1919318[b37]C,A", "1:3498779[b37]C,T", "1:3509544[b37]G,T", "1:20479748[b37]C,T")

and I am getting the following error message:

Progress of Wald test:
  |                                                                                                                                                                                               |   0%

Analyze SNP  1 :  1:1917446[b37]G,A 
Error in eval(parse(text = x, keep.source = FALSE)[[1L]]) : 
  object 'SNP__' not found
In addition: Warning message:
In glmm.wald(formula, data = pheno, kins = rel, id = "IID", random.slope = NULL,  :
  Argument select is unspecified... Assuming the order of individuals in infile matches unique id in data...

Any idea what the cause of the bug is?

As an alternative approach, I load the data in from a vcf.gz file like so

gdsfn <- paste(invcf, ".gds", sep="")
SeqArray::seqVCF2GDS(invcf, gdsfn, parallel=TRUE, fmt.import="DS", scenario=c("imputation"))
snps = seqGetData(gdsfn, "variant.id")
res <- glmm.wald(formula, data = pheno, kins = rel, id = "IID", random.slope = NULL, groups = NULL,
                 family = binomial(link = "logit"), infile = gdsfn,  snps=snps, is.dosage = TRUE, verbose = TRUE,
                 missing.method="impute2mean")

and in this case, the test runs but returns NA for every output value.

  SNP CHR POS REF ALT  N AF BETA SE PVAL converged
1   1  NA  NA  NA  NA NA NA   NA NA   NA        NA
2   2  NA  NA  NA  NA NA NA   NA NA   NA        NA
3   3  NA  NA  NA  NA NA NA   NA NA   NA        NA
4   4  NA  NA  NA  NA NA NA   NA NA   NA        NA
5   5  NA  NA  NA  NA NA NA   NA NA   NA        NA
6   6  NA  NA  NA  NA NA NA   NA NA   NA        NA

Any idea what could be causing this issue as well?

Can you send me a reproducible example? My email address can be found on the GMMAT home page.

Thanks,
Han

Done. Thank you!

Thank you for sharing the example! I think the problem was that your formula was a character object rather than an object of class formula:

> class(formula)
[1] "character"

This can be fixed using as.formula:

> formula<-as.formula(formula)
> class(formula)
[1] "formula"

It fixed the problem in the toy dataset that you sent to me.
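
For anyone who builds the model as a string (for example with paste()), here is a minimal sketch of the conversion in base R. The variable names (disease, age, sex) are hypothetical and not taken from the dataset in this issue; the point is only that the object should have class "formula" before it is passed on.

```r
# Hypothetical covariate names, used only for illustration
covars <- c("age", "sex")

# Building the model with paste() yields a character string, not a formula
formula_chr <- paste("disease ~", paste(covars, collapse = " + "))
class(formula_chr)

# Convert once, before passing the object to the model-fitting call
formula_obj <- as.formula(formula_chr)
class(formula_obj)

# Optional sanity checks: the class, and the variables the formula refers to
inherits(formula_obj, "formula")
all.vars(formula_obj)
```

A quick class(formula) check before calling glmm.wald should catch this kind of mismatch early.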

Re: the issue with the GDS file: if you retrieve "variant.id" using seqGetData, it returns an integer vector of variant indices. To get the SNP names, use "annotation/id" instead.
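
As a rough sketch of the difference, the snippet below reads both fields from the example GDS file that ships with SeqArray. That file is an assumption made for illustration, not the file from this issue, and the variable names are hypothetical.

```r
library(SeqArray)

# Example GDS file bundled with SeqArray, used here only for illustration
gds_path <- seqExampleFileName("gds")
gds <- seqOpen(gds_path)

# "variant.id" holds integer variant indices
variant_index <- seqGetData(gds, "variant.id")

# "annotation/id" holds the variant names as character strings
variant_name <- seqGetData(gds, "annotation/id")

seqClose(gds)

# Compare the two: one is integer, the other character, with equal length
str(variant_index)
str(variant_name)
```

The character vector from "annotation/id" is the one to pass as snps when selecting variants by name.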

Best,
Han

Thank you! Indeed, this fixes it.

Cheers,
Dylan