coloc.abf with eQTLgen and GWAS data errors
Hi community,
I am trying to run coloc.abf using eQTLgen data and GWAS summary statistics. For each data set I provide the following data:
- eQTLgen: snp (rsIDs), snp position, type = "quant", N, MAF, z, pvalues
- GWAS: beta, varbeta, snp, snp position, type = "quant", N, MAF, pvalues
However, I encounter the following errors:
- When I run coloc.abf, I get this error: "Error in check_dataset(d = dataset1, 1) : dataset 1: duplicated snps found". Dataset 1 is the eQTLgen data. It contains duplicated SNPs because each SNP's effect on expression level has been measured for several genes. How can I overcome this? Should I subset the data by gene and run coloc.abf separately for each gene?
- Regarding the GWAS summary statistics, when I run check_dataset(data_coloc_abf$GWAS, warn.minp=1e-10), I get a warning: "In check_dataset(data_coloc_abf$GWAS, warn.minp = 1e-10) : minimum p value is: 0.99406". However, min(data_coloc_abf[["GWAS"]][["pvalues"]]) is 5.935e-08. Why is this happening?
Thank you in advance for your time!
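The per-gene subsetting suggested in the first question can be sketched in base R. This is only an illustration: the data frame `eqtl`, its column names, the gene labels, the sample size and all values are hypothetical placeholders, and the coloc call is left as a comment because it needs a matching GWAS dataset (`gwas_dataset` is an assumed name).

```r
# Toy eQTL table: rs1 and rs2 appear twice because they were tested against two genes.
# All column names and values are hypothetical placeholders.
eqtl <- data.frame(
  gene     = c("GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_B"),
  snp      = c("rs1", "rs2", "rs1", "rs2", "rs3"),
  position = c(1000, 2000, 1000, 2000, 3000),
  MAF      = c(0.10, 0.25, 0.10, 0.25, 0.40),
  z        = c(2.1, -0.4, 5.3, 1.2, -3.0),
  stringsAsFactors = FALSE
)
eqtl$pvalues <- 2 * pnorm(-abs(eqtl$z))  # two-sided p-value from z

# Build one coloc-style list per gene so that SNP IDs are unique within each list.
make_dataset <- function(df, N) {
  stopifnot(!anyDuplicated(df$snp))  # fail early if a gene still has repeated SNPs
  list(
    snp      = df$snp,
    position = df$position,
    type     = "quant",
    N        = N,
    MAF      = df$MAF,
    pvalues  = df$pvalues
  )
}

# N = 30000 is a placeholder sample size.
datasets_by_gene <- lapply(split(eqtl, eqtl$gene), make_dataset, N = 30000)

# Each element could then be passed to coloc in turn, for example:
# res <- lapply(datasets_by_gene, function(d)
#   coloc::coloc.abf(dataset1 = d, dataset2 = gwas_dataset))
```

With this layout the full table still contains duplicated rsIDs, but each per-gene list does not, which is the condition the "duplicated snps found" check complains about.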
Thank you for your instant response! I calculated varbeta as varbeta = se^2 * N, because I have the per-SNP sample size available. However, p-values are already available in the dataset, and the minimum p-value is 5.935e-08. Even if I provide the p-values as an element of the list for coloc.abf, does it calculate them again from beta and varbeta?
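A small base R sketch of why a warning like "minimum p value is: 0.99406" can appear next to a supplied minimum p-value of 5.935e-08. It assumes, as the question above implies but this thread does not show, that the check derives the p-value from beta / sqrt(varbeta) when both are present. In coloc, varbeta is the variance of beta, that is se^2, so multiplying by N inflates it. The numbers below are hypothetical.

```r
# Hypothetical summary statistics for a single SNP.
beta <- 0.05
se   <- 0.009
N    <- 50000

# Two-sided p-value implied by an effect estimate and its variance.
p_from_beta <- function(beta, varbeta) 2 * pnorm(-abs(beta / sqrt(varbeta)))

p_se2   <- p_from_beta(beta, se^2)      # variance of beta taken as se^2
p_se2_N <- p_from_beta(beta, se^2 * N)  # variance inflated by a factor of N

print(c(se2 = p_se2, se2_times_N = p_se2_N))
```

With varbeta = se^2 the implied p-value is very small, while with varbeta = se^2 * N the same effect gives a p-value close to 1, which is the pattern reported in the warning.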
Moreover, when I use a subset of my data with unique SNPs, and try to plot my dataset, I get this error:
> plot_dataset(data_susie$eQTLgen, main = "eQTLgen")
Error in sqrt(d$varbeta) : non-numeric argument to mathematical function
My eQTLgen dataset contains the following information:
- snp (rsIDs)
- position (of snp)
- type ("quant")
- N (single number)
- MAF (named with rsIDs)
- z (named with rsIDs)
- pvalues (named with rsIDs)
No, it's not. Following http://chr1swallace.github.io/coloc/articles/a02_data.html, I provided the p-values, MAF and sample size instead of varbeta.
OK, I need to update the docs. For now, plot_dataset() requires beta and varbeta.
Thanks for the clarification!
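For anyone hitting the same plot_dataset() error, here is a minimal base R sketch of a guard that checks for beta and varbeta before plotting. The helper names `has_numeric_fields` and `plot_pvalues` are hypothetical, not part of coloc, and the dataset is a made-up list shaped like the eQTLgen one described above.

```r
# Fields that, per the reply above, plot_dataset() needs for now.
required <- c("beta", "varbeta")

# TRUE only if every required field is present and numeric.
has_numeric_fields <- function(d, fields = required) {
  all(vapply(fields, function(f) is.numeric(d[[f]]), logical(1)))
}

# Hypothetical dataset without beta/varbeta, similar to the one in the thread.
z <- c(rs1 = 2.1, rs2 = -0.4, rs3 = 5.3)
eqtl_like <- list(
  snp      = names(z),
  position = c(1000, 2000, 3000),
  type     = "quant",
  N        = 30000,
  MAF      = c(rs1 = 0.10, rs2 = 0.25, rs3 = 0.40),
  z        = z,
  pvalues  = 2 * pnorm(-abs(z))
)

# A missing list element is NULL, and sqrt(NULL) raises the error reported above.
err <- tryCatch(sqrt(eqtl_like$varbeta), error = function(e) conditionMessage(e))

# Fallback that only needs position and pvalues (base graphics, not coloc).
plot_pvalues <- function(d, main = "") {
  plot(d$position, -log10(d$pvalues),
       xlab = "Position", ylab = "-log10(p)", main = main)
}

if (has_numeric_fields(eqtl_like)) {
  message("beta and varbeta present: coloc::plot_dataset() can be called")
} else {
  message("beta/varbeta missing: use a p-value based plot instead")
}
```

Calling `plot_pvalues(eqtl_like, main = "eQTLgen")` then draws a simple -log10(p) by position plot from the fields that are available.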