chr1swallace/coloc

coloc.abf with eQTLgen and GWAS data errors

Closed this issue · 8 comments

Hi community,

I am trying to run coloc.abf using eQTLgen data and GWAS summary statistics. For each data set I provide the following data:

  • eQTLgen: snp (rsIDs), snp position, type = "quant", N, MAF, z, pvalues
  • GWAS: beta, varbeta, snp, snp position, type = "quant", N, MAF, pvalues

However, I encounter the following errors:

  1. When I try to run coloc.abf I get this error: "Error in check_dataset(d = dataset1, 1) :
    dataset 1: duplicated snps found". Data set 1 is the eQTLgen data. It contains duplicated SNPs because each SNP's effect on expression has been measured for several genes. How can I overcome this? Should I subset the data by gene and run coloc.abf separately for each gene?

  2. Regarding the GWAS summary statistics, when I run check_dataset(data_coloc_abf$GWAS, warn.minp=1e-10), I get a warning: "In check_dataset(data_coloc_abf$GWAS, warn.minp = 1e-10) :
    minimum p value is: 0.99406". However, my min(data_coloc_abf[["GWAS"]][["pvalues"]]) = 5.935e-08. Why is this happening?
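As a rough illustration of the per-gene subsetting suggested in question 1, the sketch below splits a toy eQTL table by gene so that SNP IDs are unique within each list. The column names (`gene`, `position`, `MAF`, `z`, `N`), the helper `make_dataset()` and all values are hypothetical, and the `coloc.abf()` call is only shown as a comment because the GWAS object is not defined here.

```r
# Toy eQTL table (hypothetical values): one row per SNP-gene pair,
# so rs1 and rs2 each appear twice.
eqtl <- data.frame(
  snp      = c("rs1", "rs2", "rs3", "rs1", "rs2"),
  gene     = c("GENE_A", "GENE_A", "GENE_A", "GENE_B", "GENE_B"),
  position = c(100, 200, 300, 100, 200),
  MAF      = c(0.10, 0.25, 0.40, 0.10, 0.25),
  z        = c(2.1, -0.5, 4.8, 0.3, -3.9),
  N        = 30000,
  stringsAsFactors = FALSE
)

# Build one coloc-style list per gene (hypothetical helper).
# Within a single gene every SNP id must be unique.
make_dataset <- function(df) {
  stopifnot(!anyDuplicated(df$snp))
  list(
    snp      = df$snp,
    position = df$position,
    MAF      = df$MAF,
    pvalues  = 2 * pnorm(-abs(df$z)),  # two-sided p-value from z
    N        = df$N[1],
    type     = "quant"
  )
}

eqtl_by_gene <- lapply(split(eqtl, eqtl$gene), make_dataset)

# Each element could then be compared against the GWAS dataset for the
# same region, e.g. (not run; `gwas` is assumed to exist):
# results <- lapply(eqtl_by_gene, function(d) coloc::coloc.abf(d, gwas))
```

Splitting first and looping afterwards keeps each run to one gene and one region, which is the unit a colocalisation test is defined on.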

Thank you in advance for your time!

Thank you for your prompt response! I calculated varbeta as varbeta = se^2 * N, because I have the per-SNP sample size available. However, p-values are already available in the dataset, and the minimum p-value is 5.935e-08. Even if I provide the p-values as an element of the list for coloc.abf, does it calculate them again from beta and varbeta?
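A quick way to check this by hand, assuming the reported minimum p-value comes from a two-sided normal test on beta / sqrt(varbeta): varbeta is conventionally the variance of the effect estimate, i.e. se^2, so multiplying by N shrinks the implied z-score and pushes the p-value towards 1. The numbers below are made up for illustration only.

```r
# Hypothetical summary statistics for a single SNP.
beta <- 0.0542
se   <- 0.01
N    <- 10000

# Two-sided p-value implied by an effect estimate and its variance.
p_from_beta <- function(beta, varbeta) {
  2 * pnorm(-abs(beta / sqrt(varbeta)))
}

p_correct  <- p_from_beta(beta, se^2)      # varbeta = se^2
p_inflated <- p_from_beta(beta, se^2 * N)  # varbeta = se^2 * N, as in the question

print(c(correct = p_correct, inflated = p_inflated))
```

If the first value matches the p-value column of the dataset and the second is close to 1, the scaling of varbeta is the likely cause of the warning.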

Moreover, when I use a subset of my data with unique SNPs, and try to plot my dataset, I get this error:
> plot_dataset(data_susie$eQTLgen, main = "eQTLgen")
Error in sqrt(d$varbeta) : non-numeric argument to mathematical function

My eQTLgen dataset contains the following information:

  1. snp (rsIDs)
  2. position (of snp)
  3. type ("quant")
  4. N (single number)
  5. MAF (named with rsIDs)
  6. z (named with rsIDs)
  7. pvalues (named with rsIDs)

No, it's not. Following http://chr1swallace.github.io/coloc/articles/a02_data.html, I provided the p-values, MAF and sample size instead of varbeta.

ok, I need to update the docs. plot_dataset() requires beta, varbeta for now

Thanks for the clarification!