chr1swallace/coloc

error in check_dataset: duplicated SNPs

PyunJung-Min opened this issue · 4 comments

Hi, thanks for the amazing R package. I am new to COLOC.

I'm trying to integrate GWAS summary data (dataset 1) and eQTL summary data (dataset 2).
If I understood correctly, the SNPs in dataset 1 and dataset 2 should be identical. Is this right?

So I merged dataset 1 and dataset 2 by rsid.
However, in the eQTL summary data, a single SNP can be matched to multiple ENSG genes.
So the merged data (dataset 1 and dataset 2) has many duplicated SNPs, each with a different ENSG gene.
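To illustrate where the duplicates come from: the later comments in this thread treat each gene as its own `coloc.abf` run, so one way to avoid repeated SNPs is to split the eQTL table by gene before pairing it with the GWAS data. The sketch below shows only that preparation step, in Python rather than R, on made-up rows; the gene labels, column layout, and function names are hypothetical and are not part of coloc.

```python
from collections import defaultdict

# Hypothetical toy data; all values are invented for illustration.
# eQTL rows: (rsid, gene, beta, varbeta). rs1 appears once per gene.
eqtl_rows = [
    ("rs1", "ENSG_A", 0.12, 0.010),
    ("rs2", "ENSG_A", 0.30, 0.020),
    ("rs1", "ENSG_B", -0.05, 0.015),
    ("rs3", "ENSG_B", 0.22, 0.012),
]
# GWAS rows keyed by rsid: rsid -> (beta, varbeta)
gwas_rows = {
    "rs1": (0.08, 0.004),
    "rs2": (0.02, 0.005),
    "rs3": (-0.11, 0.006),
}


def split_by_gene(rows):
    """Group eQTL rows by gene so each SNP occurs at most once per group."""
    by_gene = defaultdict(dict)
    for rsid, gene, beta, varbeta in rows:
        if rsid in by_gene[gene]:
            raise ValueError(f"{rsid} is duplicated within gene {gene}")
        by_gene[gene][rsid] = (beta, varbeta)
    return dict(by_gene)


def paired_datasets(gwas, eqtl_by_gene):
    """For each gene, keep only SNPs present in both GWAS and eQTL data."""
    pairs = {}
    for gene, eqtl in eqtl_by_gene.items():
        shared = sorted(set(gwas) & set(eqtl))
        pairs[gene] = {
            "snp": shared,
            "gwas": [gwas[s] for s in shared],
            "eqtl": [eqtl[s] for s in shared],
        }
    return pairs


pairs = paired_datasets(gwas_rows, split_by_gene(eqtl_rows))
for gene, p in pairs.items():
    print(gene, p["snp"])
```

Each entry of `pairs` then holds one GWAS/eQTL pair with unique SNPs, which would correspond to the two datasets for a single per-gene analysis.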

How can I deal with this problem? Or did I prepare the datasets incorrectly?

Many thanks in advance

Jungmin

Thanks for your prompt reply!!

However, my eQTL summary data has 19250 genes.
Is there a smart way to analyse 19250 genes at once, instead of performing "coloc.abf" 19250 times?

Thanks!

Jung-Min

Sorry, no. But you probably don't want to run all 19250 genes. You know whether each of them has a significant signal in your region of interest, so you can discard the rest.
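As a rough sketch of that filtering step, assuming the eQTL summary data carries a p-value per SNP-gene pair: keep a gene only if its smallest p-value in the region passes a chosen threshold. The rows, gene labels, function name, and thresholds below are all hypothetical, and the sketch is in Python rather than R; it covers only the pre-filtering, not the coloc analysis itself.

```python
# Hypothetical toy rows: (rsid, gene, p_value); values are invented.
eqtl_pvalues = [
    ("rs1", "ENSG_A", 3e-9),
    ("rs2", "ENSG_A", 0.04),
    ("rs1", "ENSG_B", 0.20),
    ("rs3", "ENSG_B", 0.51),
    ("rs2", "ENSG_C", 2e-4),
]


def genes_with_signal(rows, threshold):
    """Return genes whose smallest eQTL p-value is at or below threshold."""
    best = {}
    for _rsid, gene, p in rows:
        best[gene] = min(p, best.get(gene, 1.0))
    return sorted(g for g, p in best.items() if p <= threshold)


# The thresholds are arbitrary examples, not recommendations.
for threshold in (5e-8, 1e-5, 1e-3):
    print(threshold, genes_with_signal(eqtl_pvalues, threshold))
```

Only the genes returned for the chosen threshold would then need a per-gene run, which can shrink the list considerably.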

Thank you for the answer! :)

My goal in using COLOC is to identify causal (target) genes by integrating GWAS summary data for a disease with eQTL summary data. I would like to select target genes using various p-value thresholds. That's why I tried to run COLOC on all 19250 genes.

Could you please advise on how to approach this?
I would appreciate any comments :)
Many thanks

Jung-Min