chr1swallace/coloc

Question about Constant Values in r.df2 Column of Coloc Output


Hi there!

Thank you for the practical package. I have a question regarding the coloc.abf function when running it between my eQTL and GWAS data, both derived from the same samples. Here’s the code I'm using:

r_result <- coloc.abf(
  dataset1=list(snp=input$snp,
                pvalues=as.numeric(input$pvalues_gwas), 
                type="quant", 
                N=unique(input$N_gwas),
                beta=input$beta_gwas,
                varbeta=input$varbeta_gwas,
                sdY=input$sdY_gwas), 

  dataset2=list(snp=input$snp,
                pvalues=as.numeric(input$pvalues_eqtl), 
                type="quant", 
                N=unique(input$N_eqtl),
                beta=input$beta_eqtl,
                varbeta=input$varbeta_eqtl,
                sdY=input$sdY_eqtl), 
  MAF=input$MAF_gwas
)

However, I've noticed that the r.df2 column in the output appears to have the same values across all 3823 rows:

       r.df2
1 0.02200489
2 0.02200489
3 0.02200489

According to the documentation, the r.df2 column represents the LD correlation between SNPs in dataset 2. Could you clarify why it is constant across rows? Am I misunderstanding the output or potentially doing something wrong?

Thank you!

Hi, thank you for your quick response! Indeed, I may have misinterpreted the output. Could you share a link to the official documentation for the coloc.abf() output? Here is the output of summary(input$varbeta_eqtl):

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.003872 0.004880 0.006464 0.008235 0.009679 0.022955 

Sure!

Here is the summary for the full input parameters:

  • GWAS:
> unique(input$N_gwas)
[1] 467

> summary(input$beta_gwas)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
-0.330330 -0.038549 -0.010866 -0.007179  0.026881  0.333010 

> summary(input$varbeta_gwas)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.001677 0.002151 0.002841 0.003628 0.004326 0.009794 

> summary(input$sdY_gwas)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.04095 0.04623 0.05328 0.05847 0.06577 0.09897 

  • eQTL:
> summary(input$snp)
   Length     Class      Mode 
     3823 character character 

> summary(as.numeric(input$pvalues_eqtl))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0000  0.1883  0.4723  0.4684  0.7322  0.9997 

> unique(input$N_eqtl)
[1] 467

> summary(input$beta_eqtl)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-1.2590863 -0.0633625 -0.0003472 -0.0022102  0.0590574  1.3495496 

> summary(input$varbeta_eqtl)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.003872 0.004880 0.006464 0.008235 0.009679 0.022955 

> summary(input$sdY_eqtl)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.06222 0.06986 0.08040 0.08811 0.09838 0.15151 
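
One possible reading of the constant r.df2, offered as a sketch under stated assumptions rather than a description of coloc's internals: in Wakefield-style approximate Bayes factors, r is a shrinkage factor W / (W + V), where V is varbeta and W is the prior variance of the effect. Assuming the prior standard deviation for a quantitative trait is 0.15 * sdY (an assumption, not confirmed anywhere in this thread), r comes out identical at every SNP whenever varbeta is proportional to sdY^2. The helper `shrinkage_r` is hypothetical, and pairing the minimum and maximum values from the eQTL summaries above as if they belonged to the same SNPs is also an assumption.

```r
# Hypothetical helper: shrinkage factor r = W / (W + V) for a quantitative trait.
# Assumption: the prior SD of the effect is prior_scale * sdY, with prior_scale = 0.15.
shrinkage_r <- function(varbeta, sdY, prior_scale = 0.15) {
  prior_var <- (prior_scale * sdY)^2
  prior_var / (prior_var + varbeta)
}

# Min and max taken from the eQTL summaries above (pairing them is an assumption).
sdY_eqtl     <- c(0.06222, 0.15151)
varbeta_eqtl <- c(0.003872, 0.022955)

# r for the two illustrative SNPs.
r_eqtl <- shrinkage_r(varbeta_eqtl, sdY_eqtl)
print(round(r_eqtl, 4))

# If varbeta equals sdY^2 exactly, r collapses to a single constant for every SNP.
r_const <- shrinkage_r(sdY_eqtl^2, sdY_eqtl)
print(r_const)
```

Both illustrative values round to about 0.022, and the exact-equality case gives 0.0225 / 1.0225, which is close to the 0.02200489 reported in the r.df2 column. Under these assumptions, a constant r would point to sdY and varbeta having been derived from the same column.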

PS: I was looking for the documentation of the output parameters :)

Thanks again for your quick response!

Good catch!

This is my full GWAS summary statistics:

> r_gwas[1:3,]
Key: <Predictor>
               Predictor Chromosome Basepair     A1     A2 Wald_Stat  Wald_P    Effect
                  <char>      <int>    <int> <char> <char>     <num>   <num>     <num>
1: chr7_44077165_T_C_b38          7 44077165      C      T    1.1513 0.24959  0.089449
2: chr7_44078226_A_G_b38          7 44078226      G      A    0.2119 0.83215  0.010901
3: chr7_44078261_G_A_b38          7 44078261      A      G   -0.6554 0.51224 -0.040281
         SD Effect_Liability SD_Liability  A1_Mean      MAF               tissue
      <num>           <lgcl>       <lgcl>    <num>    <num>               <char>
1: 0.077691               NA           NA 0.169165 0.084582 Adipose_Subcutaneous
2: 0.051435               NA           NA 0.447537 0.223769 Adipose_Subcutaneous
3: 0.061465               NA           NA 0.278373 0.139186 Adipose_Subcutaneous
            mtgene                          tissue_gene       rsID n_e_samples
            <char>                               <char>     <char>       <int>
1: ENSG00000198712 Adipose_Subcutaneous_ENSG00000198712 rs10270623         467
2: ENSG00000198712 Adipose_Subcutaneous_ENSG00000198712 rs11761589         467
3: ENSG00000198712 Adipose_Subcutaneous_ENSG00000198712 rs11768442         467

This is how I calculate SE and var_beta_values. Maybe this step is not accurate?

  r_gwas$se <- r_gwas$SD / sqrt(r_gwas$n_e_samples)
  r_gwas$var_beta_values <- (r_gwas$se)^2

I've resolved the issue. The software I used to calculate the summary statistics mislabeled the SE column as SD, so that column is already the standard error and should not be divided by sqrt(N) again. Thank you for the debugging session; it really helped me sort things out!
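
The mislabeling can be checked against the three GWAS rows printed earlier in the thread. The sketch below assumes that Wald_Stat is the effect estimate divided by its standard error. If the column named SD is really the SE, then Effect / SD should reproduce Wald_Stat to within rounding. The data frame `gwas` and the column `z_from_sd` are illustrative names, not part of the original script.

```r
# First three rows of the GWAS table shown earlier in the thread.
gwas <- data.frame(
  Effect    = c(0.089449, 0.010901, -0.040281),
  SD        = c(0.077691, 0.051435, 0.061465),
  Wald_Stat = c(1.1513, 0.2119, -0.6554)
)

# If SD is really the standard error, Effect / SD should match Wald_Stat.
gwas$z_from_sd <- gwas$Effect / gwas$SD
se_is_mislabeled <- all(abs(gwas$z_from_sd - gwas$Wald_Stat) < 1e-3)
print(se_is_mislabeled)

# In that case varbeta is simply the squared column, with no division by sqrt(N).
gwas$varbeta <- gwas$SD^2
print(gwas[, c("Wald_Stat", "z_from_sd", "varbeta")])
```

For these three rows the ratio agrees with Wald_Stat to about four decimal places, which is consistent with the column holding standard errors rather than standard deviations.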