Question about Constant Values in r.df2 Column of Coloc Output
Hi there!
Thank you for the practical package. I have a question regarding the coloc.abf function when running it between my eQTL and GWAS data, both derived from the same samples. Here’s the code I'm using:
r_result <- coloc.abf(
  dataset1 = list(snp = input$snp,
                  pvalues = as.numeric(input$pvalues_gwas),
                  type = "quant",
                  N = unique(input$N_gwas),
                  beta = input$beta_gwas,
                  varbeta = input$varbeta_gwas,
                  sdY = input$sdY_gwas),
  dataset2 = list(snp = input$snp,
                  pvalues = as.numeric(input$pvalues_eqtl),
                  type = "quant",
                  N = unique(input$N_eqtl),
                  beta = input$beta_eqtl,
                  varbeta = input$varbeta_eqtl,
                  sdY = input$sdY_eqtl),
  MAF = input$MAF_gwas
)
However, I've noticed that the r.df2 column in the output appears to have the same values across all 3823 rows:
r.df2
1 0.02200489
2 0.02200489
3 0.02200489
According to the documentation, the r.df2 column represents the LD correlation between SNPs in dataset 2. Could you clarify why it is constant across rows? Am I misunderstanding the output or potentially doing something wrong?
Thank you!
Hi, thank you for your quick response! Indeed, I may have misinterpreted the output. Is it possible to provide a link to the official documentation of the coloc.abf() output? My summary of input$varbeta_eqtl is as follows:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.003872 0.004880 0.006464 0.008235 0.009679 0.022955
sure!
Here is the summary for the full input parameters:
- GWAS:
> unique(input$N_gwas)
[1] 467
> summary(input$beta_gwas)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.330330 -0.038549 -0.010866 -0.007179 0.026881 0.333010
> summary(input$varbeta_gwas)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.001677 0.002151 0.002841 0.003628 0.004326 0.009794
> summary(input$sdY_gwas)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.04095 0.04623 0.05328 0.05847 0.06577 0.09897
- eQTL:
> summary(input$snp)
Length Class Mode
3823 character character
> summary(as.numeric(input$pvalues_eqtl))
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000 0.1883 0.4723 0.4684 0.7322 0.9997
> unique(input$N_eqtl)
[1] 467
> summary(input$beta_eqtl)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.2590863 -0.0633625 -0.0003472 -0.0022102 0.0590574 1.3495496
> summary(input$varbeta_eqtl)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.003872 0.004880 0.006464 0.008235 0.009679 0.022955
> summary(input$sdY_eqtl)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.06222 0.06986 0.08040 0.08811 0.09838 0.15151
PS: I was looking for the documentation of the output parameters :)
Thanks again for your quick response!
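One possible reading of the summaries above: in both datasets the quantiles of varbeta match the squared quantiles of sdY (for the eQTL medians, 0.08040^2 ≈ 0.006464). The sketch below assumes that coloc computes the r columns as a shrinkage factor, prior variance divided by (prior variance + varbeta), with a prior standard deviation of 0.15 * sdY for quantitative traits. That formula is an assumption about coloc's internals and is not stated in this thread, and shrinkage_r is a hypothetical helper, not a coloc function. Under that assumption, varbeta equal to sdY^2 gives the same r for every SNP.

```r
# Hypothetical helper: assumed form of the shrinkage factor for a
# quantitative trait (NOT a coloc function; formula is an assumption).
shrinkage_r <- function(varbeta, sdY, prior_scale = 0.15) {
  prior_var <- (prior_scale * sdY)^2
  prior_var / (prior_var + varbeta)
}

# Min, median and max of input$sdY_eqtl, taken from the summary above.
sdY_eqtl <- c(0.06222, 0.08040, 0.15151)

# Illustrative case: varbeta tied to sdY^2, as the summaries suggest.
varbeta_tied <- sdY_eqtl^2

r_tied <- shrinkage_r(varbeta_tied, sdY_eqtl)
print(r_tied)  # every element is 0.0225 / 1.0225, about 0.02200489
```

If this reading is right, a constant r.df2 points at how varbeta or sdY was derived rather than at coloc itself.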
good catch!
This is my full GWAS summary statistics:
> r_gwas[1:3,]
Key: <Predictor>
Predictor Chromosome Basepair A1 A2 Wald_Stat Wald_P Effect
<char> <int> <int> <char> <char> <num> <num> <num>
1: chr7_44077165_T_C_b38 7 44077165 C T 1.1513 0.24959 0.089449
2: chr7_44078226_A_G_b38 7 44078226 G A 0.2119 0.83215 0.010901
3: chr7_44078261_G_A_b38 7 44078261 A G -0.6554 0.51224 -0.040281
SD Effect_Liability SD_Liability A1_Mean MAF tissue
<num> <lgcl> <lgcl> <num> <num> <char>
1: 0.077691 NA NA 0.169165 0.084582 Adipose_Subcutaneous
2: 0.051435 NA NA 0.447537 0.223769 Adipose_Subcutaneous
3: 0.061465 NA NA 0.278373 0.139186 Adipose_Subcutaneous
mtgene tissue_gene rsID n_e_samples
<char> <char> <char> <int>
1: ENSG00000198712 Adipose_Subcutaneous_ENSG00000198712 rs10270623 467
2: ENSG00000198712 Adipose_Subcutaneous_ENSG00000198712 rs11761589 467
3: ENSG00000198712 Adipose_Subcutaneous_ENSG00000198712 rs11768442 467
This is how I calculate SE and var_beta_values. Maybe this step is not accurate?
r_gwas$se=r_gwas$SD/sqrt(r_gwas$n_e_samples)
r_gwas$var_beta_values <- (r_gwas$se)^2
I've resolved the issue. The software I used to calculate the summary statistics mislabeled the SE column as SD. Thank you for the debugging session; it really helped me sort things out!
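A minimal sketch of how the mislabeling can be checked and corrected, using the three rows printed earlier in this thread. It assumes Wald_Stat is the effect divided by its standard error. If Effect / SD reproduces Wald_Stat, the SD column already holds standard errors, so the variance of beta is simply SD^2 with no division by sqrt(n). The names z_if_se and looks_like_se are illustrative.

```r
# The three GWAS rows shown above (only the columns needed here).
r_gwas <- data.frame(
  Effect    = c(0.089449, 0.010901, -0.040281),
  SD        = c(0.077691, 0.051435, 0.061465),
  Wald_Stat = c(1.1513, 0.2119, -0.6554)
)

# If SD is really a standard error, Effect / SD should match Wald_Stat.
z_if_se <- r_gwas$Effect / r_gwas$SD
looks_like_se <- all(abs(z_if_se - r_gwas$Wald_Stat) < 1e-3)
print(looks_like_se)

# Corrected calculation under that reading: square the SE directly.
r_gwas$se <- r_gwas$SD
r_gwas$var_beta_values <- r_gwas$se^2
```

Running the same comparison on the full table before calling coloc.abf would catch this kind of column mislabeling early.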