res$summary shows PP.H2.abf=1 AND PP.H4.abf=1

Question

res$summary shows PP.H2.abf=1 AND PP.H4.abf=1

Closed this issue 2 years ago · 4 comments

Some background on what I am doing - due to the nature of our study a custom approach was taken to construct credible sets (i.e. without using SusieR). The approach we took gives us quite narrow credible sets and we have Bayes factors for all the variants in each credible set.

We are studying 9 different traits and we expect them to have some overlapping loci. We now have a long list of partially overlapping credible sets from different trait pairs. I wanted to use coloc to determine if the partially overlapping sets are truly colocalising or not.

We have Bayes factors for the variants in our credible set so I take the following approach:

create vectors of Bayes factors for the 2 credible sets
subset to only those variants that are shared between the credible sets

This gives 2 equal length vectors of Bayes factors for the 2 loci, but I have had to throw away non-intersecting variants (I'm not sure if this is a problem?).

Now it is trivial to simply apply coloc.bf_bf to these 2 vectors. In many cases this gives answers that make sense upon manual inspection.

But for a large number of cases the results summary shows that:

(PP.H2.abf=1 OR PP.H1.abf=1) AND PP.H4.abf=1.

How is this possible? How can 2 mutually exclusive hypotheses be completely certain?

Answer 1 · 2022-11-08T19:52:18.000Z

well no, that shouldn't happen. The way coloc works, is that you would need Bayes factors for all snps in a region, not just those in credible sets, and certainly not just those in both credible sets. So it doesn't sound like you are using it right. But it still shouldn't produce two hypotheses with posteriors of 1. I can't reproduce this behaviour. Can I ask what the vectors of Bayes factors look like?

…

________________________________ From: zainomarali ***@***.***> Sent: 08 November 2022 17:10 To: chr1swallace/coloc ***@***.***> Cc: Subscribed ***@***.***> Subject: [chr1swallace/coloc] res$summary shows PP.H2.abf=1 AND PP.H4.abf=1 (Issue #105) Some background on what I am doing - due to the nature of our study a custom approach was taken to construct credible sets (i.e. without using SusieR). The approach we took gives us quite narrow credible sets and we have Bayes factors for all the variants in each credible set. We are studying 9 different traits and we expect them to have some overlapping loci. We now have a long list of partially overlapping credible sets from different trait pairs. I wanted to use coloc to determine if the partially overlapping sets are truly colocalising or not. We have Bayes factors for the variants in our credible set so I take the following approach: 1. create vectors of Bayes factors for the 2 credible sets 2. subset to only those variants that are shared between the credible sets This gives 2 equal length vectors of Bayes factors for the 2 loci, but I have had to throw away non-intersecting variants (I'm not sure if this is a problem?). Now it is trivial to simply apply coloc.bf_bf to these 2 vectors. In many cases this gives answers that make sense upon manual inspection. But for a large number of cases the results summary shows that: (PP.H2.abf=1 OR PP.H1.abf=1) AND PP.H4.abf=1. How is this possible? How can 2 mutually exclusive hypotheses be completely certain? — Reply to this email directly, view it on GitHub<https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fchr1swallace%2Fcoloc%2Fissues%2F105&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7Cac77ea2f5adc4b5dff9d08dac1ac3875%7C49a50445bdfa4b79ade3547b4f3986e9%7C1%7C0%7C638035242667013316%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=qU%2FgFpfEnf2joUxIQ0lhvo86AJj%2B4JXXLu6ndx9kYj4%3D&reserved=0>, or unsubscribe<https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAQWR2BBO2O5ZS4IINH2HZTWHKCQTANCNFSM6AAAAAAR2RSCSU&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7Cac77ea2f5adc4b5dff9d08dac1ac3875%7C49a50445bdfa4b79ade3547b4f3986e9%7C1%7C0%7C638035242667013316%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=UNIlCE0ggx6VUQENT%2FQUHAM9eModWR4Pu3ooVr7cIKM%3D&reserved=0>. You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

Answer 2 · 2022-11-09T10:46:50.000Z

Yes I can give you one example

a_vec = [4.567277e+26, 3.942818e+25]
b_vec = [7.097575e+45, 2.174417e+44]

if you run coloc.bf_bf on these vectors you should reproduce the behavior I described. Unfortunately I cannot share very complex examples with many markers or share my code directly as everything is on a protected research network.

Thank you for your clarification on how to use coloc! I will ask our statistician for the other Bayes Factors (I was only provided BF for variants in credible sets)

Answer 3 · 2022-11-09T11:14:09.000Z

these values are huge. I think the issue is that the docs for coloc.bf_bf say these should be Bayes factors, but in fact they should be log Bayes factors. Your example works if I use log(a), log(b). I think the solution is that I update the docs - would you agree?

…

________________________________ From: zainomarali ***@***.***> Sent: 09 November 2022 10:47 To: chr1swallace/coloc ***@***.***> Cc: Chris Wallace ***@***.***>; Comment ***@***.***> Subject: Re: [chr1swallace/coloc] res$summary shows PP.H2.abf=1 AND PP.H4.abf=1 (Issue #105) Yes I can give you one example a_vec = [4.567277e+26, 3.942818e+25] b_vec = [7.097575e+45, 2.174417e+44] if you run coloc.bf_bf on these vectors you should reproduce the behavior I described. Unfortunately I cannot share very complex examples with many markers or share my code directly as everything is on a protected research network. Thank you for your clarification on how to use coloc! I will ask our statistician for the other Bayes Factors (I was only provided BF for variants in credible sets) — Reply to this email directly, view it on GitHub<https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fchr1swallace%2Fcoloc%2Fissues%2F105%23issuecomment-1308560694&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C2264063e1399446cc0cd08dac23fd3f1%7C49a50445bdfa4b79ade3547b4f3986e9%7C1%7C0%7C638035876638467170%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=bMk5s7mpgA7Lesgh3wChRiQXtub4G%2FFpfYGZK6gxLS0%3D&reserved=0>, or unsubscribe<https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAQWR2FD7LJAJKJZK77KCG3WHN6KJANCNFSM6AAAAAAR2RSCSU&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C2264063e1399446cc0cd08dac23fd3f1%7C49a50445bdfa4b79ade3547b4f3986e9%7C1%7C0%7C638035876638467170%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=1%2BXBjSwTxxI1%2F7VmWDVeokdvhCeWF5OZZ5Ig0e9Lozg%3D&reserved=0>. You are receiving this because you commented.Message ID: ***@***.***>

Answer 4 · 2022-11-09T11:53:56.000Z

Yes that sounds great! I will retry with log Bayes Factors and with the BFs for the entire region, thank you so much for your quick reply Chris!