broadinstitute/oncotator

M2 Phasing information needs to be integrated into infer ONP

Closed this issue · 10 comments

--infer-onp needs to take into account the phasing information.

Basic solution:

  1. Check that phasing information exists for the variant (FORMAT: PGT). Existence is enough; the value does not need to be inspected.
  2. Compare the phase ID (FORMAT: PID) between SNPs. The string should be identical if the variants are in the same phase.
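
A minimal sketch of the two checks above, not Oncotator's actual code. It assumes each variant's per-sample FORMAT fields have already been parsed into a dict; the helper names `has_phasing` and `same_phase` are hypothetical. Following the steps as written, it only tests PGT for existence and compares PID for equality.

```python
from typing import Mapping, Optional

# Values treated as "no phasing info" (assumption: "." is the VCF missing value).
MISSING = (None, "", ".")


def has_phasing(fmt: Mapping[str, Optional[str]]) -> bool:
    """Step 1: phasing info exists if PGT and PID are both present and non-missing."""
    return fmt.get("PGT") not in MISSING and fmt.get("PID") not in MISSING


def same_phase(fmt_a: Mapping[str, Optional[str]],
               fmt_b: Mapping[str, Optional[str]]) -> bool:
    """Step 2: two variants share a phase only if both are phased and their PIDs match."""
    if not (has_phasing(fmt_a) and has_phasing(fmt_b)):
        return False
    return fmt_a["PID"] == fmt_b["PID"]


# Hypothetical FORMAT values for three variants in one sample.
snp1 = {"GT": "0/1", "PGT": "0|1", "PID": "1000_A_T"}
snp2 = {"GT": "0/1", "PGT": "0|1", "PID": "1000_A_T"}
snp3 = {"GT": "0/1"}  # no phasing fields at all

print(same_phase(snp1, snp2))
print(same_phase(snp1, snp3))
```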

@ldgauthier @kcibul I need to confirm that the phasing example VCF is also public data. Can one of you confirm?

That's from one of the TCGA LUAD samples. I don't know what the status of that data is.

@ldgauthier Then not public.

@ldgauthier and @kcibul I anonymized it manually

@kcibul and @ldgauthier: Let's say I have two consecutive variants (in genomic space), i.e. adjacent SNPs. The first has phasing info and the second has none at all. Oncotator will treat these as two separate phases and will NOT generate an ONP.
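
To make the scenario concrete, here is a rough sketch of phase-aware grouping, not the real `--infer-onp` implementation. The `Snp` type and `group_onp_candidates` function are hypothetical, and all variants are assumed to be on one chromosome in one sample. One further assumption that this thread does not settle: two adjacent SNPs that both lack a PID are also kept separate here.

```python
from typing import List, NamedTuple, Optional


class Snp(NamedTuple):
    pos: int
    ref: str
    alt: str
    pid: Optional[str]  # FORMAT/PID value; None when phasing info is absent


def group_onp_candidates(snps: List[Snp]) -> List[List[Snp]]:
    """Group adjacent SNPs that share the same non-missing PID."""
    groups: List[List[Snp]] = []
    for snp in sorted(snps, key=lambda s: s.pos):
        prev = groups[-1][-1] if groups else None
        if (prev is not None
                and snp.pos == prev.pos + 1   # adjacent in genomic space
                and snp.pid is not None       # current SNP is phased
                and snp.pid == prev.pid):     # same phase as the previous SNP
            groups[-1].append(snp)
        else:
            groups.append([snp])              # start a new phase/allele
    return groups


# First SNP is phased, second has no phasing info: two groups, so no ONP.
mixed = group_onp_candidates([
    Snp(100, "A", "T", "100_A_T"),
    Snp(101, "C", "G", None),
])

# Both SNPs carry the same PID: one group, a candidate ONP.
phased = group_onp_candidates([
    Snp(100, "A", "T", "100_A_T"),
    Snp(101, "C", "G", "100_A_T"),
])

print(len(mixed), len(phased))
```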

I think that's fine. If the assembly engine isn't confident that they're phased (e.g. reads don't span for whatever reason or the adjacent SNPs weirdly end up in different active regions) then neither am I. Are you concerned about consistency?

This was just to confirm. I think that should be treated as a separate phase/allele.

Lee Lichtenstein
Broad Institute

Agreed.

@ldgauthier is there ever going to be phasing information on a 0/0 genotype? Just want to confirm...

I'm pretty sure we won't get that information from the assembly. In the highly unlikely event that it's there in some case, we can ignore it for now.