full-length or exon typing
bsb2014 opened this issue · 9 comments
The SpecHLA publication suggests that full-length typing outperforms exon typing. I am wondering whether the reconstructed full-length gene sequences (-u 0) are better than the reconstructed exon sequences (-u 1). Do the reads from noncoding regions (introns) improve phasing? Thanks
Do I need to care about the message below that popped up during the full-length typing (-u 0)? Thanks
Use of uninitialized value $hash{"HLA_DRB1_1"} in split at /home/src/SpecHLA/script/whole/annoHLA.pl line 318.
Use of uninitialized value $hash{"HLA_DRB1_2"} in split at /home/src/SpecHLA/script/whole/annoHLA.pl line 318.
Hi, reads from noncoding regions (introns) provide linkage information between exons, which improves typing performance. Don't worry about the warning message; it has no impact.
The warning message
"Use of uninitialized value $hash{"HLA_DRB1_1"} in split at /home/src/SpecHLA/script/whole/annoHLA.pl line 318.
Use of uninitialized value $hash{"HLA_DRB1_2"} in split at /home/src/SpecHLA/script/whole/annoHLA.pl line 318." often occurs together with failed DRB1 typing. Could you please let me know what the message means? Thanks
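To check how often this warning coincides with a failed locus, the run log can be scanned for it. The following is a minimal sketch, not part of SpecHLA: the function names are hypothetical, and it assumes the `_1`/`_2` suffix in the hash key marks the two alleles of a gene, which the thread does not confirm.

```python
import re
from collections import Counter

# Matches the Perl warning quoted above and captures the hash key
# (e.g. HLA_DRB1_1), the script path, and the line number.
WARNING_RE = re.compile(
    r'Use of uninitialized value \$hash\{"(?P<key>[^"]+)"\} '
    r'in split at (?P<script>\S+) line (?P<line>\d+)\.'
)


def affected_keys(log_text):
    """Count how many times each hash key appears in the warning."""
    return Counter(m.group("key") for m in WARNING_RE.finditer(log_text))


def affected_genes(log_text):
    """Collapse keys such as HLA_DRB1_1 / HLA_DRB1_2 to a gene name.

    Assumption: the trailing _1 / _2 is an allele index.
    """
    return sorted({key.rsplit("_", 1)[0] for key in affected_keys(log_text)})


example_log = '''Use of uninitialized value $hash{"HLA_DRB1_1"} in split at /home/src/SpecHLA/script/whole/annoHLA.pl line 318.
Use of uninitialized value $hash{"HLA_DRB1_2"} in split at /home/src/SpecHLA/script/whole/annoHLA.pl line 318.'''

print(affected_genes(example_log))
```

Running this over the logs of many samples and comparing the reported genes with the loci that lack a typing result would show whether the warning is a reliable marker of a failed DRB1 call.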
Could you also explain what the "Bowtie", "Exon", "Whole.norealign", "Whole", and "Whole.SV" modes mean? Thanks
I found the answer, but it is not clear to me whether Exon = Novoalign + exon. (It would be better if a free aligner could replace Novoalign, which is not free.)
With Bowtie2 read binning + exon typing, 15-20x read coverage, and 150 bp reads, what accuracy can be expected for 2-field HLA typing? Thanks
Hi,
- The warning is caused by the strict requirement of Perl, we have removed the warning in the latest commit.
- The default parameters are Novoalign + whole + realign + no SV. So, the mode name means its difference with the default parameters. E.g., exon means Novoalign + exon + realign + no SV. realign indicates using the database to link the unphased blocks.
- We have not performed Bowtie2 + exon typing. But the accuracy of Bowtie2 + whole + 20x typing is roughly 0.8 in simulated data.
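The naming rule in the reply (each mode name states how it differs from the defaults) can be sketched as a small lookup. This is only an illustration, not SpecHLA code: the dictionary keys and function names are hypothetical, and only the "Exon" entry is spelled out in the thread. The "Bowtie", "Whole.norealign", and "Whole.SV" entries are inferred from the rule and should be treated as assumptions.

```python
# Defaults as stated in the reply: Novoalign + whole + realign + no SV.
DEFAULTS = {"aligner": "Novoalign", "region": "whole", "realign": True, "sv": False}

# Per-mode overrides of the defaults.
# Only "Exon" is confirmed by the thread; the rest are inferred (assumption).
MODE_OVERRIDES = {
    "Whole": {},
    "Exon": {"region": "exon"},
    "Bowtie": {"aligner": "Bowtie2"},
    "Whole.norealign": {"realign": False},
    "Whole.SV": {"sv": True},
}


def resolve_mode(name):
    """Return the full settings for a mode: defaults plus its overrides."""
    return {**DEFAULTS, **MODE_OVERRIDES[name]}


def describe(name):
    """Render a mode in the 'A + B + C + D' notation used in the reply."""
    s = resolve_mode(name)
    parts = [
        s["aligner"],
        s["region"],
        "realign" if s["realign"] else "no realign",
        "SV" if s["sv"] else "no SV",
    ]
    return " + ".join(parts)


for mode in MODE_OVERRIDES:
    print(f"{mode}: {describe(mode)}")
```

Under these assumptions, "Whole" is simply the default configuration, and every other mode changes exactly one setting.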
Many thanks for your helpful replies. I tested SpecHLA with Novoalign 4. Novoalign seems to treat the Illumina reads as Sanger FASTQ (see below). Is this normal? Thanks.
"# Interpreting input files as Sanger FASTQ."