mgymrek/lobstr-code

Program crashing on --command train

holtgrewe opened this issue · 1 comment

I am trying to build a lobSTR model for non-PCR-free nano data. I have a BAM file generated by BWA-MEM. The program crashes as follows:


                       _______
             \\ //  /     -^--\ |
             ||||  / /\_____/ /
 {\         ______{ }        /         lobSTR: profiling short tandem repeats
 {_}{\{\{\{|         \=@____/          from high-throughput sequencing data
<{_{-{-{-{-| ====---- >>>
 { }{/{/{/{|______  _/=@_____
 {/               { }        \         Copyright (C) 2011-2014 Melissa Gymrek
            ||||  \ \______  \         <mgymrek@mit.edu>
             // \\  \    _^_\  |
                     \______/

[allelotype-4.0.0] 2016-07-01.16:07:48 ProgressMeter: Getting run info
[allelotype-4.0.0] 2016-07-01.17:39:12 WARNING: No extended reference sequence found for locus X:308108 read ST-E00127:299:HG2TYCCXX:7:2104:18568:6126
[allelotype-4.0.0] 2016-07-01.17:39:12 WARNING: No extended reference sequence found for locus X:308108 read ST-E00127:299:HG2TYCCXX:5:2216:25246:24972
[allelotype-4.0.0] 2016-07-01.17:39:12 WARNING: No extended reference sequence found for locus X:308108 read ST-E00127:299:HG2TYCCXX:8:2115:25053:57688
[allelotype-4.0.0] 2016-07-01.17:39:12 WARNING: No extended reference sequence found for locus X:308108 read ST-E00127:299:HG2TYCCXX:2:1204:2006:6513
[...]
[allelotype-4.0.0] 2016-07-01.17:44:57 WARNING: No extended reference sequence found for locus Y:28818965 read ST-E00127:299:HG2TYCCXX:2:1115:18254:52766
[allelotype-4.0.0] 2016-07-01.17:44:57 WARNING: No extended reference sequence found for locus Y:28818965 read ST-E00127:299:HG2TYCCXX:8:1212:6654:24866
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check
Aborted (core dumped)

Here is my command line:

lobSTR-bin-Linux-x86_64-4.0.0/bin/allelotype \
    --command train \
    --bam my.bam \
    --haploid X,Y \
    --strinfo lobSTR-bin-Linux-x86_64-4.0.0/GRCh37_v3.0.2/lobstr_v3.0.2_GRCh37_strinfo.tab \
    --index-prefix lobSTR-bin-Linux-x86_64-4.0.0/GRCh37_v3.0.2/lobstr_v3.0.2_GRCh37_ref/lobSTR_ \
    --noise_model illumina_v3

Thanks for sharing this. The "train" feature has not yet been updated to handle BWA-MEM-generated BAMs; for now it must be run on BAMs generated by lobSTR. I will work on a fix and update this thread when that is done.
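
Until a fix lands, it may help to confirm which aligner produced a BAM before passing it to `--command train`. The sketch below is only an illustration, not part of lobSTR: it reads the header text (for example the output of `samtools view -H my.bam`) and lists the program names recorded in the `@PG` lines. The function name `programs_in_header` and the sample header are hypothetical, and the exact `@PG` entry that lobSTR itself writes is not shown in this thread, so the check only flags BWA rather than confirming lobSTR.

```python
def programs_in_header(header_text):
    """Return the program names recorded in the @PG lines of a SAM/BAM header."""
    programs = []
    for line in header_text.splitlines():
        if not line.startswith("@PG"):
            continue
        # After the record type, header fields are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        # PN (program name) is optional in the SAM spec, so fall back to ID.
        name = fields.get("PN") or fields.get("ID")
        if name:
            programs.append(name)
    return programs


# Hypothetical header, made up for illustration (not taken from the BAM in this issue).
example_header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:X\tLN:155270560\n"
    "@PG\tID:bwa\tPN:bwa\tVN:0.7.12\tCL:bwa mem ref.fa reads.fq\n"
)

programs = programs_in_header(example_header)
if any(p.lower().startswith("bwa") for p in programs):
    print("BAM was aligned with BWA; 'train' currently expects a lobSTR-generated BAM")
```

If the list contains `bwa`, the BAM is of the kind that triggers the crash above, and the reads would need to be realigned with lobSTR before training.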