JaneliaSciComp/msg

Fix Stampy Issue: stampy: Error: invalid data stream - and remove ln -s workaround


Stampy seems to map the individual FASTQ files only if they have an .fq extension. Investigate and, if possible, fix what is going wrong in Stampy. Once that is fixed, remove the workaround of creating symbolic links with the .fq extension.
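For reference, the symlink workaround amounts to something like the sketch below. This is not the actual msg code: the function name `link_with_fq_extension` is hypothetical, and it assumes the link can be created next to the original file.

```python
import os


def link_with_fq_extension(path):
    """Return a name ending in .fq that points at `path`.

    Hypothetical helper illustrating the workaround; not taken from msg.
    """
    if path.endswith(".fq"):
        # Already has the extension Stampy appears to want.
        return path
    link = path + ".fq"
    if os.path.islink(link):
        # Replace a stale link left over from an earlier run.
        os.remove(link)
    # Raises FileExistsError if `link` exists as a regular file,
    # which is safer than silently overwriting real data.
    os.symlink(os.path.abspath(path), link)
    return link
```

Stampy would then be pointed at the returned name (for example `test_in_indivA12_AATAAG.fq`) instead of the original file.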

This isn't a priority since the workaround works, but it might be worth doing to (a) improve Stampy and (b) make sure there's nothing wrong with our FASTQ files.
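A quick way to check point (b), and the "could be compressed" theory from the reply quoted under Background, might look like the following. It is only a sketch: the helper names are made up for this issue, it looks at the first record only, and it assumes the usual four-line FASTQ layout. It says nothing about how Stampy itself decides what is a valid stream.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip file


def is_gzipped(path):
    """True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def first_record_looks_like_fastq(path):
    """Check the first four lines: @header, sequence, +, quality."""
    opener = gzip.open if is_gzipped(path) else open
    with opener(path, "rb") as fh:
        lines = [fh.readline().rstrip(b"\r\n") for _ in range(4)]
    header, seq, plus, qual = lines
    return (
        header.startswith(b"@")
        and len(seq) > 0
        and plus.startswith(b"+")
        and len(seq) == len(qual)
    )
```

Running both functions on `test_in_indivA12_AATAAG` would show whether the file is compressed and whether its first record is well formed.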

Background:

Hi,

The line that causes the error simply says "line = self.infile.readline()", so you must be pointing Stampy to an invalid file, somehow. It could be compressed perhaps; it's not likely a Stampy problem.

Renaming to .fq might be your best bet - although I don't understand what's going on.

Best wishes
Gerton.

On 16 Feb 2012, at 18:48, Pinero, Gregory wrote:

Hi There,

I'm trying to run stampy on this input file (it's inside the tar.gz file I linked to below.)

It seems to work if I rename the input file to end with .fq but not otherwise, even though I'm specifying the input format.

I'd appreciate it if you could take a look. Let me know if there's anything I can do to help.

I believe I've included all of the necessary files:

stampy_data.tar.gz

Thanks,

Greg Pinero

Here is the command I ran and the output and error I got:

[login2 - pinerog@e02u19]~/msg_work/MSG_toy>stampy.py -v3 --bwaoptions="-q10 parent1_ref.fa" -g parent1_ref.fa.stampy.msg -h parent1_ref.fa.stampy.msg --inputformat=fastq -M test_in_indivA12_AATAAG -o test_out.sam
stampy: Starting Stampy with the following options:
genome=parent1_ref.fa.stampy.msg
logfile=stderr
hash=parent1_ref.fa.stampy.msg
outputformat=sam
output=test_out.sam
inputformat=fastq
stats=mapstats.cache
qualitybase=!
recaldatasuffix=.recaldata
bwaoptions=-q10 parent1_ref.fa
bwa=bwa
verbosity=3
bits=-1
maxcount=200
maxscore=99999
minposterior=-99999
numrecords=-1
lowqthreshold=10
seed=1
insertsize=250
insertsd=60
maxfingerprintvariants=3
linearalignmentband=3
simulate-minindellen=0
simulate-maxindellen=0
tryvariants=-1
fastaqual=30
simulate-numsubstitutions=0
gapopen=40
gapextend=3
recalscoreprefix=20
svprior=55
longindelprior=40
baseentropy=5
banding=60
xa-max=0
xa-max-discordant=0
insertsize2=-2000
insertsd2=-1
padding=160
maxpairseeds=25
paircandlikethres=100
bwamaxmismatch=-1
bwabatchsize=50000
recalfraction=0.01
substitutionrate=0.001
stampy: Opening genome file parent1_ref.fa.stampy.msg.stidx
stampy: Opening hash file parent1_ref.fa.stampy.msg.sthash
stampy: Using BWAVersion: 0.5.7 (r1310) for pre-mapping
stampy: Mapping...
stampy: Traceback:
File "/usr/local/msg/bin/stampy/stampy.py", line 701, in <module>
main()
File "/usr/local/msg/bin/stampy/stampy.py", line 669, in main
mapreads( settings, logger, actiondict['-M'], arguments )
File "/usr/local/msg/bin/stampy/stampy.py", line 474, in mapreads
for output in formatgenerator: pass
File "/Net/fs1/home/gerton/Progs/Mapper/stampy/Stampy/formatter.py", line 115, in formatter
File "/Net/fs1/home/gerton/Progs/Mapper/stampy/plugins/bwa.py", line 147, in generator
File "/Net/fs1/home/gerton/Progs/Mapper/stampy/Stampy/reader.py", line 138, in generator

stampy: Error: invalid data stream