amplab/snap

Snap aligner FASTQ record larger than buffer

Closed this issue · 3 comments

Hi developers of snap,

Introduction

I am a bioinformatician who works at Applied-Maths. I really like snap for mapping reads to bacterial genomes! Congratulations on this outstanding mapper.

Problem

I run my mapping jobs on a server. When I submit a job, the fastq files are gzipped for faster transfer. However, when I ran snap on the server, it gave an error on these fastq.gz files: "FASTQ record larger than buffer size at /../test_2.fastq.gz:4885027"

I suppose this error is similar to https://www.biostars.org/p/278787/

Solution

When I tried to replicate the mapping on my own local linux system, however, I did not get any errors. To pinpoint the problem, I compared the fastq files on my own system with those on the server using md5sum and diff. It turned out that my code had removed the newline at the end of the fastq file before compressing it with gzip and sending it to the server (my bad). When I re-added this newline at the end of the fastq before compressing, the problem was gone.
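For anyone who wants to run the same check on their own files, here is a minimal Python sketch of it. It is not part of snap; the function names `ends_with_newline` and `gzip_with_trailing_newline` are made up for illustration, and it assumes the input is a plain gzip file that the standard `gzip` module can read.

```python
import gzip

CHUNK = 1 << 20  # read 1 MiB at a time so large files are not loaded into memory


def ends_with_newline(gz_path):
    """Return True if the decompressed content of gz_path ends with a newline byte."""
    last = b""
    with gzip.open(gz_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(CHUNK), b""):
            last = chunk[-1:]  # remember only the final byte seen so far
    return last == b"\n"


def gzip_with_trailing_newline(src_path, dst_path):
    """Compress the plain-text file src_path to dst_path, appending a newline if missing."""
    last = b""
    with open(src_path, "rb") as fin, gzip.open(dst_path, "wb") as fout:
        for chunk in iter(lambda: fin.read(CHUNK), b""):
            fout.write(chunk)
            last = chunk[-1:]
        # Only add a newline to non-empty files that do not already end with one.
        if last and last != b"\n":
            fout.write(b"\n")
```

Calling `ends_with_newline` on the fastq.gz that triggers the error shows whether the final newline is missing; if it is, recompressing the original fastq with `gzip_with_trailing_newline` gives a file that ends the way my working local copy did.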

I will change my own code, but I also think it would be good if snap accepted fastq.gz files that lack a newline at the end of the fastq.

Keep up the good work!
Klaas Mensaert

Any updates on this? I encounter the same error when I try to align nanopore reads. I'm also using gzipped fastq inputs. I found some older issues online that suggest decompressing the files first, but the SNAP website says it supports gzipped fastq, so I don't think this is the problem. For reference, I am using SNAP 1.0beta.18 for Linux (64-bit), and I get the following error:

Loading index from directory... 0s.  17414383 bases, seed size 22
Aligning.
FASTQ record larger than buffer size at /home/kchan/thesis/raw_data/SRR7690687.fastq.gz:8388608
SNAP exited with exit code 1 from line 255 of file SNAPLib/FASTQ.cpp