ERROR: Incorrect FASTA header format
schorlton opened this issue · 1 comments
schorlton commented
Hi @kmnip,
Thanks for your support on my other issues. Here's another interesting one. I'm pretty sure the input FASTQ is valid, and I suspect this is again a case of too few reads (or reads that are too short) triggering an invalid-FASTA error. Thanks for your help!
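For anyone wanting to back up the "input FASTQ is valid" claim before digging further, here is a minimal sketch of a sanity check. It assumes plain, uncompressed, four-line FASTQ records; the function name `summarize_fastq` is made up for illustration and is not part of RNA-Bloom.

```python
import io


def summarize_fastq(handle):
    """Check four-line FASTQ records and return (read_count, sorted_lengths).

    Hypothetical helper for illustration; assumes unwrapped FASTQ
    (one sequence line and one quality line per record).
    """
    lengths = []
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@"):
            raise ValueError(f"header does not start with '@': {header!r}")
        if not plus.startswith("+"):
            raise ValueError(f"separator does not start with '+': {plus!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        lengths.append(len(seq))
    return len(lengths), sorted(lengths)
```

Calling it on an open handle to `filtered.fastq` and printing the count plus the shortest and longest lengths should show quickly whether the input is well formed and how short the reads really are.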
root@06a8b6dc9fba:/data/retry# rnabloom -outdir rnabloom_out -t 8 -long filtered.fastq -ntcard
RNA-Bloom v1.4.3
args: [-outdir, rnabloom_out, -t, 8, -long, filtered.fastq, -ntcard]
name: rnabloom
outdir: rnabloom_out
WARNING: Output directory does not exist!
Created output directory at `rnabloom_out`
K-mer counting with ntCard...
Running command: `ntcard -t 8 -k 17 -c 65535 -p rnabloom_out/rnabloom @rnabloom_out/rnabloom.ntcard.readslist.txt`...
Parsing histogram file `rnabloom_out/rnabloom_k17.hist`...
Unique k-mers (k=17): 2,368
Unique k-mers (k=17,c>1): 192
K-mer counting completed in 3.973s
Bloom filters Memory (GB)
====================================
de Bruijn graph: 5.232985E-6
k-mer counting: 3.3946708E-6
====================================
Total: 8.627656E-6
> Stage 1: Construct graph from reads (k=17)
Parsing `filtered.fastq`...
Parsed 41 sequences in 0.013s
DBG Bloom filter FPR: 1.56 %
Counting Bloom filter FPR: 0.81 %
> Stage 1 completed in 0.024s
> Stage 2: Correct long reads for "rnabloom"
Parsing `filtered.fastq`...
Corrected Read Lengths Sampling Distribution (n=26)
min q1 med q3 max
18 23 63 92 213
Parsed 41 sequences.
Kept: 26 (63.4 %)
Discarded: 15 (36.6 %)
Corrected reads in 0.292s
Extracting seed sequences...
Bloom filter FPR: 0.0119 %
before: 1 after: 1 (100.0 %)
Extraction completed in 0.104s
> Stage 2 completed in 0.397s
> Stage 3: Assemble long reads for "rnabloom"
ERROR: Incorrect FASTA header format
rnabloom.io.FileFormatException: Incorrect FASTA header format
at rnabloom.io.FastaReader.nextWithComment(FastaReader.java:240)
at rnabloom.RNABloom.splitFastaByLength(RNABloom.java:5269)
at rnabloom.RNABloom.main(RNABloom.java:7083)
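The exception is raised while reading a FASTA file inside `splitFastaByLength`, so one way to narrow this down is to scan the FASTA files under the output directory for lines that break the basic layout. The sketch below is a generic check (every record starts with a `>` line that carries a name); it is not the check performed by `rnabloom.io.FastaReader`, whose exact rules are not shown in this thread, and `find_bad_fasta_headers` is a hypothetical name.

```python
import io


def find_bad_fasta_headers(handle):
    """Return (line_number, text) pairs for lines that break basic FASTA layout.

    Generic illustration only; RNA-Bloom's own header rules may be stricter.
    """
    problems = []
    seen_header = False
    for number, line in enumerate(handle, start=1):
        text = line.rstrip("\r\n")
        if not text:
            continue  # ignore blank lines
        if text.startswith(">"):
            if not text[1:].strip():
                problems.append((number, text))  # header with no name
            seen_header = True
        elif not seen_header:
            problems.append((number, text))  # sequence data before any header
    return problems
```

An empty result means only that the file passes this basic test. With so few reads surviving Stage 2, it may also be worth checking whether any intermediate FASTA file is empty.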
kmnip commented
This bug is fixed. Please see my new release of RNA-Bloom v2.0.0: https://github.com/bcgsc/RNA-Bloom/releases/tag/v2.0.0