sagc-bioinformatics/mgikit

Illumina header conversion creates corrupted files

Closed this issue · 4 comments

Hi, I am having problems with demultiplexed files that were produced without the --disable-illumina option. Every now and then a header is parsed incorrectly, which leaves that read with 5 lines in the FASTQ file instead of the usual 4.

See the '@;H' header line in the screenshot: the actual read name ends up on the next line.

[Screenshot: Screen Shot 2024-02-26 at 09 14 01]
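For anyone who wants to check their own output for the same symptom, here is a minimal sketch of a FASTQ structure check. It is not part of mgikit; the function name `find_malformed_records` is made up for illustration, and it assumes plain 4-line records (no wrapped sequence lines). It walks the file record by record and reports the 1-based line number wherever a header, separator, or sequence/quality length does not fit, then moves one line forward to try to resynchronise.

```python
import gzip


def find_malformed_records(lines):
    """Return 1-based line numbers where a valid 4-line FASTQ record does not start.

    Hypothetical helper for illustration only; assumes unwrapped 4-line records.
    """
    bad = []
    i = 0
    n = len(lines)
    while i < n:
        complete = i + 3 < n
        ok = (
            complete
            and lines[i].startswith("@")              # header line
            and lines[i + 2].startswith("+")          # separator line
            and len(lines[i + 1]) == len(lines[i + 3])  # sequence vs. quality length
        )
        if ok:
            i += 4  # well-formed record, jump to the next header
        else:
            bad.append(i + 1)  # report the offending line
            i += 1             # shift by one line and try to resynchronise
    return bad


def check_fastq(path):
    """Read a plain or gzipped FASTQ file and list suspicious line numbers."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        lines = [line.rstrip("\n") for line in handle]
    # Ignore trailing blank lines so they are not reported as broken records.
    while lines and lines[-1] == "":
        lines.pop()
    return find_malformed_records(lines)
```

Calling `check_fastq("sample_R1.fastq.gz")` (the file name is just a placeholder) returns an empty list for a clean file. A stray header line like the '@;H' one above should show up as a single reported line number, since the real record that follows it still parses.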

@xsvato01: Which version did you use? You can see it in the log. Please share the log if possible.

This was an issue with V0.1.3. Version V0.1.4 should resolve it! Please confirm whether it works for you.

Sweet! Everything is fine with V0.1.4.

Not sure if this behaviour changed in V0.1.4, but I did not like that in V0.1.3 basecalling would not start if most barcodes were not detected. I would welcome this as a warning, but not as a hard stop (e.g. when you are doing partial demultiplexing or debugging).

Otherwise great tool! Finally a tool with proper docs. The official MGISplitBarcode is awful.

@xsvato01 You can use --ignore-undetermined to make it just a warning. It is not heavily tested, but it should work.