wdecoster/NanoPlot

NanoPlot iterations produce different outputs

Closed this issue · 2 comments

ASTR44 commented

Hi,

I have noticed that the output from NanoPlot differs between iterations. The numbers in the summary statistics are the same, but the plots look different. Here are 5 examples, all made with the same command:
NanoPlot --fastq_rich Ec001_super.fastq.gz --N50 -o Ec001_super_nanoplot5

[Attached images: 1_LengthvsQualityScatterPlot_dot, 2_LengthvsQualityScatterPlot_dot, 3_LengthvsQualityScatterPlot_dot, 4_LengthvsQualityScatterPlot_dot, 5_LengthvsQualityScatterPlot_dot]

Yes, that is expected behavior. The plotting function randomly samples up to 10,000 reads for the plot (https://github.com/wdecoster/NanoPlot/blob/master/nanoplotter/nanoplotter_main.py#L107). This is mainly to keep plotting fast and the plot files small. It may lead to subtle differences in outliers between runs, but should not affect the bulk of your data.
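To illustrate why repeated runs can draw different dots, here is a minimal sketch of unseeded random downsampling. It is not NanoPlot's actual code: the `downsample` helper, its `seed` parameter, and the synthetic `reads` list are all hypothetical and exist only for this example, and the `seed` argument is not meant to imply that NanoPlot offers such an option. Only the cap of 10,000 points comes from the explanation above.

```python
import random

MAX_POINTS = 10000  # cap on plotted reads, taken from the explanation above


def downsample(reads, max_points=MAX_POINTS, seed=None):
    """Return at most max_points reads, drawn uniformly without replacement.

    Hypothetical helper for illustration only. With seed=None the generator
    is seeded from the operating system, so each call can pick a different
    subset.
    """
    if len(reads) <= max_points:
        return list(reads)
    rng = random.Random(seed)
    return rng.sample(reads, max_points)


# Synthetic (length, quality) pairs standing in for a real FASTQ file.
data_rng = random.Random(0)
reads = [
    (data_rng.randint(100, 50000), round(data_rng.uniform(5.0, 20.0), 1))
    for _ in range(50000)
]

# Two unseeded runs: same input, (almost certainly) different subsets.
run_a = downsample(reads)
run_b = downsample(reads)

# Two seeded runs: same input, identical subsets.
fixed_a = downsample(reads, seed=42)
fixed_b = downsample(reads, seed=42)

print(len(run_a), len(run_b))
print("unseeded runs identical:", run_a == run_b)
print("seeded runs identical:", fixed_a == fixed_b)
```

In this sketch any statistic computed on the full `reads` list is unaffected by the sampling, which would be consistent with the summary numbers staying the same while the scatter plots vary.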

Hope that helps!

ASTR44 commented

Okay, then it makes sense :) Thank you for the quick reply!