Limit on plotting long sequences?
Hi,
I tried installing NanoPlot with conda but it didn't finish, so I installed it as a regular app.
I wanted to make the quality score vs. sequence length plot, but it wasn't clear which options to use, so I followed one of the examples and ran the following command:
NanoPlot -t 8 --fastq Ju760.Lig.P-1.fastq --plots hex dot --maxlength 250000
WARNING: hex as part of --plots has been deprecated and will be ignored. To get the hex output, rerun with --legacy hex.
I know that the file contains a sequence longer than 200 kb:
seqkit stat Ju760.Lig.P-1.fastq
file                 format  type  num_seqs        sum_len  min_len  avg_len  max_len
Ju760.Lig.P-1.fastq  FASTQ   DNA    438,478  2,242,313,164        4  5,113.9  240,959
I wanted to check that read's quality score, but I couldn't find it in the plot, even after zooming in on the relevant length range.
Could you tell me why this sequence is not shown in the plot?
Thanks;
The plotting function randomly samples your data so that the plot shows at most 10,000 dots. More dots make plotting slower and the HTML images larger. You have more than 10,000 reads, so it seems your longest molecule was dropped by the sampling. The idea is to show the overall distribution of the dataset, which 10k reads reflect well, but outliers can be lost.
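To make the effect of the downsampling described above concrete, here is a minimal sketch. It assumes uniform random sampling without replacement down to 10,000 reads; the reply only says the data is "randomly sampled", so the exact method is an assumption, and `retention_probability` and `downsample` are hypothetical helpers, not NanoPlot functions. With the 438,478 reads reported by seqkit, any single read (including the 240,959 bp one) would have only about a 2.3% chance of appearing in the plot.

```python
import random


def retention_probability(total_reads, sample_size=10_000):
    """Chance that one specific read survives uniform sampling without replacement."""
    if total_reads <= sample_size:
        return 1.0
    return sample_size / total_reads


def downsample(lengths, sample_size=10_000, seed=None):
    """Return at most sample_size read lengths, chosen uniformly at random.

    Assumption: this mimics the behaviour described in the thread,
    not NanoPlot's actual implementation.
    """
    if len(lengths) <= sample_size:
        return list(lengths)
    rng = random.Random(seed)
    return rng.sample(list(lengths), sample_size)


if __name__ == "__main__":
    # Read count taken from the seqkit output in this thread.
    p = retention_probability(438_478)
    print(f"Chance that a given read is plotted: {p:.2%}")
```

Under these assumptions, an outlier is far more likely to be missing from the plot than present, which matches what was observed.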
Hi,
I got a fastq.gz file from a friend, and when I tried doing QC on it, NanoPlot crashed.
I installed it fresh in a new environment and used this command:
NanoPlot -t 12 --fastq JustinDMV002.fastq.gz --plots hex dot --maxlength 100000 -p JustinDMV002
Error message:
/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/nanoplotter/nanoplotter_main.py:283: UserWarning:
`distplot` is a deprecated function and will be removed in seaborn v0.14.0.
Please adapt your code to use either `displot` (a figure-level function with
similar flexibility) or `histplot` (an axes-level function for histograms).
For a guide to updating your code to use the new functions, please see
https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751
alpha=0.8))
/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/nanoplotter/nanoplotter_main.py:308: UserWarning:
`distplot` is a deprecated function and will be removed in seaborn v0.14.0.
Please adapt your code to use either `displot` (a figure-level function with
similar flexibility) or `histplot` (an axes-level function for histograms).
For a guide to updating your code to use the new functions, please see
https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751
alpha=0.8))
If you read this then NanoPlot 1.30.1 has crashed :-(
Please try updating NanoPlot and see if that helps...
If not, please report this issue at https://github.com/wdecoster/NanoPlot/issues
If you could include the log file that would be really helpful.
Thanks!
Traceback (most recent call last):
File "/home/juaguila/miniconda3/envs/pomoxis/bin/NanoPlot", line 10, in <module>
sys.exit(main())
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/nanoplot/NanoPlot.py", line 97, in main
plots = make_plots(datadf, settings)
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/nanoplot/NanoPlot.py", line 163, in make_plots
plot_settings=plot_settings)
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/nanoplotter/nanoplotter_main.py", line 135, in scatter
height=10)
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/seaborn/axisgrid.py", line 2311, in jointplot
grid.plot_joint(plt.hexbin, **joint_kws)
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/seaborn/axisgrid.py", line 1828, in plot_joint
func(self.x, self.y, **kwargs)
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/matplotlib/pyplot.py", line 2593, in hexbin
is not None else {}), **kwargs)
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/matplotlib/__init__.py", line 1565, in inner
return func(ax, *map(sanitize_sequence, args), **kwargs)
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 4802, in hexbin
collection.update(kwargs)
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/matplotlib/artist.py", line 1006, in update
ret = [_update_property(self, k, v) for k, v in props.items()]
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/matplotlib/artist.py", line 1006, in <listcomp>
ret = [_update_property(self, k, v) for k, v in props.items()]
File "/home/juaguila/miniconda3/envs/pomoxis/lib/python3.7/site-packages/matplotlib/artist.py", line 1002, in _update_property
.format(type(self).__name__, k))
AttributeError: 'PolyCollection' object has no property 'stat_func'
I don't understand the point of updating when I just installed it today.
I know the file is big (14 GB of 16 kb reads), but I thought NanoPlot only uses 10,000 reads rather than the entire dataset.
Any idea why it failed, and how I could fix it?
Thank you;
Hi,
> I don't understand the point of updating when I just installed it today.
According to the log, you are using NanoPlot 1.30.1, which is not the latest version.
> I know the file is big (14 GB of 16 kb reads), but I thought NanoPlot only uses 10,000 reads rather than the entire dataset.
That size should be totally fine.
> Any idea why it failed, and how I could fix it?
Duplicate of #347
Best,
Wouter
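Since the reply above turns on which versions are installed, a quick check along these lines may help when following up. This is a minimal sketch and not part of NanoPlot: `installed_versions` is a hypothetical helper, and it relies on `importlib.metadata`, which needs Python 3.8 or newer (the environment in the traceback is Python 3.7, where `pip list` gives the same information). The traceback shows a `stat_func` keyword being forwarded through seaborn's `jointplot` to matplotlib's `hexbin`, which rejects it, so the seaborn and matplotlib versions are worth noting alongside NanoPlot's.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Packages that appear in the traceback from this thread.
    for name, version in installed_versions(["NanoPlot", "seaborn", "matplotlib"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the same environment as NanoPlot, since a fresh environment can still resolve to an older release depending on the channel and the other packages it contains.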