Issues with plotting: OTU not found
naurasd opened this issue · 3 comments
Hi,
I ran swarm clustering with d = 13 (yes, very large, I know; I am trying a few parameters others have used. I am working with animal COI metabarcoding data with high intra-specific variability). It went smoothly.
I would like to plot the 3rd OTU. I am running the following command (an adjusted version of the one in your paper's supplementary material, Supp1):
python graph_plot.py -s statistics.txt -i internal.txt -o 3
The statistics.txt and internal.txt files are the ones created by the -s and -i parameters during the initial clustering step with swarm. I am leaving out the -d parameter for now; as far as I can see, it defaults to zero when not defined.
However, I get this error message:
python graph_plot.py -s statistics.stats -i internal.struct -o 3
Error: OTU does not exists or contains only one element.
Reading target OTU
Parsing amplicon relationships
Why does the 3rd OTU not exist? I have more than 9,000 OTUs.
This is what my statistics.txt file looks like (first rows shown as an example). Ignore the bold font in the first row.
20 | 481765 | ASV1 | 362738 | 0 | 3 | 36 |
---|---|---|---|---|---|---|
16 | 386950 | ASV2 | 210150 | 0 | 1 | 12 |
168 | 476890 | ASV3 | 176472 | 0 | 5 | 55 |
11 | 145906 | ASV4 | 143517 | 0 | 1 | 6 |
35 | 244657 | ASV6 | 101936 | 0 | 2 | 20 |
7 | 187436 | ASV7 | 88833 | 0 | 1 | 13 |
This is what my internal.txt file looks like (first rows shown as an example). Ignore the bold font in the first row.
ASV1 | ASV20 | 2 | 1 | 1 |
---|---|---|---|---|
ASV1 | ASV41 | 1 | 1 | 1 |
ASV1 | ASV71 | 1 | 1 | 1 |
ASV1 | ASV79 | 1 | 1 | 1 |
ASV1 | ASV477 | 1 | 1 | 1 |
ASV1 | ASV1299 | 1 | 1 | 1 |
ASV1 | ASV1985 | 2 | 1 | 1 |
I would appreciate your help. The -s and -i output files are written by swarm based on your algorithm, so I don't see why my OTU isn't found. The same problem occurred when I told swarm to write the files as .stats and .struct files, as in the code from your supplementary material.
Thanks so much.
Nauras
@naurasd thank you for trying swarm.
> python graph_plot.py -s statistics.txt -i internal.txt -o 3
> The statistics.txt and the internal.txt files are the files that have been created for the -s and -i parameters when performing the initial clustering step with swarm.
Yes, graph_plot.py --internal_structure internal.txt corresponds to swarm --internal-structure internal.txt, but graph_plot.py --swarms swarms.txt corresponds to swarm --output swarms.txt (i.e. swarm's default output), not to swarm --statistics-file stats.txt.
I realize now that the mixed option names are confusing (-s for graph_plot and -o for swarm). Sorry about that.
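To make the mismatch concrete, here is a minimal Python sketch. It is not taken from graph_plot.py. It assumes the script picks the n-th line of the file passed with -s and splits it on spaces to list that OTU's amplicons. The helper name amplicons_in_otu is hypothetical. The statistics row is copied from the sample above. The swarms-style line is a simplified illustration built from the ASV1 links shown in internal.txt (real swarm output may also carry abundance annotations).

```python
def amplicons_in_otu(lines, otu_number):
    """Return the names found on the n-th line (1-based), split on spaces.

    Hypothetical helper: it mimics how a swarms file (one OTU per line,
    amplicons separated by spaces) could be read. Not graph_plot.py code.
    """
    if not 1 <= otu_number <= len(lines):
        return []
    return lines[otu_number - 1].strip().split(" ")


# First row of the statistics file shown earlier in the thread:
# seven tab-separated columns, no spaces.
statistics_lines = ["20\t481765\tASV1\t362738\t0\t3\t36"]

# Illustrative swarms-style line (swarm's default output): the seed ASV1
# followed by a few of the amplicons linked to it in internal.txt.
swarms_lines = ["ASV1 ASV20 ASV41 ASV71"]

from_statistics = amplicons_in_otu(statistics_lines, 1)
from_swarms = amplicons_in_otu(swarms_lines, 1)

print(len(from_statistics))  # the whole row is read as a single name
print(len(from_swarms))      # one entry per amplicon
```

Under that assumption, every row of a statistics file looks like an OTU with a single element, which would be consistent with the "contains only one element" error reported above.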
Thanks for getting back to me about this. I will try again then.
However, since you mention the plotting option in your paper, and since I got it wrong when trying to follow the procedure in the supplementary material, I think it would be worth adding an explanation with an example to the GitHub repository.
Cheers
Nauras
Thanks for the suggestion.
I've added an example to the help message of the graph_plot.py script (commit f3a7c87).