schneebergerlab/plotsr

Only the SNPs are being displayed

Closed this issue · 8 comments

Thank you for such excellent software. I am having a problem with it: I provided the data in a format similar to the example (except for the markers.bed file), but in the output image only the SNPs seem to be displayed, and the other variant information is missing. I'm not sure where I'm going wrong.
I used the following commands:

syri -c out.filtered.coords -d out.filtered.delta -r ecoli.ref.fasta -q ecoli.fasta
plotsr --sr querysyri.out.txt --genomes genomes.txt --tracks tracks.txt --cfg base.cfg -o output_plot.png -S 0.5 -W 7 -H 10 -f 8

The output image looks like this:
[image: output_plot (6)]
The files I used are attached:
[attachment: github.zip]
Looking forward to your reply, thanks!

Actually, only the SNPs track itself is drawn at the moment; no SNP data is plotted on it. Were there any error or warning messages when you ran plotsr? My best guess is that there is an issue with the ecoli.snps.sorted.bed file. Could you please check it for correctness and completeness or, if possible, share it?
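A quick way to check a BED file for common problems is sketched below. This is only an illustration, not part of plotsr: the function name `check_bed_lines` and the specific checks are assumptions. It flags records with fewer than three columns, non-integer or negative coordinates, and intervals whose start is not smaller than their end (BED intervals are 0-based and half-open, so such a record covers no bases).

```python
def check_bed_lines(lines):
    """Return (line_number, message) pairs for suspicious BED records.

    Hypothetical helper for illustration; it is not a plotsr function.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and common BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append((lineno, "fewer than 3 tab-separated columns"))
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            problems.append((lineno, "start/end are not integers"))
            continue
        if start < 0:
            problems.append((lineno, "negative start"))
        if start >= end:
            # With 0-based, half-open coordinates this interval is empty.
            problems.append((lineno, "empty interval (start >= end)"))
    return problems
```

To run it on a file, one could call `check_bed_lines(open("ecoli.snps.sorted.bed"))` and print the returned list.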

Thanks for replying so quickly! Here are my reference GFF file and the BED file (I extracted the first 5 columns of syri.out):
[attachment: add.zip]

2023-08-16 16:29:42,684 - Plotsr - INFO - Starting
readbasecfg - ERROR - font is not a valid config parameter. Using default value.
track - WARNING - Reading GFF file ecoli-r_ASM584v2_genomic.gff. Overlapping transcripts would be plotted as such without any filtering.
/data1/leisiru/software/miniconda3/envs/syri_env/lib/python3.8/site-packages/plotsr/scripts/func.py:1808: RuntimeWarning: invalid value encountered in scalar divide
  ypos = [(t*diff/tposmax)+y0 for t in tpos]
matplotlib.font_manager - WARNING - findfont: Font family 'Arial' not found.
matplotlib.font_manager - WARNING - findfont: Font family 'Arial' not found.
matplotlib.font_manager - WARNING - findfont: Font family 'Arial' not found.
matplotlib.font_manager - WARNING - findfont: Font family 'Arial' not found.
matplotlib.font_manager - WARNING - findfont: Font family 'Arial' not found.
matplotlib.font_manager - WARNING - findfont: Font family 'Arial' not found.
matplotlib.font_manager - WARNING - findfont: Font family 'Arial' not found.
matplotlib.font_manager - WARNING - findfont: Font family 'Arial' not found.
2023-08-16 16:29:43,903 - Plotsr - INFO - Plot output_plot.png generated.
2023-08-16 16:29:43,903 - Plotsr - INFO - Finished

I didn't think the font error was important, so I ignored it.

Thanks for sharing. The BED file is incorrect. BED positions are 0-based and half-open: the start position is inclusive and the end position is exclusive. This means that the line

NC_000913.3	95	95	G	T

should be

NC_000913.3	94	95	G	T

Fixing the BED file should make it work.
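The fix can be applied to the whole file with a short script. The sketch below assumes the input has the layout shown above (chromosome, start, end, then any extra columns, tab-separated) with 1-based inclusive coordinates, as copied from syri.out. The function name `one_based_to_bed` is made up for this example. Only the start needs to change: a 1-based inclusive end and a 0-based exclusive end are the same number.

```python
def one_based_to_bed(line):
    """Convert one tab-separated record from 1-based inclusive coordinates
    to BED coordinates (0-based start, exclusive end).

    Hypothetical helper; assumes columns are chrom, start, end, ...
    """
    fields = line.rstrip("\n").split("\t")
    # Shift only the start; the end value is unchanged by the conversion.
    fields[1] = str(int(fields[1]) - 1)
    return "\t".join(fields)


# The SNP record from this thread:
print(one_based_to_bed("NC_000913.3\t95\t95\tG\tT"))
```

Applied to every line of the original file, this should produce records like the corrected one above. Do not run it twice on the same file, since each pass shifts the start by one.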

And yes, the font issue is not relevant here.

Thank you so much! The SNPs are showing up now, but the indels and other variant types are still missing. These are the counts per variant type:
2 CPG
22 CPL
1119 DEL
4 DUP
4 DUPAL
222 HDR
929 INS
6 INVDP
7 INVDPAL
27 NOTAL
107271 SNP
15 SYN
307 SYNAL
2 TDM
2 TRANS
2 TRANSAL
[image: output_plot (2)]

I tried adding colour information such as cl:#c3bef0 to the last column of syri.out for each variant, but plotsr failed with KeyError: '[0, 1, 2, 5, 6, 7, 10] not in index', which looks like an indexing problem, so I abandoned that approach.

Hi @hedy-ella. Currently, tracks.txt describes only two tracks (Genes and SNPs). To visualise more tracks, you need to list the corresponding BED files in tracks.txt.
Nevertheless, you can visualise rearrangements (inversions, translocations, and duplications) as alignments. I suspect that in this case there are only small rearrangements. You can decrease plotsr's -s parameter to visualise smaller rearrangements.
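One way to prepare BED files for additional tracks is to group the syri.out records by annotation type. The sketch below is an illustration only: the function name `syri_to_bed_by_type` is invented, and it assumes that columns 1 to 3 of syri.out hold the reference chromosome, start, and end (1-based, inclusive) and that column 11 holds the annotation type (INS, DEL, ...). Records without reference coordinates are skipped.

```python
from collections import defaultdict


def syri_to_bed_by_type(lines, wanted=("INS", "DEL")):
    """Group syri.out records by annotation type as BED-style lines.

    Hypothetical helper. Assumes column 11 is the annotation type and
    columns 1-3 are reference chrom, start, end (1-based, inclusive).
    """
    beds = defaultdict(list)
    for line in lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 11 or f[10] not in wanted:
            continue
        # Skip records that have no reference coordinates.
        if f[1] == "-" or f[2] == "-":
            continue
        start, end = sorted((int(f[1]), int(f[2])))
        # Convert to BED: 0-based start, exclusive end.
        beds[f[10]].append(f"{f[0]}\t{start - 1}\t{end}")
    return dict(beds)
```

Each list in the returned dictionary can then be written to its own file (for example, one file for insertions and one for deletions) and added as a separate entry in tracks.txt.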

Thanks for the reply. I originally wanted to show insertions, duplications, etc. in the area below the track, as in your example file. After changing -s, I can show more information there. Thanks!
Have a nice day~