lmrodriguezr/nonpareil

Better visualization of Nonpareil.curve.batch for 200 samples

Jigyasa3 opened this issue · 3 comments

Hey @lmrodriguezr

I was wondering whether it's possible to use ggplot or another R visualization package to get a cleaner, publication-quality image of the Nonpareil curves for 200 samples.

My code:
$ ls 230*.npo > sample_list_batch.txt  # add the .npo filenames to a text file
R

library(Nonpareil)
sample_names <- read.table("sample_list_batch.txt", header = FALSE)
colnames(sample_names) <- c("File")  # the filenames are in the first column, called "File"
# create a new column called "Name" by stripping the filename suffix
sample_names$Name <- gsub("_R1.fastq.gz.fa-nonpareli-output.npo", "", sample_names$File)
pdf("batch_curve_plot.pdf")
np <- Nonpareil.curve.batch(sample_names$File, label = sample_names$Name, modelOnly = TRUE)
# Nonpareil.legend(np)
dev.off()

My plot:
[image]

Thanks for your help!
Jigyasa

Hello @Jigyasa3

I recommend considering these three things that could improve the visualization for you:

1. Plot model only

You're currently using an old flag (modelOnly = TRUE) that is being ignored. Use plot.observed = FALSE instead:

np <- Nonpareil.curve.batch(sample_names$File, label = sample_names$Name, plot.observed = FALSE);

2. Do not include the legend

There is no effective way to match 200 differently colored lines to their legend entries, so I'd suggest removing the legend altogether (i.e., do not call Nonpareil.legend). If your samples fall into groups that are meaningful for your manuscript, I'd suggest coloring by group instead: you can pass the colors you want to Nonpareil.curve.batch (by default, the colors are generated at random).
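As a minimal sketch of the grouping idea, the snippet below builds one color per sample from a group column. The Group column, the example filenames, and the group names are hypothetical, and it assumes the color vector is passed through an argument named col (worth confirming with ?Nonpareil.curve.batch for your installed version).

```r
# Hypothetical sample table: the Group column and these filenames are
# assumptions for illustration, not part of the original post.
sample_names <- data.frame(
  File  = c("230_A.npo", "230_B.npo", "230_C.npo", "230_D.npo"),
  Name  = c("A", "B", "C", "D"),
  Group = c("gut", "gut", "soil", "soil"),
  stringsAsFactors = FALSE
)

# One color per group, then one color per sample by looking up its group.
groups <- unique(sample_names$Group)
group_colors <- setNames(rainbow(length(groups)), groups)
sample_colors <- unname(group_colors[sample_names$Group])

# Only plot when the package and the (hypothetical) files are available.
if (requireNamespace("Nonpareil", quietly = TRUE) &&
    all(file.exists(sample_names$File))) {
  library(Nonpareil)
  # Assumes `col` is the argument that takes the per-sample colors.
  np <- Nonpareil.curve.batch(sample_names$File, labels = sample_names$Name,
                              col = sample_colors, plot.observed = FALSE)
}
```

With 200 samples, a handful of group colors is usually far easier to read than 200 random ones.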

3. Consider passing other graphical parameters

The Nonpareil.curve.batch function accepts almost any additional graphical parameter (I personally use las = 1, for example). Also, the plot remains the active one on the graphics device, so you can call other plotting functions afterwards (e.g., legend).
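The two points above can be sketched together as follows. The filenames, group names, and colors are hypothetical, and the call assumes that col and las are accepted as described in this thread; legend() itself is the standard base R graphics function.

```r
# Hypothetical inputs: replace with real .npo files and their groups.
files  <- c("230_A.npo", "230_B.npo")
groups <- c("gut", "soil")
# Hypothetical group-to-color mapping.
group_colors <- c(gut = "firebrick", soil = "steelblue")

# Only plot when the package and the (hypothetical) files are available.
if (requireNamespace("Nonpareil", quietly = TRUE) && all(file.exists(files))) {
  library(Nonpareil)
  pdf("batch_curve_plot.pdf", width = 8, height = 6)
  # las = 1 draws the axis labels horizontally.
  np <- Nonpareil.curve.batch(files, col = unname(group_colors[groups]),
                              plot.observed = FALSE, las = 1)
  # The Nonpareil plot is still active, so base graphics calls such as
  # legend() draw on top of it: one entry per group, not per sample.
  legend("bottomright", legend = names(group_colors), col = group_colors,
         lty = 1, lwd = 2, bty = "n", title = "Sample group")
  dev.off()
}
```

A per-group legend stays readable no matter how many samples are plotted.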

I hope these tips help improve the visualization.

M

Thank you for replying, @lmrodriguezr! Removing the legend does improve the plot, but it's still a bit crowded with 200 samples. Could you check the new issue I have created about this?

Thank you for your help!

Did you correct the flag to plot.observed = FALSE? That will also reduce the noise in the plot.