jvhaarst/A50-plot

N50


Plot is shaping up nicely :)

To me, adding "(longest)" to the title has little meaning or added value.

The N50 has a specific definition and always refers to 50% of the total assembly length (of each individual assembly). This number is hard to compare across assemblies, because it depends on the total assembly size, which differs for each assembly.
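A minimal sketch of that definition, assuming the contig lengths are already available as a list of integers. The function name `n50` and the toy lengths are made up for illustration and are not taken from the A50 code:

```python
def n50(lengths):
    """Return (length, index) of the contig at which the running sum
    of lengths, sorted longest first, reaches half the assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for index, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, index
    raise ValueError("no contigs given")


# Two toy assemblies sharing the same six longest contigs; the second
# only adds two small ones, yet its N50 drops.
assembly_a = [80, 70, 50, 40, 30, 20]
assembly_b = [80, 70, 50, 40, 30, 20, 10, 5]
print(n50(assembly_a))
print(n50(assembly_b))
```

The threshold moves with the total size, so the two results are not measured against the same amount of sequence.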

If you want to compare fragmentation, you should pick a fixed lookup value (e.g. 500M) and report the size and index of the contig/scaffold at which the cumulative length, counted from the longest sequence down, reaches that value. As notation, I suggest N^500M (so the lookup value as superscript).
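Roughly, the fixed-lookup idea could look like the sketch below. The name `size_at` is hypothetical, the toy numbers use a lookup of 100 instead of 500M, and returning `None` for an assembly shorter than the lookup value is my own assumption about how to handle that case:

```python
def size_at(lengths, lookup):
    """Return (length, index) of the contig at which the running sum
    of lengths, sorted longest first, reaches the fixed lookup value.
    Returns None if the whole assembly is shorter than the lookup."""
    ordered = sorted(lengths, reverse=True)
    running = 0
    for index, length in enumerate(ordered, start=1):
        running += length
        if running >= lookup:
            return length, index
    return None  # assumption: assembly never reaches the lookup value


# Same toy assemblies as one might compare in the plot: with a fixed
# lookup both are measured against the same amount of sequence.
assembly_a = [80, 70, 50, 40, 30, 20]
assembly_b = [80, 70, 50, 40, 30, 20, 10, 5]
print(size_at(assembly_a, 100))
print(size_at(assembly_b, 100))
```

Because the threshold is the same for every assembly, extra small contigs at the tail no longer shift the reported value.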

The question is what you want to report in this plot. Since it's a comparative plot, what you want to see is the total assembly size, the number of sequences, and the fragmentation (so N^500M, for example).

Good luck.

I50 is a great idea.
Total size and number of seqs are already there in the legend.

What do you mean by N^500M?