malonge/RaGOO

Unconventional Use of RagTag

goeckeritz opened this issue · 3 comments

Hi there,

Recently I ran RagTag on a scaffolded assembly (created with long-read sequences and Hi-C data) from a plant allopolyploid; both suspected progenitors are extant. One progenitor has a published genome, which was created from short reads and scaffolding of a related species. It's quite possible that my scaffolded assembly is actually more contiguous than the reference of this progenitor, but that's beside the point of the question I am interested in. The other suspected progenitor does not have a published genome.

Simply put, since the allopolyploidization event is recent (estimated to be much less than 1 mya), and there is marker evidence that many regions of the genome segregate as a diploid (some don't -- this species is a segmental allotetraploid!), I was interested in using RagTag to estimate the subgenome groupings of the scaffolds. I figured that if the grouping_confidence scores of two scaffolds assigned to the same progenitor reference chromosome were substantially different, the higher-scoring one is likely derived from that progenitor. The other scaffold would then be assigned, by default, to the other progenitor.

I'm sure you can think of a number of flaws with this approach -- but the main one I am struggling with at the moment is that I don't have a great sense of how to tell when the difference between two grouping_confidence scores is substantial enough to assign the scaffolds confidently to a subgenome. I suppose I could do a t-test on the differences for the 8 groupings and see which are significant... but since you are the creator of RagTag, I was interested in what you think of this approach.
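To make the idea concrete, here is a minimal sketch of the comparison described above. Everything in it is an assumption for illustration: the scaffold names, chromosome labels, and scores are made up (they are not taken from the attached file), the `assign_subgenomes` helper is hypothetical, and the `MIN_DIFF` cutoff is an arbitrary placeholder, since choosing that cutoff is exactly the open question. It also assumes the reference chromosome for each scaffold has already been looked up from the scaffolding output.

```python
from collections import defaultdict

# Hypothetical input: (scaffold, assigned reference chromosome, grouping_confidence).
# These values are invented for illustration only.
records = [
    ("scaf_1", "chr1", 0.95),
    ("scaf_2", "chr1", 0.70),
    ("scaf_3", "chr2", 0.88),
    ("scaf_4", "chr2", 0.85),
]

# Arbitrary placeholder cutoff; how large a difference is "substantial"
# is the unresolved question, so treat this as a knob, not a recommendation.
MIN_DIFF = 0.10


def assign_subgenomes(records, min_diff=MIN_DIFF):
    """Label each scaffold pair by comparing grouping_confidence scores."""
    # Collect the scaffolds that were grouped with each reference chromosome.
    by_chrom = defaultdict(list)
    for scaffold, chrom, score in records:
        by_chrom[chrom].append((score, scaffold))

    calls = {}
    for chrom, members in by_chrom.items():
        if len(members) != 2:
            # The heuristic only makes sense for exactly two homoeologs.
            continue
        (lo_score, lo_name), (hi_score, hi_name) = sorted(members)
        if hi_score - lo_score >= min_diff:
            # Higher score: likely derived from the progenitor with a reference.
            calls[hi_name] = "reference_progenitor"
            calls[lo_name] = "other_progenitor"
        else:
            # Scores too similar to separate with this heuristic.
            calls[hi_name] = "unassigned"
            calls[lo_name] = "unassigned"
    return calls


if __name__ == "__main__":
    for name, label in sorted(assign_subgenomes(records).items()):
        print(name, label)
```

Pairs whose scores are closer than the cutoff are left unassigned rather than forced into a subgenome, which seems safer for a segmental allotetraploid where some regions may not separate cleanly.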

Ideally I would be doing a Ks comparison between the progenitor and my scaffolds, but my assembly is not yet annotated, so the coding regions haven't been picked out yet. I was hoping to label the scaffolds before doing so, but maybe I should just suck it up and name them later! I also thought about using polyCRACKER, but I'm not familiar with Docker whatsoever and the thing seemed like it would be a pain in the ass to get running.

Attached is a file containing my confidence scores. Any advice is much appreciated!

Kindly,
Charity
ragtag.confidence1_16B.txt

Hi Charity,

Could I ask you to open up this same issue on the RagTag GitHub? I eventually plan on archiving this RaGOO repo. In the meantime, I will take a look at your issue.

Thanks

Hi Michael,

Yes, done! Sorry about that!

Kindly,
Charity

no problem at all