Advanced diplotype clustering orders genes in the CNV section by label rather than by genomic position
Closed this issue · 8 comments
If the cnv_region contains multiple genes, the CNV heatmap rows are ordered by gene label rather than by genomic position. This makes it harder to understand the structure of the CNVs. It would be better to order the rows by genomic position.
Here's an example:
af1.plot_diplotype_clustering_advanced(
    region='X:8,438,477-8,460,887',
    snp_transcript='LOC125764232_t1',
    cnv_region='X:8,418,477-8,480,887',
    sample_sets=['1232-VO-KE-OCHOMO-VMF00044', '1231-VO-MULTI-WONDJI-VMF00043', '1236-VO-TZ-OKUMU-VMF00090'],
    sample_query="country in ['Kenya', 'Uganda', 'Tanzania'] and taxon == 'funestus'",
)
Interesting, didn't notice this!
Just had a look at your example in the Af1 GFF: the genes are already in order of genomic position, although LOC125764275 (the middle gene) is on the reverse strand.
No, I don't think so. Here are the three genes in the region I wanted to show CNV data for...
The middle gene should be LOC125764232 but it's not.
Actually, maybe the problem is that the GFF isn't sorted...
Suggested fix is to sort the GFF when it is loaded within the genome_features() function.
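To illustrate the suggested fix, here is a minimal sketch of sorting a GFF-like table by position after loading. It is not the actual genome_features() implementation. The helper name sort_features_by_position, the column names (contig, start, end, ID), the placeholder gene IDs and the coordinates are all assumptions made up for the example.

```python
import pandas as pd


def sort_features_by_position(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: order features by contig, then start, then end."""
    return df.sort_values(["contig", "start", "end"]).reset_index(drop=True)


# Placeholder IDs and made-up coordinates, not real Af1 annotations.
# The rows are deliberately out of genomic order, as in an unsorted GFF.
features = pd.DataFrame(
    {
        "contig": ["X", "X", "X"],
        "start": [300, 100, 200],
        "end": [350, 150, 250],
        "ID": ["gene_a", "gene_b", "gene_c"],
    }
)

# Ordering by label, which is the behaviour reported in this issue.
by_label = features.sort_values("ID")["ID"].tolist()

# Ordering by genomic position, which is the behaviour requested.
by_position = sort_features_by_position(features)["ID"].tolist()

print(by_label)
print(by_position)
```

If the sort is applied once at load time, any downstream code that keeps the row order of the features table (for example, when building the CNV heatmap rows) would follow genomic position without further changes. Code that re-sorts by label afterwards would still need to be adjusted.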
Did this ever get resolved? @leehart @alimanfoo
I can do it.