agshumate/Liftoff

Results of different patterns are inconsistent

Closed this issue · 5 comments

Hello,

I found an inconsistency between different modes in Liftoff. I first ran Liftoff with the -chroms parameter and then ran it without -chroms. In the first case, some genes such as 'evm.TU.15.540' (the name comes from EVidenceModeler and means gene No. 540 on chromosome 15) were not lifted over successfully from the reference genome. In the second case, this gene was lifted over successfully, with both coverage and sequence_ID equal to 1.0. I have no idea why this happened. Could you help me?

Hi,
Did this gene lift over to chr15 in the target genome or somewhere else? It is not unexpected for genes to map slightly differently with and without -chroms. In the first case, Liftoff maps genes chromosome by chromosome and only then tries to map the genes that failed to the rest of the genome. In the second case, Liftoff maps genes to their best location regardless of chromosome. Since Liftoff does not allow genes to map to overlapping loci, the priority of which gene to keep at which locus differs between these two options. For example, with -chroms, a gene from ref chr14 maps to chr14 in the target with 99% identity. A gene from chr15 in the reference fails to map to chr15 in the target, but instead maps to this same chr14 locus with 100% identity. In the first case, the chr14 gene will be kept because it is on the correct chromosome, and the chr15 gene will be considered unmapped. In the second case, the chr15 gene will be kept because it has a higher sequence identity and Liftoff does not care about chromosomes. I hope this makes sense!
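The chr14/chr15 example can be illustrated with a toy model. This is only a sketch of the priority idea described in this comment, not Liftoff's actual code: the `Candidate` class, the `resolve` function, the gene names, and the coordinates are all hypothetical, and the real tool works from alignments rather than a pre-made candidate list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible placement of a reference gene in the target (hypothetical model)."""
    gene: str
    ref_chrom: str
    target_chrom: str
    start: int
    end: int
    identity: float


def overlaps(a, b):
    """True if two candidates occupy overlapping intervals on the same target chromosome."""
    return a.target_chrom == b.target_chrom and a.start <= b.end and b.start <= a.end


def resolve(candidates, by_chrom):
    """Greedily keep non-overlapping placements in priority order.

    by_chrom=True  roughly mimics -chroms: same-chromosome placements win first.
    by_chrom=False ranks purely by sequence identity.
    """
    if by_chrom:
        def priority(c):
            return (c.ref_chrom != c.target_chrom, -c.identity)
    else:
        def priority(c):
            return -c.identity

    kept = []
    for cand in sorted(candidates, key=priority):
        if any(k.gene == cand.gene for k in kept):
            continue  # this gene already has a placement
        if any(overlaps(cand, k) for k in kept):
            continue  # locus is taken by a higher-priority gene
        kept.append(cand)
    return {c.gene: c for c in kept}


# Made-up coordinates: both genes compete for the same chr14 locus.
candidates = [
    Candidate("gene_from_chr14", "chr14", "chr14", 1000, 2000, 0.99),
    Candidate("gene_from_chr15", "chr15", "chr14", 1000, 2000, 1.00),
]

with_chroms = resolve(candidates, by_chrom=True)
without_chroms = resolve(candidates, by_chrom=False)
print(sorted(with_chroms))     # the chr14 gene keeps the locus
print(sorted(without_chroms))  # the chr15 gene keeps the locus
```

In both runs the gene that loses the locus has no other candidate placement, so in this toy model it ends up unmapped.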

Sorry, I didn't explain my question clearly. In the first case, the gene 'evm.TU.15.540' didn't lift over to the target genome at all; I couldn't find it anywhere. I had actually set a very strict threshold in both cases (-a 0.9 -s 0.9). After reading your suggestions, I re-ran the first case without -a and -s, but this gene was still not present, so I am still confused. As you said, a gene should be placed on the correct chromosome in the first case rather than in the second, but what I see looks like the opposite: in the first case, with the default coverage and identity thresholds, 'evm.TU.15.540' couldn't find a place to lift over, but in the second case, this gene was placed on the correct chromosome (from reference chr15 to target chr15).

If you look at the GFF from the first case at the locus where you expect evm.TU.15.540 to be (based on where it mapped in the second case), is there a different gene mapped there? If so, what are its coverage and sequence identity? If your data is available or you are able to share it with me, I would be happy to take a look.
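One way to run this check is a short script that lists the gene features overlapping a given interval and reports their coverage and sequence_ID attributes. The sketch below assumes a standard tab-separated GFF3 file with these two attributes in column 9; the function names, the chromosome name, and the coordinates in the demo lines are made up for illustration.

```python
def parse_attributes(field):
    """Parse a GFF3 column-9 string such as 'ID=x;coverage=1.0' into a dict."""
    attrs = {}
    for part in field.strip().split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def genes_at_locus(gff_lines, chrom, start, end, feature_type="gene"):
    """Return (ID, start, end, coverage, sequence_ID) for features overlapping chrom:start-end."""
    hits = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[0] != chrom or cols[2] != feature_type:
            continue
        f_start, f_end = int(cols[3]), int(cols[4])
        if f_start <= end and start <= f_end:  # closed-interval overlap test
            attrs = parse_attributes(cols[8])
            hits.append((attrs.get("ID"), f_start, f_end,
                         attrs.get("coverage"), attrs.get("sequence_ID")))
    return hits


# Demo lines with invented coordinates; in practice, pass an open file handle
# for the GFF produced by the -chroms run.
demo_gff = [
    "##gff-version 3\n",
    "chr15\tLiftoff\tgene\t1000\t2000\t.\t+\t.\tID=evm.TU.15.541;coverage=1.0;sequence_ID=1.0\n",
    "chr15\tLiftoff\tgene\t5000\t6000\t.\t-\t.\tID=evm.TU.15.542;coverage=0.98;sequence_ID=0.97\n",
]

hits = genes_at_locus(demo_gff, "chr15", 1200, 1800)
for gene_id, g_start, g_end, coverage, seq_id in hits:
    print(gene_id, g_start, g_end, coverage, seq_id)
```

The interval to query would be the coordinates of evm.TU.15.540 in the GFF from the run without -chroms.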

Thanks for the suggestion. There is indeed a gene, 'evm.TU.15.541', at the same location, with both coverage and sequence_ID equal to 1.0. Maybe this explains why 'evm.TU.15.540' cannot be lifted over: in the first case, 'evm.TU.15.541' is the better fit at this locus. I am sorry that I can't share my data, as it has not been published yet. Thanks a lot for your kind help!

No problem! When two genes map equally well to the same place, Liftoff will only allow one gene to map there (it tries to use gene order to pick which gene to keep). It will try to map the other gene to another locus, but if it cannot find a mapping, the gene will be considered unmapped. So this seems to be a case where your target assembly has one fewer copy of that particular gene than the reference.