agshumate/Liftoff

Can this program lift over repeat features like transposons between two genomes of the same species?

Closed this issue · 9 comments

Hi authors,
Thanks for developing this very useful tool. I am trying to use it to lift over repeat features from one version of a genome assembly to another of the same species, but I have run into a problem. The run seems abnormal: it has taken over five days and I do not know when it will end. Can this tool handle repeat features like transposons?

Thanks for reading this. I am looking forward to your valuable advice.

Hi, are you using the latest version (v1.3.0)?

Hi, I am using v1.1.2. Can v1.3.0 avoid this and handle repeat features like transposons?

Hi, v1.3.0 will solve the issue of the time it takes to run. However, it has not been tested with repeat features so I cannot guarantee accuracy. If the features have many possible alignments throughout the genome, it may not map them to the correct location. It is certainly worth trying though, but please do upgrade to v1.3.0

Hi, I have tried v1.3.0 but still met an error:
"Traceback (most recent call last):
File "/data/nfs/OriginTools/pcj/python3/miniconda3/bin/liftoff", line 11, in <module>
load_entry_point('Liftoff==1.3.0', 'console_scripts', 'liftoff')()
File "/data/nfs/OriginTools/pcj/python3/miniconda3/lib/python3.7/site-packages/Liftoff-1.3.0-py3.7.egg/liftoff/run_liftoff.py", line 18, in main
File "/data/nfs/OriginTools/pcj/python3/miniconda3/lib/python3.7/site-packages/Liftoff-1.3.0-py3.7.egg/liftoff/liftover_types.py", line 12, in lift_original_annotation
File "/data/nfs/OriginTools/pcj/python3/miniconda3/lib/python3.7/site-packages/Liftoff-1.3.0-py3.7.egg/liftoff/extract_features.py", line 13, in extract_features_to_lift
File "/data/nfs/OriginTools/pcj/python3/miniconda3/lib/python3.7/site-packages/Liftoff-1.3.0-py3.7.egg/liftoff/extract_features.py", line 124, in get_gene_sequences
File "/data/nfs/OriginTools/pcj/python3/miniconda3/lib/python3.7/site-packages/Liftoff-1.3.0-py3.7.egg/liftoff/extract_features.py", line 146, in write_gene_sequences_to_file
IndexError: list index out of range"

the command is as follows:
"/data/nfs/OriginTools/pcj/python3/miniconda3/bin/liftoff -exclude_partial -p 50 -s 0.5 -a 0.5 -t Gshe.genome_chr.fa -r Gshe.genome.fa -g Gshe.repeat.gff -o Gshe.genome_chr.repeat.gff"

I do not know why

Thanks for reading and I am looking forward to your help.

Hi,
Does your gff file only contain repeat features? By default Liftoff tries to lift over 'gene' features and their children, and this error indicates to me that there are no gene features in your gff. If you want to lift over features besides genes, use -f to provide a file with the names of the features you want to lift over.
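
The check described above can be sketched in Python. This is only a minimal sketch and not part of Liftoff: the function names `count_feature_types` and `write_feature_list` are hypothetical, and writing one feature type per line is an assumption about the file format that -f expects. It tallies the feature types found in column 3 of a gff file, which shows whether any 'gene' features exist, and writes the distinct types to a text file.

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Count the feature types (column 3) across an iterable of GFF lines."""
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        # Skip blank lines, directives and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GFF feature line has 9 tab-separated columns; skip anything else
        # (for example embedded FASTA sequence).
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


def write_feature_list(counts, path):
    """Write each distinct feature type on its own line (assumed -f format)."""
    with open(path, "w") as handle:
        for feature_type in sorted(counts):
            handle.write(feature_type + "\n")
```

For example, opening Gshe.repeat.gff, passing the file handle to `count_feature_types`, and checking whether `"gene"` is among the keys would confirm the diagnosis. The file written by `write_feature_list` (say, a hypothetical feature_types.txt) could then be passed to the original command with -f feature_types.txt.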

hi,
by setting the -s and -a parameters to 1 and using -exclude_partial, you are only going to map features that are 100% identical to the reference, so even if there is a single nucleotide difference, the repeat will be considered unmapped. If you do indeed only want to count a repeat as 'mapped' if it is a perfect match, and are wondering why so many do not have perfect matches, you can get better insight into what is going on by looking at the .sam alignments from minimap2 in the intermediate_files directory.
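
One way to inspect those .sam alignments is sketched below in plain Python. It is a rough sketch under stated assumptions, not Liftoff code: the function names `summarize_sam` and `multi_mapped` are hypothetical, and it assumes the records carry the standard `NM:i:` edit-distance tag (it stores `None` when the tag is absent). It groups mapped records by query name so you can see which features align to several places and how many differences each alignment has.

```python
from collections import defaultdict


def summarize_sam(sam_lines):
    """Group mapped SAM records by query name.

    Returns {query_name: [(reference_name, position, edit_distance), ...]},
    where edit_distance is the NM tag value or None if the tag is missing.
    """
    hits = defaultdict(list)
    for line in sam_lines:
        # Header lines start with '@'.
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment line has 11 mandatory columns.
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        # Bit 0x4 marks an unmapped record.
        if flag & 0x4:
            continue
        edit_distance = None
        for tag in fields[11:]:
            if tag.startswith("NM:i:"):
                edit_distance = int(tag[5:])
                break
        hits[fields[0]].append((fields[2], int(fields[3]), edit_distance))
    return hits


def multi_mapped(hits, min_alignments=2):
    """Keep only queries with at least min_alignments mapped records."""
    return {name: recs for name, recs in hits.items()
            if len(recs) >= min_alignments}
```

A feature whose best record has a non-zero edit distance would fail a 100% identity requirement, and a feature returned by `multi_mapped` has several candidate locations. Note that an edit distance of 0 only covers the aligned portion of the sequence; a clipped alignment can still be partial.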

Hi,
Yes that is something I will add in the near future.