EDTA 2.2.2 Docker image does not find LTR_FINDER_parallel
Closed this issue · 3 comments
$ /usr/bin/time -v singularity exec ./EDTA222.sif EDTA.pl \
    --genome NbT2T \
    --threads 12 \
    --overwrite 0 \
    --anno 1 \
    --sensitive 1 \
    --evaluate 1 \
    --cds Nb-cds-NCBI.fa
#########################################################
Extensive de-novo TE Annotator (EDTA) v2.2.2
Shujun Ou (shujun.ou.1@gmail.com)
#########################################################
Parameters: --genome NbT2T --threads 12 --overwrite 0 --anno 1 --sensitive 1 --evaluate 1 --cds Nb-cds-NCBI.fa
Wed Dec 4 00:57:58 CST 2024 Dependency checking:
All passed!
A CDS file Nb-cds-NCBI.fa is provided via --cds. Please make sure this is the DNA sequence of coding regions only.
Wed Dec 4 00:58:12 CST 2024 Obtain raw TE libraries using various structure-based programs:
Wed Dec 4 00:58:12 CST 2024 EDTA_raw: Check dependencies, prepare working directories.
Wed Dec 4 00:58:26 CST 2024 Start to find LTR candidates.
Wed Dec 4 00:58:26 CST 2024 Identify LTR retrotransposon candidates from scratch.
Can't open perl script "/usr/local/share/EDTA/bin/LTR_FINDER_parallel/LTR_FINDER_parallel": No such file or directory
cat: NbT2T.mod.finder.combine.scn: No such file or directory
Error: cd-hit-est is not found in the CDHIT path !
awk: fatal: cannot open file `NbT2T.mod.pass.list' for reading: No such file or directory
Warning: LOC list - is empty.
perl rename_LTR_skim.pl target_sequence.fa LTR_retriever.defalse
Error: Error while loading sequence
Filter sequence based on TEsorter classifications. Unclassified sequences will also be output to the clean file.
Usage: perl cleanup_misclas.pl sequence.fa.rexdb.cls.tsv
Author: Shujun Ou (shujun.ou.1@gmail.com) 10/11/2019
mv: cannot stat 'NbT2T.mod.LTR.intact.fa.ori.dusted.cln.cln': No such file or directory
mv: cannot stat 'NbT2T.mod.LTR.intact.fa.ori.dusted.cln.cln.list': No such file or directory
cp: cannot stat 'NbT2T.mod.LTR.intact.raw.fa.anno.list': No such file or directory
ERROR: No such file or directory at /usr/local/share/EDTA/bin/output_by_list.pl line 39.
perl filter_gff3.pl file.gff3 file.list > new.gff3
Wed Dec 4 01:10:24 CST 2024 Warning: The LTR result file has 0 bp!
Wed Dec 4 01:10:24 CST 2024 Start to find SINE candidates.
Wed Dec 4 04:06:21 CST 2024 Finish finding SINE candidates.
Wed Dec 4 04:06:21 CST 2024 Start to find LINE candidates.
Wed Dec 4 04:06:21 CST 2024 Identify LINE retrotransposon candidates from scratch.
^CCommand terminated by signal 2
User time (seconds): 6.94
System time (seconds): 6.27
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 11:05:24
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 554204
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 409
Minor (reclaiming a frame) page faults: 286175
Voluntary context switches: 5742
Involuntary context switches: 22434
Swaps: 0
File system inputs: 5860558
File system outputs: 5565448
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
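The first two errors in the log above (the missing LTR_FINDER_parallel script and the cd-hit-est message) appear to be what the later "No such file or directory" lines follow from. A minimal sketch for checking whether those files exist at all is given here. The RUNNER variable and the check_file/check_cmd helpers are hypothetical names introduced for illustration, not part of EDTA. Looking up cd-hit-est on PATH is only a first approximation, since the message refers to a "CDHIT path" that may be configured separately.

```shell
#!/usr/bin/env bash
# RUNNER is an assumption: a command prefix that runs the check somewhere else,
# e.g. RUNNER="singularity exec ./EDTA222.sif" to look inside the image.
# Left empty, the checks run in the current environment.
RUNNER="${RUNNER:-}"

check_file() {
    # $1: path that is expected to exist
    if $RUNNER test -e "$1"; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

check_cmd() {
    # $1: program that is expected to be on PATH
    if $RUNNER sh -c 'command -v "$1" >/dev/null 2>&1' _ "$1"; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}
```

With RUNNER set to `singularity exec ./EDTA222.sif`, calling `check_file /usr/local/share/EDTA/bin/LTR_FINDER_parallel/LTR_FINDER_parallel` and `check_cmd cd-hit-est` would show whether the image itself lacks these files, as opposed to a path problem at run time.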
Hi, I am working with the latest Singularity image, which I pulled today, and I am facing the same error. I think it's related to this bug.
The RepeatMasker configure script must be run from
inside the RepeatMasker installation directory:
/usr/local/share/RepeatMasker
Perhaps this is not the "configure" you are looking for?
#########################################################
Extensive de-novo TE Annotator (EDTA) v2.2.2
Shujun Ou (shujun.ou.1@gmail.com)
#########################################################
Parameters: --genome /work/projects/ms-amishra44/amishra44/genome.fa --cds /work/projects/ms-amishra44/amishra44/pan_te/data/sra_long/EDTA/test/genome.cds.fa --curatedlib /work/projects/ms-amishra44/amishra44/pan_te/data/sra_long/EDTA/database/rice7.0.0.liban --exclude /work/projects/ms-amishra44/amishra44/pan_te/data/sra_long/EDTA/test/genome.exclude.bed --overwrite 1 --sensitive 1 --anno 1 --threads 10 --force 1 --repeatmasker /usr/local/share/RepeatMasker
Mon Mar 10 14:08:47 CDT 2025 Dependency checking:
All passed!
A custom library /work/projects/ms-amishra44/amishra44/pan_te/data/sra_long/EDTA/database/rice7.0.0.liban is provided via --curatedlib. Please make sure this is a manually curated library but not machine generated.
A CDS file /work/projects/ms-amishra44/amishra44/pan_te/data/sra_long/EDTA/test/genome.cds.fa is provided via --cds. Please make sure this is the DNA sequence of coding regions only.
A BED file is provided via --exclude. Regions specified by this file will be excluded from TE annotation and masking.
Mon Mar 10 14:08:53 CDT 2025 Obtain raw TE libraries using various structure-based programs:
Mon Mar 10 14:08:53 CDT 2025 EDTA_raw: Check dependencies, prepare working directories.
Mon Mar 10 14:09:10 CDT 2025 Start to find LTR candidates.
Mon Mar 10 14:09:10 CDT 2025 Identify LTR retrotransposon candidates from scratch.
Error: cd-hit-est is not found in the CDHIT path !
awk: fatal: cannot open file `genome.fa.mod.pass.list' for reading: No such file or directory
Warning: LOC list - is empty.
perl rename_LTR_skim.pl target_sequence.fa LTR_retriever.defalse
Error: Error while loading sequence
Filter sequence based on TEsorter classifications. Unclassified sequences will also be output to the clean file.
Usage: perl cleanup_misclas.pl sequence.fa.rexdb.cls.tsv
Author: Shujun Ou (shujun.ou.1@gmail.com) 10/11/2019
mv: cannot stat 'genome.fa.mod.LTR.intact.fa.ori.dusted.cln.cln': No such file or directory
mv: cannot stat 'genome.fa.mod.LTR.intact.fa.ori.dusted.cln.cln.list': No such file or directory
cp: cannot stat 'genome.fa.mod.LTR.intact.raw.fa.anno.list': No such file or directory
ERROR: No such file or directory at /usr/local/share/EDTA/bin/output_by_list.pl line 39.
perl filter_gff3.pl file.gff3 file.list > new.gff3
Mon Mar 10 14:09:19 CDT 2025 Warning: The LTR result file has 0 bp!
Mon Mar 10 14:09:19 CDT 2025 Start to find SINE candidates.
Mon Mar 10 14:10:12 CDT 2025 Warning: The SINE result file has 0 bp!
Mon Mar 10 14:10:12 CDT 2025 Start to find LINE candidates.
Mon Mar 10 14:10:12 CDT 2025 Identify LINE retrotransposon candidates from scratch.
Mon Mar 10 14:11:30 CDT 2025 Warning: The LINE result file has 0 bp!
Mon Mar 10 14:11:30 CDT 2025 Start to find TIR candidates.
Mon Mar 10 14:11:30 CDT 2025 Identify TIR candidates from scratch.
Species: others
Mon Mar 10 14:12:13 CDT 2025 Finish finding TIR candidates.
Mon Mar 10 14:12:13 CDT 2025 Start to find Helitron candidates.
Mon Mar 10 14:12:13 CDT 2025 Identify Helitron candidates from scratch.
Mon Mar 10 14:12:52 CDT 2025 Finish finding Helitron candidates.
Mon Mar 10 14:12:52 CDT 2025 Execution of EDTA_raw.pl is finished!
Mon Mar 10 14:12:52 CDT 2025 Obtain raw TE libraries finished.
All intact TEs found by EDTA:
genome.fa.mod.EDTA.intact.raw.fa
genome.fa.mod.EDTA.intact.raw.gff3
Mon Mar 10 14:12:52 CDT 2025 Perform EDTA advance filtering for raw TE candidates and generate the stage 1 library:
Warning: No sequences were masked
Mon Mar 10 14:16:04 CDT 2025 EDTA advance filtering finished.
Mon Mar 10 14:16:04 CDT 2025 Perform EDTA final steps to generate a non-redundant comprehensive TE library.
Filter RepeatModeler results that are ignored in the raw step.
Mon Mar 10 14:16:14 CDT 2025 Clean up TE-related sequences in the CDS file with TEsorter.
Remove CDS-related sequences in the EDTA library.
Remove CDS-related sequences in intact TEs.
Mon Mar 10 14:22:35 CDT 2025 Combine the high-quality TE library rice7.0.0.liban with the EDTA library:
Mon Mar 10 14:23:39 CDT 2025 EDTA final stage finished! You may check out:
The final EDTA TE library: genome.fa.mod.EDTA.TElib.fa
Family names of intact TEs have been updated by rice7.0.0.liban: genome.fa.mod.EDTA.intact.gff3
Comparing to the provided library, EDTA found these novel TEs: genome.fa.mod.EDTA.TElib.novel.fa
The provided library has been incorporated into the final library: genome.fa.mod.EDTA.TElib.fa
Mon Mar 10 14:23:39 CDT 2025 Perform post-EDTA analysis for whole-genome annotation:
Mon Mar 10 14:23:39 CDT 2025 Homology-based annotation of TEs using genome.fa.mod.EDTA.TElib.fa from scratch.
Mon Mar 10 14:23:58 CDT 2025 TE annotation using the EDTA library has finished! Check out:
Whole-genome TE annotation (total TE: 34.61%): genome.fa.mod.EDTA.TEanno.gff3
Whole-genome TE annotation summary: genome.fa.mod.EDTA.TEanno.sum
Whole-genome TE divergence plot: genome.fa.mod_divergence_plot.pdf
Whole-genome TE density plot: genome.fa.mod.EDTA.TEanno.density_plots.pdf
Low-threshold TE masking for MAKER gene annotation (masked: 17.27%): genome.fa.mod.MAKER.masked
Mon Mar 10 14:23:58 CDT 2025 Evaluate the level of inconsistency for whole-genome TE annotation:
Mon Mar 10 14:24:05 CDT 2025 Evaluation of TE annotation finished! Check out these files:
Overall: genome.fa.mod.EDTA.TE.fa.stat.all.sum
Nested: genome.fa.mod.EDTA.TE.fa.stat.nested.sum
Non-nested: genome.fa.mod.EDTA.TE.fa.stat.redun.sum
If you want to learn more about the formatting and information of these files, please visit:
https://github.com/oushujun/EDTA/wiki/Making-sense-of-EDTA-usage-and-outputs---Q&A