errors during the run of Augustus
Hi Luigi!
I ran the pipeline (options: --stranded --proteins --short_reads --adapter --mask_genome --max_intron_length 10000) and it failed with this error:
Traceback (most recent call last):
  File "/opt/LoReAn/code/lorean.py", line 560, in <module>
    main()
  File "/opt/LoReAn/code/lorean.py", line 288, in main
    augustus_file, genemark_file = inputEvm.braker_folder_find(braker_folder)
  File "/opt/LoReAn/code/prepareEvmInputs.py", line 115, in braker_folder_find
    gff = [y for x in os.walk(location) for y in glob(os.path.join(x[0], "augustus.hints.gff"))][0]
IndexError: list index out of range
Any suggestions?
Unfortunately, I cannot provide many more details about the run, since all intermediate files (outputs of previous tools) were deleted when it crashed...
Would it be possible to keep the intermediate files in future versions and restart from where the pipeline crashed?
Thanks for your help
Michael
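For readers following the traceback: the list comprehension at line 115 takes `[0]` from the result of a recursive glob, so it raises IndexError whenever no augustus.hints.gff exists anywhere under the BRAKER folder (for example, if BRAKER stopped before AUGUSTUS produced output). A minimal sketch of a more explicit lookup is below. The function name `find_first` and the error message are illustrative assumptions, not LoReAn's actual code.

```python
import os
from glob import glob


def find_first(location, filename):
    """Return the first file called `filename` found anywhere under `location`.

    Hypothetical helper (not part of LoReAn): same search as the failing
    line, but with an explicit error instead of a bare IndexError.
    """
    matches = [
        path
        for dirpath, _dirs, _files in os.walk(location)
        for path in glob(os.path.join(dirpath, filename))
    ]
    if not matches:
        raise FileNotFoundError(
            f"no {filename} under {location}; check the BRAKER logs there"
        )
    return matches[0]
```

With a check like this, the message would point at the BRAKER folder rather than at a list index.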
I think that I solved it. I need a few days to make a new image.
That's awesome! Thanks! Looking forward to trying again...
Michael
Hi Luigi!
Any updates on this?
Best
Michael
@michieitel
I made a new image. Can you test it?
cheers
Luigi
Hi Luigi!
Thanks for creating a new image. How can I check it is the latest?
I pulled with:
singularity pull docker://lfaino/lorean:noIPRS
when running 'lorean -h' it says 2017 in the last line...!?
cheers
Michael
ah I see. thanks. will try and report
I pulled using the link above... still says 2017 ;-)
But since I got the lorean_latest.sif I assume it should be the correct version...
I guess so. I never noticed.
same error again:
Traceback (most recent call last):
  File "/opt/LoReAn/code/lorean.py", line 596, in <module>
    main()
  File "/opt/LoReAn/code/lorean.py", line 320, in main
    augustus_file, genemark_file = inputEvm.braker_folder_find(braker_folder)
  File "/opt/LoReAn/code/prepareEvmInputs.py", line 115, in braker_folder_find
    gff = [y for x in os.walk(location) for y in glob(os.path.join(x[0], "augustus.hints.gff"))][0]
IndexError: list index out of range
PLUS all intermediates were removed automatically again...
@michieitel
can you send me the singularity command?
thanks
singularity exec \
-B /home/ubuntu/cbas/lorean/config/:/opt/LoReAn/third_party/software/augustus/config/ \
-B /home/ubuntu/cbas/lorean/Libraries/:/usr/local/RepeatMasker/Libraries/ \
/home/ubuntu/cbas/lorean/lorean_latest.sif \
lorean -t 20 -sp cbas_masurca2 \
--stranded \
--proteins /home/ubuntu/cbas/lorean/data/PORI_Demo_Amphimedon_queenslandica_v2.1__P__FERNANDEZ-VALVERDE.fasta \
--short_reads /home/ubuntu/cbas/lorean/data/CBAS_Concatenated_RNAseq_Read1_Clean_Datasets.fastq,/home/ubuntu/cbas/lorean/data/CBAS_Concatenated_RNAseq_Read2_Clean_Datasets.fastq \
--long_reads /home/ubuntu/cbas/lorean/data/cbas_cDNA_polyA-guppy-3.1.5-hac.porechop_100bp_to_20kb_combined.fastq \
--adapter /home/ubuntu/cbas/lorean/data/TruSeq3-PE-2.fa \
--mask_genome \
--working_dir lorean_run1 \
--max_intron_length 40000 \
/home/ubuntu/cbas/lorean/data/CBAS_MASURCA-2_final.genome.scf.fasta 2> cbas_lorean_run1.log
We can try adding --keep_tmp to the lorean options, so the working folder will not be deleted.
Do you have the GeneMark key in the home folder of the user that runs LoReAn?
ubuntu@lorean:~/cbas/lorean$ cat ~/.gm_key
TTGTTCAATTAGCACGGATGTTTTTTTTTTTTTTTTCCGTCGCCATAAAGTTACTAACAGAATTCAAAAGGGAGCGCATA
520951310
If keeping the intermediates helps to find the error, I can run again with the suggested option...
ok
I will test the image again to see if something is wrong.
should I run it on a dummy set?
can you include a test dataset in the repo that we can both run?
it is there.
oh there is one. I will try using that one!
Not sure if I can ask this here, but would it be difficult to implement an option for passing user-defined settings to the gmap step for long reads? I am asking because I have figured out the settings that work best for my set of nanopore reads.
which setting do you mean? can you tell me the option?
on the toy dataset all works fine...
Is the /home/ubuntu/cbas/lorean/config/ folder writable?
for gmap settings I am using:
-k 15 -B 4 --cross-species -A --exons=cdna --format=samse --npaths=0 --sam-extended-cigar
not sure which of these you included...
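One way a pipeline could accept settings like these is to take them as a single quoted string and split it shell-style before appending it to the aligner command. The sketch below is hypothetical: `build_gmap_command`, its parameters, and the base command layout are assumptions for illustration and do not reflect how LoReAn actually calls gmap.

```python
import shlex


def build_gmap_command(db_dir, db_name, reads, extra_options=""):
    """Assemble a gmap argument list with optional user-supplied flags.

    Hypothetical helper: the base command layout is an assumption, and
    the extra options are passed through without validation.
    """
    command = ["gmap", "-D", db_dir, "-d", db_name]
    # shlex.split honours quoting, so values with spaces survive intact.
    command.extend(shlex.split(extra_options))
    command.append(reads)
    return command


# The settings quoted in this thread, passed as one string.
user_options = (
    "-k 15 -B 4 --cross-species -A --exons=cdna "
    "--format=samse --npaths=0 --sam-extended-cigar"
)
print(build_gmap_command("gmap_index", "genome", "long_reads.fastq", user_options))
```

Passing a list rather than a joined string to the process launcher avoids shell quoting problems with the user-supplied part.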
/home/ubuntu/cbas/lorean/config/ is accessible
I got the same error for both example datasets ...
example 1: Plicaturopsis
singularity exec \
-B /home/ubuntu/cbas/lorean/config/:/opt/LoReAn/third_party/software/augustus/config/ \
-B /home/ubuntu/cbas/lorean/Libraries/:/usr/local/RepeatMasker/Libraries/ \
/home/ubuntu/cbas/lorean/lorean_latest.sif \
lorean -a -d -f -mg -t 20 --keep_tmp -rp repeats.scaffold3.bed \
-sr /home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.short_1.fastq,/home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.short_2.fastq \
-lr /home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.long.fasta \
-pr /home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.prot.fasta \
-sp crispa \
--working_dir Plicaturopsis \
/home/ubuntu/cbas/lorean/LoReAn_Example/Crispa/scaffold3.fasta
example 2: Verticillium
singularity exec \
-B /home/ubuntu/cbas/lorean/config/:/opt/LoReAn/third_party/software/augustus/config/ \
-B /home/ubuntu/cbas/lorean/Libraries/:/usr/local/RepeatMasker/Libraries/ \
/home/ubuntu/cbas/lorean/lorean_latest.sif \
lorean -t 20 --keep_tmp -a -f -d \
-sr /home/ubuntu/cbas/lorean/LoReAn_Example/JR2/readsChr.subset.fastq \
-lr /home/ubuntu/cbas/lorean/LoReAn_Example/JR2/longReadsChr8.fastq \
-pr /home/ubuntu/cbas/lorean/LoReAn_Example/JR2/subset.prot.fasta \
-sp JR2 \
--working_dir Verticillium \
-mg /home/ubuntu/cbas/lorean/LoReAn_Example/JR2/chr8.fasta
In the braker folder inside the run folder, you should see a GeneMark folder.
Can you see any error or log file?
Can you check if you get any errors?
What I did not understand about the examples: you specify adapters with '-a' but then don't provide a file? What is the default?
There is a module that looks for them
so I don't have to specify the adaptor file? it is included in the image?
how can I access the braker folder of the image?
It is better if you specify but without it will work.
Can you find out if genemark worked in the braker folder?
how can I access the braker folder of the image?
error log for example 1:
Use of uninitialized value $epath in concatenation (.) or string at /opt/LoReAn/third_party/software/BRAKER/scripts//braker.pl line 2370.
ERROR in file /opt/LoReAn/third_party/software/BRAKER/scripts//braker.pl at line 5616
Failed to execute: perl /opt/LoReAn/third_party/software/gm_et_linux_64/gmes_petap//gmes_petap.pl --verbose --sequence=/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/genome.fa --ET=/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/genemark_hintsfile.gff --et_score 10 --max_intergenic 50000 --cores=9 --fungus 1>/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/GeneMark-ET.stdout 2>/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/errors/GeneMark-ET.stderr
looks like genemark cannot be fired up? key-related?
Can you check the .error file of genemark?
What is inside?
No .err file in the GeneMark folder. Contents:
drwxr-xr-x 6 ubuntu ubuntu 4.0K Jul 30 17:55 GeneMark-ET
-rw-r--r-- 1 ubuntu ubuntu 1.5K Jul 30 17:55 GeneMark-ET.stdout
-rw-r--r-- 1 ubuntu ubuntu 10 Jul 30 17:54 bam_header.map
-rw-r--r-- 1 ubuntu ubuntu 714 Jul 30 17:55 braker.error.log
-rw-r--r-- 1 ubuntu ubuntu 9.0K Jul 30 17:54 braker.log
drwxr-xr-x 2 ubuntu ubuntu 4.0K Jul 30 17:54 errors
-rw-r--r-- 1 ubuntu ubuntu 436K Jul 30 17:54 genemark_hintsfile.gff
-rw-r--r-- 1 ubuntu ubuntu 2.4M Jul 30 17:54 genome.fa
-rw-r--r-- 1 ubuntu ubuntu 10 Jul 30 17:54 genome_header.map
-rw-r--r-- 1 ubuntu ubuntu 436K Jul 30 17:54 hintsfile.gff
drwxr-xr-x 2 ubuntu ubuntu 4.0K Jul 30 17:54 species
This
/home/ubuntu/cbas/lorean/LoReAn_Plicaturopsis/run/braker/errors/GeneMark-ET.stderr
GeneMark.hmm eukaryotic 3
GeneMark.hmm eukaryotic 3
Your license period has ended. We hope that you found this
Your license period has ended. We hope that you found this
software useful. If you would like to renew this license,
software useful. If you would like to renew this license,
please contact GeneProbe, Inc. at custserv@genepro.com
please contact GeneProbe, Inc. at custserv@genepro.com(in cleanup) Can't call method "FETCH" on an undefined value at /usr/local/share/perl/5.22.1/Object/InsideOut.pm line 1953 during global destruction.
(in cleanup) Can't call method "FETCH" on an undefined value at /usr/local/share/perl/5.22.1/Object/InsideOut.pm line 1953 during global destruction.
(in cleanup) Can't call method "FETCH" on an undefined value at /usr/local/share/perl/5.22.1/Object/InsideOut.pm line 1953 during global destruction.
The GeneMark key has expired. You need a new one.
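Since BRAKER only reports that GeneMark failed to execute, the license message is easy to miss. Below is a small sketch that scans the stderr file named in the log above for that message. The function name and the marker string it matches are assumptions based on the output quoted in this thread.

```python
import os

# Marker copied from the GeneMark-ET.stderr output quoted in this thread.
LICENSE_MARKER = "license period has ended"


def genemark_license_expired(braker_dir):
    """Return True if GeneMark-ET.stderr under braker_dir reports an expired key.

    Hypothetical check: assumes the errors/GeneMark-ET.stderr layout seen
    in this thread; returns False when the file is missing.
    """
    stderr_path = os.path.join(braker_dir, "errors", "GeneMark-ET.stderr")
    if not os.path.isfile(stderr_path):
        return False
    with open(stderr_path, errors="replace") as handle:
        return any(LICENSE_MARKER in line for line in handle)
```

It would be called with the braker folder of a run kept with --keep_tmp, for example the run/braker directory shown in the error log.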
I feel stupid now... let me get it and run again.
many thanks for now
Hi Luigi,
The pipeline finished. It was that error. Now I am playing with the weightings of the datasets.
Many Thanks for your help!
Michael