teflon error
Closed this issue · 17 comments
Hi,
When I run the TEFLoN analysis with the command
python3 mcclintock.py
-r /work/mcclintock/test/sacCer2.fasta
-c /work/mcclintock/test/sac_cer_TE_seqs.fasta
-g /work/mcclintock/test/reference_TE_locations.gff
-t /work/mcclintock/test/sac_cer_te_families.tsv
-1 /data/mcclintock/test/SRR800842_1.fastq.gz
-2 /data/mcclintock/test/SRR800842_2.fastq.gz
-p 10
-m teflon
-o /data/mcclintock/test/output/
I got the error
Job counts:
count jobs
1 make_consensus_fasta
1 make_reference_fasta
1 make_te_annotations
1 setup_reads
1 summary_report
1 teflon_post
1 teflon_preprocessing
1 teflon_run
8
Environment defines Python version < 3.5. Using Python of the master process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.5 only.
Environment defines Python version < 3.5. Using Python of the master process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.5 only.
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20
[Thu Jan 14 18:10:01 2021]
Error in rule teflon_run:
jobid: 2
output: /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/genotypes/sample.genotypes.txt
conda-env: /work/mcclintock/install/envs/conda/c707b3e8
RuleException:
CalledProcessError in line 49 of /work/mcclintock/snakefiles/teflon.snakefile:
Command 'source /opt/conda/envs/mcclintock/bin/activate '/work/mcclintock/install/envs/conda/c707b3e8'; set -euo pipefail; /opt/conda/envs/mcclintock/bin/python3.7 /data/mcclintock/test/output/snakemake/3802957/.snakemake/scripts/tmpifvv9aex.teflon_run.py' returned non-zero exit status 1.
File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
File "/work/mcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
File "/opt/conda/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper
Exiting because a job execution failed. Look above for error message
snakemake --use-conda --conda-prefix /work/mcclintock/install/envs/conda --quiet --configfile /data/mcclintock/test/output/snakemake/config/config_3802957.json --cores 10 /data/mcclintock/test/output/SRR800842_1/results/teflon/SRR800842_1_teflon_nonredundant.bed /data/mcclintock/test/output/SRR800842_1/results/summary/data/run/summary_report.txt
Is this a bug in the TEFLoN software, or should I use some extra option in the command?
Thank you,
tj
Hi @tomaszjacek,
can you post the contents of the TEFLoN specific log? That should make it easier for me to determine what is going wrong. Based on the paths in the error you posted, the TEFLoN log should be at: /data/mcclintock/test/output/log/*/teflon.log
Thanks,
Preston
I'm sorry, I don't know how to attach the file. Is it possible here?
So I have to paste it.
The teflon.log file is 1135 lines long, with "Processed 990100 reads..." repeated many times,
but it ends with an error.
Thank you,
tj
[M::mem_process_seqs] Processed 990100 reads in 83.325 CPU sec, 8.546 real sec
[M::process] read 990100 sequences (100000100 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (217, 401435, 74, 95)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (61, 132, 672)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1894)
[M::mem_pestat] mean and std.dev: (313.10, 375.83)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 2505)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (276, 301, 320)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (188, 408)
[M::mem_pestat] mean and std.dev: (298.18, 33.74)
[M::mem_pestat] low and high boundaries for proper pairs: (144, 452)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (257, 3703, 9499)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27983)
[M::mem_pestat] mean and std.dev: (4134.85, 3903.56)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 37225)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (495, 753, 1247)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2751)
[M::mem_pestat] mean and std.dev: (747.34, 386.19)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3503)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 86.595 CPU sec, 8.862 real sec
[M::process] read 990100 sequences (100000100 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (205, 396794, 77, 95)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (62, 140, 510)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1406)
[M::mem_pestat] mean and std.dev: (314.52, 367.90)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1854)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (275, 301, 319)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (187, 407)
[M::mem_pestat] mean and std.dev: (297.60, 34.11)
[M::mem_pestat] low and high boundaries for proper pairs: (143, 451)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (271, 4322, 8277)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 24289)
[M::mem_pestat] mean and std.dev: (3993.38, 3576.53)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 32295)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (449, 703, 1217)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2753)
[M::mem_pestat] mean and std.dev: (687.53, 371.81)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3521)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 92.206 CPU sec, 9.404 real sec
[M::process] read 918116 sequences (92729716 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (211, 394908, 65, 89)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (71, 135, 446)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1196)
[M::mem_pestat] mean and std.dev: (211.70, 216.78)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1571)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (274, 300, 319)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (184, 409)
[M::mem_pestat] mean and std.dev: (296.91, 34.69)
[M::mem_pestat] low and high boundaries for proper pairs: (139, 454)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (285, 2584, 9521)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27993)
[M::mem_pestat] mean and std.dev: (3933.18, 3790.37)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 37229)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (404, 643, 1227)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2873)
[M::mem_pestat] mean and std.dev: (683.83, 464.21)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3696)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 92.694 CPU sec, 9.479 real sec
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (174, 337492, 61, 93)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (69, 131, 548)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1506)
[M::mem_pestat] mean and std.dev: (310.03, 353.80)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1985)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (271, 298, 317)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (179, 409)
[M::mem_pestat] mean and std.dev: (294.40, 35.79)
[M::mem_pestat] low and high boundaries for proper pairs: (133, 455)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (308, 2984, 9472)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27800)
[M::mem_pestat] mean and std.dev: (4027.59, 3658.31)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 36964)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (513, 721, 809)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1401)
[M::mem_pestat] mean and std.dev: (719.77, 315.99)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1984)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 918116 reads in 97.453 CPU sec, 9.857 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 10 -Y /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered//teflon.prep_MP/teflon.mappingRef.fa /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_1.fq /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_2.fq
[main] Real time: 389.589 sec; CPU: 3788.028 sec
bwa mem -t 10 -Y /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered//teflon.prep_MP/teflon.mappingRef.fa /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_1.fq /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_2.fq > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sam
samtools view -Sb /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sam > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.bam
[bam_sort_core] merging from 20 files...
samtools sort -@ 10 -o /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.bam
samtools index /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam
awk: line 1: syntax error at or near *
Calculating alignment statistics
cmd: samtools stats -t /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/teflon.genomeSize.txt /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam
cmd: samtools depth -Q 20 /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam | awk '{sum+=$3; sumsq+=$3*$3} END {print "Average = ",sum/NR; print "Stdev = ",sqrt(sumsq/NR - (sum/NR)**2)}' > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.cov.txt
Insert size standard deviation estimated as 45. Use the override option if you suspect this is incorrect!
Warning: coverage could not be estimated, enter coverage manually
python /work/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 10 -q 20
Traceback (most recent call last):
File "/work/mcclintock/install/tools/teflon/teflon_collapse.py", line 165, in <module>
main()
File "/work/mcclintock/install/tools/teflon/teflon_collapse.py", line 103, in main
samples.append([line.split()[0], line.split()[1], [readLen, insz, sd, total_n,cov,cov_sd]])
UnboundLocalError: local variable 'cov' referenced before assignment
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20
-bash-4.2$ wc -l teflon.log
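The traceback above fails on `cov`, which was presumably never assigned because the coverage file written by the awk command was empty (see the "coverage could not be estimated" warning). Below is a minimal sketch of a parser that stops with an explicit message instead. It assumes the file holds the two lines that the awk command in the log prints ("Average = ..." and "Stdev = ..."). The function name `parse_coverage` is made up for illustration, and this is not how teflon_collapse.py actually reads the file.

```python
from typing import Dict, Iterable


def parse_coverage(lines: Iterable[str]) -> Dict[str, float]:
    """Read 'Average = x' / 'Stdev = y' lines into a dict keyed by lowercase label."""
    values: Dict[str, float] = {}
    for line in lines:
        # Split "Average =  12.5" into label and value around the first "=".
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no "=" on this line, so it cannot hold a statistic
        try:
            values[key.strip().lower()] = float(value)
        except ValueError:
            continue  # value was empty or not numeric
    # Fail loudly if either statistic is absent (for example, an empty file).
    missing = {"average", "stdev"} - set(values)
    if missing:
        raise ValueError("coverage file is missing: " + ", ".join(sorted(missing)))
    return values
```

With an empty coverage file this raises a ValueError naming the missing statistics, which points at the failed awk step rather than at an unassigned variable further down.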
@tomaszjacek: thanks for your feedback on running McClintock. You can attach files by clicking on the bottom bar of the comment box and navigating in your finder/explorer and uploading. Alternatively, you can drag and drop files of select types into the comment box and it will upload automatically. See more here: https://docs.github.com/en/free-pro-team@latest/github/managing-your-work-on-github/file-attachments-on-issues-and-pull-requests
Hi, when I run McClintock as follows:
python3 ${MCK}/mcclintock.py --reference ../10-reference/HaSCD2.fa \
--consensus ../10-reference/Hadb-families_rename.fa \
--first ../20-NGS/${K}/${K}_1.fastq \
--second ../20-NGS/${K}/${K}_2.fastq \
--proc 48 \
--out ${K} \
--locations ./TE_annotations/HaSCD2/reference_te_locations/unaugmented_inrefTEs.gff \
--taxonomy ./TE_annotations/HaSCD2/te_taxonomy/unaugmented_taxonomy.tsv
I got some errors related to TEFLoN, as follows:
Error in rule teflon_run:
jobid: 20
output: /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/genotypes/sample.genotypes.txt
conda-env: /home/dell/biosoft/mcclintock/install/envs/conda/54b8d4d7
RuleException:
CalledProcessError in line 49 of /home/dell/biosoft/mcclintock/snakefiles/teflon.snakefile:
Command 'source /home/dell/miniconda3/envs/mcclintock/bin/activate '/home/dell/biosoft/mcclintock/install/envs/conda/54b8d4d7'; set -euo pipefail; /home/dell/miniconda3/envs/mcclintock/bin/python3.7 /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/snakemake/1571076/.snakemake/scripts/tmpc34m4ip0.teflon_run.py' returned non-zero exit status 1.
File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
File "/home/dell/biosoft/mcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper
The teflon.log is as follows:
writing TE bed files...
writing TE bed files completed!
reducing search space...
cmd: samtools view -@ 4 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_complete.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b
search space succesfully reduced...
new reduced bam file: /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.sam_files/mega_complete.bam
clustering TE positions...
[ ================================================== ] 100.00%
clustering TE positions completed!
final reduction of search space...
cmd: samtools view -@ 4 -q 20 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b
Error running samtools: p.returncode = 1
python /home/dell/biosoft/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/ -d /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.prep_TF/ -s /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 4 -q 20
python /home/dell/biosoft/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/ -d /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.prep_TF/ -s /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 4 -q 20
When I run the samtools view command manually as
samtools view -@ 4 -q 20 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b
I got the following error:
[bed_read] Parse error reading "/home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed" at line 63797
samtools view: Could not read file "/home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed"
So I looked at line 63797 of /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed, which is:
4007749
It contains just one coordinate; maybe a start or an end?
Meanwhile, I found another potential error:
chr19 4007485 Unchr32 651720 651859
It seems to be a chimeric record.
So, could the error above occur during the clustering of TE positions?
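To locate every such line, rather than only the first one that samtools reports, a small scan of the BED file might help. This is only a sketch: the function name `find_malformed_bed_lines` is hypothetical, and it checks just the minimal rules assumed here (at least three whitespace-separated fields, integer start and end, start not greater than end). It is not how samtools or TEFLoN validate the file.

```python
from typing import Iterable, List, Tuple


def find_malformed_bed_lines(
    lines: Iterable[str], min_fields: int = 3
) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for each line that fails the basic checks."""
    problems: List[Tuple[int, str]] = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        # Assumption: blank lines and header/comment lines are tolerated and skipped.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < min_fields:
            problems.append(
                (number, f"expected at least {min_fields} fields, found {len(fields)}")
            )
            continue
        start, end = fields[1], fields[2]
        if not (start.isdigit() and end.isdigit()):
            problems.append((number, "start and end must be non-negative integers"))
        elif int(start) > int(end):
            problems.append((number, "start is greater than end"))
    return problems
```

It can be called as `find_malformed_bed_lines(open("mega_clustered.bed"))`. Given the two suspicious lines quoted above, it flags the single-value line for having one field and the chimeric-looking line for a non-integer end coordinate.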
- based on the information posted, it looks like you are having issues with the test data on TEFLoN. I've run with this dataset many times without issue which suggests that the problem is likely unrelated to the data, but rather an issue with the environment
- When I look through the log you posted, I see that the first error is:
awk: line 1: syntax error at or near *
- I do not have an awk interpreter included in the TEFLoN conda environment, so any awk commands internal to TEFLoN would be using the awk interpreter installed on your system.
- I've been running with GNU Awk 4.2.1 and haven't had issues:
$ awk --version
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 3.1.6-p2, GNU MP 6.1.2)
Copyright (C) 1989, 1991-2018 Free Software Foundation.
- TEFLoN has two lines where awk is used:
$ grep "awk" *.py
teflon.v0.4.py: cmd="""%s depth -Q %s %s | awk '{sum+=$3; sumsq+=$3*$3} END {print "Average = ",sum/NR; print "Stdev = ",sqrt(sumsq/NR - (sum/NR)**2)}' > %s""" %(exeSAM, str(qual), bam, covFILE)
$ grep "awk" teflon_scripts/*.py
teflon_scripts/subsample_alignments.py: cmd="""%s depth -Q %s %s | awk '{sum+=$3; sumsq+=$3*$3} END {print "Average = ",sum/NR; print "Stdev = ",sqrt(sumsq/NR - (sum/NR)**2)}' > %s""" %(exePATH, str(qual), bamFILE, covFILE)
- Both use ** to denote an exponent instead of ^. After some googling, I found that this is apparently not compatible with all awk interpreters and may cause issues with mawk, which is used by some Linux OSes. https://stackoverflow.com/questions/9913368/awk-syntax-errors-with-double-star
- Indeed, I was able to replicate your error if I switched my default gawk interpreter to mawk (https://anaconda.org/bioconda/mawk)
- I think the easiest solution is to include gawk in the TEFLoN conda environment to ensure that users are using the same awk interpreter.
- @tomaszjacek I'll update the TEFLoN environment yaml and test that it is working properly. Then I'll let you know when it's ready for you to try out. Hopefully this will resolve your issue.
- @zhjpeng (#76 (comment)) I have seen this issue before as well. It seems to be sample dependent. Most of my McClintock runs with TEFLoN do not have this issue, but for some specific samples the mega_clustered.bed is malformed.
- I am fairly certain this is a bug in TEFLoN and not related to mcclintock, so I am going to work on replicating this bug outside of McClintock with just TEFLoN. Then I'll open an issue on the actual TEFLoN repository (https://github.com/jradrion/TEFLoN) to see if their developers know what is going on.
- I'll let you know when I've posted the issue
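On the awk portability point raised earlier in this comment: as far as I know, ^ is the exponent operator that POSIX awk defines, while ** is an extension that gawk accepts and some other interpreters (mawk, as reported here) reject. Another way to avoid depending on the awk interpreter would be to compute the same two statistics in Python. The sketch below is not what TEFLoN does. The function name `depth_mean_and_stdev` is made up for illustration, and it assumes the three-column chrom/pos/depth text that samtools depth prints.

```python
import math
from typing import Iterable, Tuple


def depth_mean_and_stdev(depth_lines: Iterable[str]) -> Tuple[float, float]:
    """Mean and population standard deviation of column 3, as in the awk one-liner."""
    total = 0.0
    total_sq = 0.0
    n = 0
    for line in depth_lines:
        fields = line.split()
        if len(fields) < 3:
            # Unlike awk's NR, lines without a depth column are not counted here.
            continue
        depth = float(fields[2])
        total += depth
        total_sq += depth * depth
        n += 1
    if n == 0:
        raise ValueError("no depth records found")
    mean = total / n
    # Same formula as sumsq/NR - (sum/NR)^2, clamped at zero against rounding error.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)
```

The explicit error for empty input mirrors the case in the log where samtools depth produced nothing usable and the coverage estimate was silently left blank.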
Thanks for your reply. I am running mcclintock on more samples and will check whether other samples have similar errors.
Thank you,
tj
- @tomaszjacek I've updated the mcclintock master branch b61563e with the change to the TEFLoN environment that now includes gawk. You should be able to update your mcclintock repository with a git pull. Then you should do a clean install with mcclintock.py --install, which will install TEFLoN with the updated conda environment.
- Let me know if this resolves the bug you were experiencing earlier.
- @zhjpeng: @pbasting posted an issue on the TEFLoN repository with the bug you and he have encountered: jradrion/TEFLoN#8
- I'm going to close this issue for now, since @tomaszjacek's original issue has been resolved and the TEFLoN bug is outside the scope of this current issue and will require input from the TEFLoN developer.
- Thank you @zhjpeng and @tomaszjacek for reporting these issues.
It works,
Thank you,
tj
Unfortunately, git pull && mcclintock.py --install didn't help me.
Is there any way to verify that TEFLoN was updated, and/or a way to get the component version being used?
Hi @yuryfunikov ,
- Can you post the version of mcclintock you are using?
cd /path/to/mcclintock
git rev-parse HEAD
- Also can you describe the issues you are having in detail and provide examples of the error messages you are receiving?
- If they are different than what is described in: #76 (comment) and #76 (comment) then please post this information in a new issue.
- And to answer your question from: #76 (comment), the best way to be sure you are using the correct versions of McClintock and the component methods is to do a clean installation of the newest commit: 5849097 following the instructions in the README
Thanks!
Preston
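The version check described above can also be scripted. This is a minimal sketch that assumes git is on the PATH. Both function names are made up for illustration and the repository path is a placeholder.

```python
import subprocess


def current_commit(repo_dir: str) -> str:
    """Return the full commit hash that HEAD points to in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        check=True,           # raise CalledProcessError if git fails
        capture_output=True,  # needs Python 3.7 or newer
        text=True,
    )
    return result.stdout.strip()


def is_at_commit(full_hash: str, expected_prefix: str) -> bool:
    """True if the full hash starts with the short hash quoted in a comment."""
    return full_hash.lower().startswith(expected_prefix.lower())
```

For example, `is_at_commit(current_commit("/path/to/mcclintock"), "5849097")` would report whether the checkout is at the commit mentioned above.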
Hi and thanks for the answer,
this is what i got:
- i ran git pull && mcclintock.py --install
- git rev-parse HEAD
mcclintock$ git rev-parse HEAD
5849097de4f74b0b8b149cad138e31024082924c
- then i ran:
python3 ./../mcclintock/mcclintock.py -r dvir-all-chromosome-r.1.06.fasta -c asymmetric_TEs_v1.fasta -1 160JB_dna_seq_1_trimmed.fastq.gz -2 160JB_dna_seq_2_trimmed.fastq.gz -p 1 -m teflon -o mcclintock_out_assTEv1_160_refgen/ --resume --debug
that resulted in the following error:
RuleException:
CalledProcessError in line 49 of /path/to/file/mcclintock/snakefiles/teflon.snakefile:
Command 'source /opt/miniconda/envs/mcclintock/bin/activate '/path/to/file/mcclintock/install/envs/conda/cc1216b5'; set -euo pipefail; /opt/miniconda/envs/mcclintock/bin/python3.7 /path/to/file/mcclintock_out_assTEv1_160_refgen/snakemake/3370691/.snakemake/scripts/tmp6dm6acdf.teflon_run.py' returned non-zero exit status 1.
File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
File "/path/to/filemcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
File "/opt/miniconda/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: //path/to/file/mcclintock_out_assTEv1_160_refgen/snakemake/3370691/.snakemake/log/2021-03-15T001010.010823.snakemake.log
- then i checked teflon log
-rw-rw-r-- 1 sergey sergey 2425 Mar 15 00:16 ./mcclintock_out_assTEv1_160_refgen/logs/20210315.001008.3370691/teflon.log
./mcclintock_out_assTEv1_160_refgen/logs/20210315.001008.3370691/teflon.log:
writing TE bed files...
writing TE bed files completed!
reducing search space...
cmd: samtools view -@ 1 -L /path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/sample.bed_files/mega_complete.bed /path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/teflon.sorted.bam -b
Error running samtools: p.returncode = 1
and i must say that it looks like mega_complete.bed wasn't created at all:
/path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/sample.bed_files/mega_complete.bed: No such file or directory
also i should say that the pipeline used to work without problems, but then it started failing with this error from time to time, and now it fails every time we run the script
pls let me know if you think i should file a new ticket regarding this
Thanks @yuryfunikov, this looks like a problem similar to the one described in: #76 (comment). We have contacted the TEFLoN developer and I think that the bug has been fixed (see: jradrion/TEFLoN#8), but I am currently testing it and integrating the changes in mcclintock. I'll let you know when these changes have been integrated.
hi
sorry for bothering but have you had a chance to look into this?
@yuryfunikov Sorry for not replying earlier, but I have integrated the most recent update to TEFLoN into mcclintock. So I'd suggest re-installing the newest version of mcclintock: 40863ac and trying TEFLoN again on your sample to see if the issue is resolved