WGLab/VirTect

Syntax errors in script

Hey,
I was asked to check for viral reads in our bulk RNA-seq data, but when I ran your script following your detailed tutorial, I kept getting syntax errors on the "print" statements throughout the script. I'm not sure whether this has always been the case, but I'm letting you know in case others are running into the same issue and haven't paid attention to the error messages.

I was able to fix the following in VirTect.py:

  1. replace all instances of [print 'Running '] with [print("Running")]
  2. On line 194, replace [print ''] with [print('\t')]
  3. On line 195, replace [print line.strip()] with [print(line.strip())]
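
For anyone applying the same fixes by hand, here is a minimal sketch of the rewrite in Python. These errors look like what happens when Python 2 style print statements are run under Python 3, though that is an assumption about the cause here. `fix_print` is a hypothetical helper, not part of VirTect, and it only handles simple one-line statements (no `print >>f, ...` redirection, no trailing-comma form), so check each changed line before trusting it.

```python
import re

# A Python 2 print statement: indentation, "print", whitespace, then an
# argument that does not already start with "(".
_PRINT_STMT = re.compile(r"^(\s*)print\s+(?!\s*\()(.+?)\s*$")


def fix_print(line):
    """Rewrite a simple one-line print statement as a print() call.

    `line` is expected without its trailing newline. Lines that do not
    look like a bare print statement are returned unchanged.
    """
    match = _PRINT_STMT.match(line)
    if match is None:
        return line
    indent, args = match.groups()
    return f"{indent}print({args})"


if __name__ == "__main__":
    for old in ["print line.strip()", "    print 'Running ', cmd", "print('ok')"]:
        print(repr(old), "->", repr(fix_print(old)))
```

Lines that already call `print(...)` are left alone, so running the helper twice over the same file should not change it further.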

I then got the error "no GTF file!", so I assumed one had not been downloaded. However, according to your documentation the script expects "gencode.v25.chr_patch_hapl_scaff.annotation.gtf", whereas what I got after downloading/indexing with your code was "gencode.v29.annotation.gtf.gz". When I tried running it anyway, the first error I ran into was as follows, which suggests that something is wrong with the GTF file downloaded/indexed by your code:

[2023-06-29 14:22:13] Building transcriptome data files /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/tmp/gencode.v29.annotation.gtf
[FAILED]
Error: gtf_to_fasta returned an error.
Running samtools sort -n /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped.bam -o /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted.bam
[E::hts_open_format] Failed to open file "/data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped.bam" : No such file or directory
samtools sort: can't open "/data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped.bam": No such file or directory
Running bedtools bamtofastq -i /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted.bam -fq /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted_1.fq -fq2 /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted_2.fq
[E::hts_open_format_impl] Failed to open file /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted.bam
Failed to open BAM file /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted.bam
Running bwa mem /home/kmaroney/programs/VirTect/viruses_reference/viruses_759.fasta /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted_1.fq /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted_2.fq > /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_aln.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem /home/kmaroney/programs/VirTect/viruses_reference/viruses_759.fasta /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted_1.fq /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_sorted_2.fq
[main] Real time: 0.138 sec; CPU: 0.011 sec
Running samtools view -Sb -h /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_aln.sam > /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_aln.bam
Running samtools view /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_aln.bam | cut -f3 | sort | uniq -c | awk '{if ($1>=400) print $0}' > /data/user/kmaroney/Projects/Shreshtha_Lab/Anal_Cancer_1/Virtect/unmapped_viruses_count.txt
awk: cmd. line:1: { if ($2!=(ploc+1)) {if (ploc!=0){printf("%s %d-%d
awk: cmd. line:1: ^ unterminated string
awk: cmd. line:1: { if ($2!=(ploc+1)) {if (ploc!=0){printf("%s %d-%d
awk: cmd. line:1: ^ syntax error
The continous length
----------------------------------------Note: There is no real virus in the sample :)----------------------------

However, when I used a GTF file and genome that I had previously indexed, it seems to be working (it is on the preparing-reads step rather than simply failing quickly). So I think the code itself is fine, apart from a couple of syntax errors and a problem with the human reference. I just wanted to put this here to be helpful; anyone having the same issues should be able to solve them this way. Looking forward to getting my viral reads :)
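
For anyone who wants to rule out the GTF file before launching the whole pipeline, below is a minimal sketch of a pre-flight check in Python. It assumes the problem is that only a gzipped annotation (such as "gencode.v29.annotation.gtf.gz") is present where an uncompressed ".gtf" is expected. That is a guess based on the file names above, not a confirmed diagnosis. `ensure_plain_gtf` is a hypothetical helper, not part of VirTect.

```python
import gzip
import shutil
from pathlib import Path


def ensure_plain_gtf(gtf_path):
    """Return the path to an uncompressed GTF file.

    Accepts either "name.gtf" or "name.gtf.gz". If the plain file is
    missing but the .gz copy exists, the .gz copy is decompressed next to it.
    """
    gtf = Path(gtf_path)
    if gtf.suffix == ".gz":
        gz, gtf = gtf, gtf.with_suffix("")
    else:
        gz = gtf.with_name(gtf.name + ".gz")

    if not gtf.exists():
        if not gz.exists():
            raise FileNotFoundError(f"neither {gtf} nor {gz} exists")
        # Stream the decompression so a large annotation is not held in memory.
        with gzip.open(gz, "rb") as src, open(gtf, "wb") as dst:
            shutil.copyfileobj(src, dst)

    if gtf.stat().st_size == 0:
        raise ValueError(f"{gtf} is empty")
    return gtf
```

Calling it once on the annotation path before starting the run makes a missing or empty GTF fail immediately, instead of surfacing later as the cascade of "No such file or directory" errors in the log above.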