Difficulty with using wrapper for viral gene counts
KevinMaroney opened this issue · 1 comment
Hi again,
So I followed the links you suggested to create the single-virus indices. I made two FASTA files: one with the complete genome (HPV16_genome.fasta), containing a single entry >HPV16_Complete_Genome, and the other with the annotated genes (HPV16_transcripts.fasta), with entries >E1^E4 (sequence), >E1 (sequence), etc. Your tutorial said to just run ./createindex_singlevirus.cwl createindex_singlevirus.job.yaml, so I made that YAML file as suggested, with a slot for both the genome and the transcripts:
dir_name_STAR: STAR_index_NC_001526.4_HPV16
genomeFastaFiles:
  class: File
  path: "./HPV16_transcripts.fasta"
genomeSAindexNbases: 3
index_salmon: salmon_index_NC_001526.4_HPV16
runThreadN: 40
transcripts:
  class: File
  path: "./HPV16_genome.fasta"
This was successful and generated both a STAR index for the genome FASTA file and a salmon index, I assume for the transcripts (STAR_index_NC... and salmon_index_NC...).
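Note that in the job file as pasted, genomeFastaFiles points at HPV16_transcripts.fasta and transcripts points at HPV16_genome.fasta, which looks reversed relative to the description of the two files, so it may be worth confirming which file went into which slot. A minimal sketch of such a check, assuming the genome FASTA holds a single record and the transcripts FASTA holds several (the helper name `fasta_headers` and the placeholder sequences are made up for illustration):

```python
import io


def fasta_headers(handle):
    """Return the record names (text after '>') found in a FASTA stream."""
    return [line[1:].strip() for line in handle if line.startswith(">")]


# Placeholder records: the names come from the post, the sequences are made up.
genome_fasta = io.StringIO(">HPV16_Complete_Genome\nACGTACGT\n")
transcripts_fasta = io.StringIO(">E1^E4\nACGT\n>E1\nTTGA\n")

genome_names = fasta_headers(genome_fasta)
transcript_names = fasta_headers(transcripts_fasta)

# A genome FASTA is expected to have one record, a transcripts FASTA several.
print(len(genome_names), genome_names)
print(len(transcript_names), transcript_names)
```

With real files, the same helper can be called on an opened file, e.g. `with open("HPV16_genome.fasta") as fh: fasta_headers(fh)`.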
The issue I'm running into is with the inputs for VIRTUS_wrapper.py: I do not see an argument for salmon_index_virus, but I do see one for salmon_index_human.
Do I need to put the genome and transcripts in the same FASTA file, rather than in separate files as the example .yaml suggests, to be able to generate a figure/comparisons as you did?
I tried to use the following command:
DIR_INDEX_ROOT=~/programs/VIRTUS/workflow
~/programs/VIRTUS/wrapper/VIRTUS_wrapper.py input.fastq.csv \
--fastq \
--VIRTUSDir ~/programs/VIRTUS2/workflow/ \
-s1 _R1_1.fastq.gz \
-s2 _R2_2.fastq.gz \
--genomeDir_human ~/programs/VIRTUS/workflow/STAR_index_human \
--genomeDir_virus ~/programs/VIRTUS/workflow/STAR_index_NC_001526.4_HPV16 \
--salmon_index_virus ~/programs/VIRTUS/workflow/salmon_index_NC_001526.4_HPV16 \
--salmon_quantdir_virus salmon_quantdir_NC_001526.4_HPV16
--nthreads=40
But got the following error:
~/programs/VIRTUS/wrapper/VIRTUS_wrapper.py input.fastq.csv \
> --fastq \
> --VIRTUSDir ~/programs/VIRTUS2/workflow/ \
> -s1 _R1_1.fastq.gz \
> -s2 _R2_2.fastq.gz \
> --genomeDir_human ~/programs/VIRTUS/workflow/STAR_index_human \
> --genomeDir_virus ~/programs/VIRTUS/workflow/STAR_index_NC_001526.4_HPV16 \
> --salmon_index_virus ~/programs/VIRTUS/workflow/salmon_index_NC_001526.4_HPV16 \
> --salmon_quantdir_virus salmon_quantdir_NC_001526.4_HPV16
usage: VIRTUS_wrapper.py [-h] [--VIRTUSDir VIRTUSDIR] --genomeDir_human GENOMEDIR_HUMAN --genomeDir_virus
GENOMEDIR_VIRUS --salmon_index_human SALMON_INDEX_HUMAN
[--salmon_quantdir_human SALMON_QUANTDIR_HUMAN]
[--outFileNamePrefix_human OUTFILENAMEPREFIX_HUMAN] [--nthreads NTHREADS]
[--hit_cutoff HIT_CUTOFF] [-s SUFFIX_SE] [-s1 SUFFIX_PE_1] [-s2 SUFFIX_PE_2] [--fastq]
input_path
VIRTUS_wrapper.py: error: the following arguments are required: --salmon_index_human
I haven't even generated a salmon_index_human, as I was using the human STAR_index I previously made for VIRTUS2. Can you point me in the right direction? It also seems that the wrapper didn't raise an error for --salmon_index_virus or --salmon_quantdir_virus, despite those not being listed as supported arguments. Thank you for your help.
Hi, unfortunately, the wrapper does not support single-virus mode.
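As for why --salmon_index_virus and --salmon_quantdir_virus were not flagged: the usage message has the format produced by Python's argparse, and argparse reports missing required arguments before it reports unrecognized ones. The sketch below reproduces that ordering with a small stand-in parser. It assumes the wrapper uses argparse with default settings; the parser here is not the wrapper's actual code, only three of its required options are mirrored, and the `RaisingParser` and `first_error` names are made up for illustration.

```python
import argparse


class RaisingParser(argparse.ArgumentParser):
    """Stand-in parser that raises instead of exiting, so the message can be inspected."""

    def error(self, message):
        raise ValueError(message)


def build_parser():
    # Hypothetical subset of the options listed in the usage message above.
    parser = RaisingParser(prog="VIRTUS_wrapper.py")
    parser.add_argument("input_path")
    parser.add_argument("--genomeDir_human", required=True)
    parser.add_argument("--genomeDir_virus", required=True)
    parser.add_argument("--salmon_index_human", required=True)
    return parser


def first_error(argv):
    """Return the first error argparse reports for argv, or None if parsing succeeds."""
    try:
        build_parser().parse_args(argv)
    except ValueError as exc:
        return str(exc)
    return None


argv = [
    "input.fastq.csv",
    "--genomeDir_human", "STAR_index_human",
    "--genomeDir_virus", "STAR_index_virus",
    "--salmon_index_virus", "salmon_index_virus",  # not defined on the parser
]

# Required option missing: the "required" error is reported first.
msg_missing = first_error(argv)
# Required option supplied: only now is the unknown option reported.
msg_complete = first_error(argv + ["--salmon_index_human", "salmon_index_human"])

print(msg_missing)
print(msg_complete)
```

Under that assumption, supplying --salmon_index_human would most likely have produced an "unrecognized arguments" error for the two virus options next, which is consistent with the wrapper not supporting single-virus mode.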