ConesaLab/SQANTI3

Issue running SQANTI3 v5.2: reading STAR junction/coverage file

sentisci opened this issue · 4 comments

Hi,

I am getting the error below while running SQANTI3. It occurs while reading the STAR coverage/junction file, and it happens even when I run with short-read raw FASTQ files. Can you please help?

**** Parsing Isoforms....
Input pattern: /data/CCRSB/apps/IsoSeq-PacBio/STAR_alignment/4447Org_001_P7SJ.out.tab.
The following files found and to be read as junctions:
/data/CCRSB/apps/IsoSeq-PacBio/STAR_alignment/4447Org_001_P7SJ.out.tab
408049 junctions read. 1803 junctions added to both strands because no strand information from STAR.
Process Process-1:
Traceback (most recent call last):
File "/data/CCRSB/apps/pipelineSnakes/pipeline-SB/conda/envs/SQANTI3.env/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/data/CCRSB/apps/pipelineSnakes/pipeline-SB/conda/envs/SQANTI3.env/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/data/CCRSB/apps/IsoSeq-PacBio/SQANTI3-5.2/sqanti3_qc.py", line 1888, in run
isoforms_info, ratio_TSS_dict = isoformClassification(args, isoforms_by_chr, refs_1exon_by_chr, refs_exons_by_chr, junctions_by_chr, junctions_by_gene, start_ends_by_gene, genome_dict, indelsJunc, orfDict, corrGTF)
File "/data/CCRSB/apps/IsoSeq-PacBio/SQANTI3-5.2/sqanti3_qc.py", line 1549, in isoformClassification
inside_bed, outside_bed = get_TSS_bed(corrGTF, chr_order)
File "/vf/users/CCRSB/apps/IsoSeq-PacBio/SQANTI3-5.2/utilities/short_reads.py", line 122, in get_TSS_bed
for rec in BCBio_GFF.parse(in_handle, limit_info=limit_info, target_lines=1):
File "/data/CCRSB/apps/pipelineSnakes/pipeline-SB/conda/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 793, in parse
for rec in parser.parse_in_parts(gff_files, base_dict, limit_info,
File "/data/CCRSB/apps/pipelineSnakes/pipeline-SB/conda/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 337, in parse_in_parts
cur_dict = self._results_to_features(cur_dict, results)
File "/data/CCRSB/apps/pipelineSnakes/pipeline-SB/conda/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 376, in _results_to_features
base = self._add_parent_child_features(base, results.get('parent', []),
File "/data/CCRSB/apps/pipelineSnakes/pipeline-SB/conda/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 448, in _add_parent_child_features
child_feature = self._get_feature(child_dict)
File "/data/CCRSB/apps/pipelineSnakes/pipeline-SB/conda/envs/SQANTI3.env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 591, in _get_feature
new_feature = SeqFeature.SeqFeature(location, feature_dict['type'],
TypeError: SeqFeature.__init__() got an unexpected keyword argument 'strand'
**** Parsing provided files....
Reading genome fasta /data/CCRSB/apps/IsoSeq-PacBio/SQANTI3-5.2/data/GRCh38.p14.genome.fa....
Skipping aligning of sequences because GTF file was provided.

Hi @sentisci would you mind sharing the full sqanti command you run, and the full error message?

Command
slurm-21213640.out.txt

python ${basePath}/SQANTI3-5.2/sqanti3_qc.py \
-d ${basePath}/Squanti_4447_kallisto_orf_Junctions_all/ \
-o Squanti_4447_kallisto_orf_Junctions_all \
--CAGE_peak ${basePath}/SQANTI3-5.2/data/ref_TSS_annotation/human.refTSS_v3.1.hg38.bed \
--polyA_motif_list ${basePath}/SQANTI3-5.2/data/polyA_motifs/mouse_and_human.polyA_motif.txt \
--polyA_peak /${basePath}/SQANTI3-5.2/data/atlas.clusters.2.0.GRCh38.96.bed \
-n 5 -t 30 --saturation --report both --isoAnnotLite \
--gff3 ${basePath}/SQANTI3-5.2/data/tappAS_Homo_sapiens_GRCh38_Ensembl_86.gff3 \
-fl ${basePath}/Squanti_4447_kallisto_orf_Junctions_all/all_samples.chained_count.txt \
--expression ${basePath}/Squanti_4447_kallisto_orf_Junctions_all/4447_001_P7_kallisto/abundance.tsv \
--coverage ${basePath}/STAR_alignment/4447Org_001_P7SJ.out.tab \
--SR_bam ${basePath}/STAR_alignment/4447Org_001_P7_bam.fofn \
${basePath}/Squanti_4447_kallisto_orf_Junctions_all/all_samples.chained.sorted.gff \
${basePath}/SQANTI3-5.2/data/gencode.v45.primary_assembly.annotation.gtf \
${basePath}/SQANTI3-5.2/data/GRCh38.p14.genome.fa

I would appreciate it if you could look at it at your earliest convenience. Thank you.

Hi @sentisci, it seems your error is caused by an incompatible version of Biopython. This was discussed in a past issue (#247) and solved with SamGallaher's suggestion to install biopython<=1.81.

To avoid further package compatibility problems, we suggest you install the SQANTI3.env conda environment as recommended in the SQANTI documentation.

Best,
Carolina.
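
For anyone hitting the same traceback, the version constraint suggested above (biopython<=1.81) can be checked from inside the active environment before re-running SQANTI3. The following is only a minimal sketch: the helper names (`parse_version`, `is_supported`, `installed_biopython_version`) are illustrative and not part of SQANTI3 or Biopython, and the comparison only looks at the major and minor version numbers.

```python
"""Check whether the installed Biopython satisfies the suggested biopython<=1.81 pin."""
import re
from importlib import metadata

# Upper bound taken from the suggestion in this thread (biopython<=1.81).
MAX_SUPPORTED = (1, 81)


def parse_version(version):
    """Turn a version string such as '1.81' into a tuple of ints, e.g. (1, 81)."""
    # Drop any local-version suffix ('+...') and keep only major and minor numbers.
    parts = re.findall(r"\d+", version.split("+")[0])
    return tuple(int(p) for p in parts[:2])


def is_supported(version, max_supported=MAX_SUPPORTED):
    """Return True if `version` is at or below the suggested upper bound."""
    return parse_version(version) <= max_supported


def installed_biopython_version():
    """Return the installed Biopython version string, or None if it is not installed."""
    try:
        return metadata.version("biopython")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_biopython_version()
    if found is None:
        print("Biopython is not installed in this environment.")
    elif is_supported(found):
        print(f"Biopython {found} satisfies biopython<=1.81.")
    else:
        print(f"Biopython {found} is newer than 1.81; consider downgrading.")
```

Run it with the same Python interpreter that launches `sqanti3_qc.py` (here, the one inside the SQANTI3.env conda environment), since a different interpreter may see a different Biopython installation.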