ngs_tools.gtf.Segment.SegmentError: Invalid segment
zhewa commented
Hi,
I am trying to run nf-core/scrnaseq with the kallisto aligner. The error below occurred during the reference index generation step. It seems to be caused by a zero-length segment. I am using the GRCh38.p14 FASTA and GTF files from NCBI with ERCC transcripts appended. Do you know how to fix this?
Thank you
Workflow execution completed unsuccessfully
Caused by:
Missing output file(s) `kb_ref_out.idx` expected by process `NFCORE_SCRNASEQ:SCRNASEQ:KALLISTO_BUSTOOLS:KALLISTOBUSTOOLS_REF (GCF_000001405.40_GRCh38.p14_genomic_ERCC92.fna.gz)`
Command executed:
kb \
ref \
-i kb_ref_out.idx \
-g t2g.txt \
-f1 cdna.fa \
--workflow standard \
GCF_000001405.40_GRCh38.p14_genomic_ERCC92.fna.gz \
GCF_000001405.40_GRCh38.p14_genomic_ERCC92.gtf.gz
cat <<-END_VERSIONS > versions.yml
"NFCORE_SCRNASEQ:SCRNASEQ:KALLISTO_BUSTOOLS:KALLISTOBUSTOOLS_REF":
kallistobustools: $(echo $(kb --version 2>&1) | sed 's/^.*kb_python //;s/positional arguments.*$//')
END_VERSIONS
Command exit status:
0
Command output:
(empty)
Command error:
[2022-06-28 15:54:36,355] WARNING [main] Gene `RNU6-222P_21` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_21`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `KIR2DP1_29` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_29`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `KIR3DP1_32` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_32`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `RNU6-222P_22` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_22`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `KIR3DP1_33` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_33`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `KIR2DP1_30` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_30`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `KIR2DP1_31` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_31`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `KIR3DP1_34` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_34`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `RNU6-222P_23` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_23`.
[2022-06-28 15:54:36,355] WARNING [main] Gene `KIR2DP1_32` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_32`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `KIR3DP1_35` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_35`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `RNU6-222P_24` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_24`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `KIR2DP1_33` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_33`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `KIR3DP1_36` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_36`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `RNU6-222P_25` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_25`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `KIR3DP1_37` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_37`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `RNU6-222P_26` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_26`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `KIR2DP1_34` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_34`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `KIR3DP1_38` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_38`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `RNU6-222P_27` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_27`.
[2022-06-28 15:54:36,356] WARNING [main] Gene `RNU6-222P_28` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_28`.
[2022-06-28 15:54:36,357] WARNING [main] Gene `KIR3DP1_39` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_39`.
[2022-06-28 15:54:36,357] WARNING [main] Gene `KIR2DP1_35` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_35`.
[2022-06-28 15:54:36,357] WARNING [main] Gene `RNU6-222P_29` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_29`.
[2022-06-28 15:54:36,357] WARNING [main] Gene `KIR3DP1_40` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_40`.
[2022-06-28 15:54:36,357] WARNING [main] Gene `KIR2DP1_36` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_36`.
[2022-06-28 15:54:36,357] WARNING [main] Gene `RNU6-222P_30` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_30`.
[2022-06-28 15:54:36,357] WARNING [main] Gene `KIR3DP1_41` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_41`.
[2022-06-28 15:54:36,357] WARNING [main] Gene `KIR2DP1_37` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_37`.
[2022-06-28 15:54:36,358] WARNING [main] Gene `RNU6-222P_31` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `RNU6-222P_31`.
[2022-06-28 15:54:36,358] WARNING [main] Gene `KIR3DP1_42` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_42`.
[2022-06-28 15:54:36,358] WARNING [main] Gene `KIR2DP1_38` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR2DP1_38`.
[2022-06-28 15:54:36,358] WARNING [main] Gene `KIR3DP1_43` has no transcripts. The entire gene will be marked as a transcript and an exon with ID `KIR3DP1_43`.
[2022-06-28 15:54:41,332] ERROR [main] An exception occurred
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/kb_python/main.py", line 856, in main
    COMMAND_TO_FUNCTION[args.command](parser, args, temp_dir=temp_dir)
  File "/usr/local/lib/python3.9/site-packages/kb_python/main.py", line 168, in parse_ref
    ref(
  File "/usr/local/lib/python3.9/site-packages/ngs_tools/logging.py", line 62, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/kb_python/ref.py", line 393, in ref
    gene_infos, transcript_infos = ngs.gtf.genes_and_transcripts_from_gtf(
  File "/usr/local/lib/python3.9/site-packages/ngs_tools/gtf/__init__.py", line 190, in genes_and_transcripts_from_gtf
    introns = exons.invert(transcript_interval)
  File "/usr/local/lib/python3.9/site-packages/ngs_tools/gtf/SegmentCollection.py", line 108, in invert
    Segment(self._segments[i].end, self._segments[i + 1].start)
  File "/usr/local/lib/python3.9/site-packages/ngs_tools/gtf/Segment.py", line 27, in __init__
    raise SegmentError(f'Invalid segment [{start}:{end})')
ngs_tools.gtf.Segment.SegmentError: Invalid segment [1095094:1095094)
Work dir:
s3://***
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
Lioscro commented
Hi, @zhewa,
Zero-length segments have been supported since version 1.5.13.
Could you try updating the package?
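For anyone reading the traceback later, here is a rough sketch of how a zero-length segment can come up. It is not the actual `ngs_tools` code: `make_segment`, `gaps_between`, and the `allow_empty` flag are hypothetical names, and the exon coordinates are made up apart from the 1095094 boundary taken from the error message. The idea is that when two exons touch with no gap, the "intron" between them has equal start and end, which a strict check rejects.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end)


def make_segment(start: int, end: int, allow_empty: bool) -> Interval:
    """Hypothetical validator: reject reversed intervals, and empty ones
    unless allow_empty is set."""
    if start > end or (start == end and not allow_empty):
        raise ValueError(f"Invalid segment [{start}:{end})")
    return (start, end)


def gaps_between(exons: List[Interval], allow_empty: bool) -> List[Interval]:
    """Return the interval between each pair of consecutive exons."""
    ordered = sorted(exons)
    return [
        make_segment(left[1], right[0], allow_empty)
        for left, right in zip(ordered, ordered[1:])
    ]


# Two illustrative exons that touch at position 1095094 (no gap between them).
EXONS = [(1095000, 1095094), (1095094, 1095200)]

if __name__ == "__main__":
    print(gaps_between(EXONS, allow_empty=True))
    try:
        gaps_between(EXONS, allow_empty=False)
    except ValueError as err:
        print(err)
```

With the strict setting the sketch raises on the same `[1095094:1095094)` interval reported above, while the permissive setting simply returns an empty gap.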
zhewa commented
Hi,
Yes, after updating the package it ran successfully. Thank you.
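In case it helps others check their environment, below is a minimal sketch for comparing an installed version against 1.5.13. It assumes the distribution is named `ngs-tools` (inferred from the `ngs_tools` module path in the traceback, not stated in this thread), and the helper names are hypothetical. It only handles plain numeric versions such as `1.5.13`.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.5.13' into (1, 5, 13); only plain numeric parts are handled."""
    return tuple(int(part) for part in text.split(".")[:3])


def supports_zero_length(installed: str, minimum: str = "1.5.13") -> bool:
    """Compare numerically, so that 1.5.9 sorts before 1.5.13."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(dist: str = "ngs-tools") -> Optional[str]:
    """Return the installed version of dist, or None if it is not installed.
    The default distribution name is an assumption."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("package not found in this environment")
    else:
        print(found, "ok" if supports_zero_length(found) else "needs update")
```

Inside the nf-core container the version is pinned by the image, so the check would need to run in that container rather than on the host.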