ValueError: invalid coordinates: start > stop
amdqiao1 opened this issue · 6 comments
Hi developers,
I got a pysam ValueError: invalid coordinates: start (248849282) > stop (138479)
when running SitesPerCell.py analysis. Is it an expected behavior for 10x multiome data? I do have sequences aligned to the reverse strand (FLAG=16) in my BAM file. However, the start and stop don't seem to belong to the same entry for 1) it would be too long a sequence to be captured and 2) there is no match after running samtools view bamfile | grep 248849282\t138479
. Can someone help me out on this? Are there any possible solutions? My environment is as following. Many thanks!!
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
_r-mutex 1.0.1 anacondar_1 conda-forge
about-time 3.1.1 pypi_0 pypi
bedtools 2.31.0 hf5e1c6e_2 bioconda
binutils_impl_linux-64 2.36.1 h193b22a_2 conda-forge
binutils_linux-64 2.36 hf3e587d_33 conda-forge
bwidget 1.9.14 ha770c72_1 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
ca-certificates 2023.05.30 h06a4308_0
cairo 1.16.0 h18b612c_1001 conda-forge
curl 7.68.0 hf8cf82a_0 conda-forge
datamash 1.1.0 0 bioconda
expat 2.5.0 h27087fc_0 conda-forge
fontconfig 2.14.1 hc2a2eb6_0 conda-forge
freetype 2.12.1 hca18f0e_0 conda-forge
fribidi 1.0.10 h36c2ea0_0 conda-forge
gcc_impl_linux-64 7.5.0 hda68d29_13 conda-forge
gcc_linux-64 7.5.0 h47867f9_33 conda-forge
gettext 0.21.1 h27087fc_0 conda-forge
gfortran_impl_linux-64 7.5.0 h56cb351_20 conda-forge
gfortran_linux-64 7.5.0 h78c8a43_33 conda-forge
glib 2.74.1 h6239696_1 conda-forge
glib-tools 2.74.1 h6239696_1 conda-forge
graphite2 1.3.14 h295c915_1
gsl 2.5 h294904e_1 conda-forge
gxx_impl_linux-64 7.5.0 h64c220c_13 conda-forge
gxx_linux-64 7.5.0 h555fc39_33 conda-forge
harfbuzz 2.8.1 h6f93f22_0
htslib 1.10.2 h78d89cc_0 bioconda
icu 58.2 hf484d3e_1000 conda-forge
jinja2 3.1.2 pypi_0 pypi
jpeg 9e h166bdaf_2 conda-forge
kernel-headers_linux-64 2.6.32 he073ed8_15 conda-forge
krb5 1.16.4 h2fd8d38_0 conda-forge
ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge
lerc 4.0.0 h27087fc_0 conda-forge
libblas 3.9.0 8_openblas conda-forge
libcblas 3.9.0 8_openblas conda-forge
libcurl 7.68.0 hda55be3_0 conda-forge
libdeflate 1.14 h166bdaf_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 7.5.0 h14aa051_20 conda-forge
libgfortran4 7.5.0 h14aa051_20 conda-forge
libglib 2.74.1 h606061b_1 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
liblapack 3.9.0 8_openblas conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.12 pthreads_hb3c22a3_1 conda-forge
libpng 1.6.38 h753d276_0 conda-forge
libssh2 1.10.0 haa6b8db_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libtiff 4.4.0 h55922b4_4 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libwebp-base 1.2.4 h166bdaf_0 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxml2 2.9.9 h13577e0_2 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
make 4.3 hd18ef5c_1 conda-forge
markupsafe 2.1.3 pypi_0 pypi
ncurses 6.2 h58526e2_4 conda-forge
numpy 1.21.6 pypi_0 pypi
numpy-groupies 0.9.14 pypi_0 pypi
openssl 1.1.1v h7f8727e_0
pandas 1.3.5 pypi_0 pypi
pango 1.40.14 he7ab937_1005 conda-forge
pcre 8.45 h9c3ff4c_0 conda-forge
pcre2 10.40 hc3806b6_0 conda-forge
pip 22.3.1 pyhd8ed1ab_0 conda-forge
pixman 0.38.0 h516909a_1003 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
pybedtools 0.8.1 pypi_0 pypi
pysam 0.16.0.1 pypi_0 pypi
python 3.7.12 hb7a2778_100_cpython conda-forge
python-dateutil 2.8.2 pypi_0 pypi
pytz 2023.3 pypi_0 pypi
r-base 3.6.1 h8900bf8_2 conda-forge
r-cpp11 0.2.7 r36hc72bb7e_0 conda-forge
r-ragg 0.4.0 r36h71971c5_0 conda-forge
r-systemfonts 1.0.2 r36hef9c87a_0 conda-forge
r-textshaping 0.1.2 r36ha19333d_0 conda-forge
r-vgam 1.1_1 r36ha65eedd_0 r
readline 8.1 h46c0cb4_0 conda-forge
rpy2 2.9.4 pypi_0 pypi
samtools 1.10 h2e538c0_3 bioconda
scipy 1.7.3 pypi_0 pypi
sed 4.8 he412f7d_0 conda-forge
setuptools 65.5.1 pyhd8ed1ab_0 conda-forge
six 1.16.0 pypi_0 pypi
sqlite 3.37.0 h9cd32fc_0 conda-forge
sysroot_linux-64 2.12 he073ed8_15 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
tktable 2.10 hb7b940f_3 conda-forge
wheel 0.38.4 pyhd8ed1ab_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.7.2 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
zstd 1.5.2 h6239696_4 conda-forge
Dear user,
Thanks for bringing up this issue. It is the first time in my life that I have seen such an error. Could you please check a couple of things?
- Do you find the coordinates individually
samtools view bamfile | grep 248849282
andsamtools view bamfile | grep 138479
- When working with pysam, we usually work with 0-based coordinates. So could you look for the coordinates 248849283 or 138480 ?
Thanks,
Fran
- Do you find the coordinates individually
samtools view bamfile | grep 248849282
andsamtools view bamfile | grep 138479
Yes! I found 248849282, but not 138479. 138479 only appeared as a substring of some greater coordinate.
VH00411:140:AAAYLMTHV:1:2406:24556:16903 1024 chr1 248849282 255 90M * 0 0 CCAGGGCAGATCAAGGGGCCTCTCAGAACCATGTTCCCCAGCCAGGTGAGGACCATTTTCACTGGGACCCAGGCCAAAACCATGTGGGTG CCCCCCCC;CCCCCCCCCCCCCCCCCCC-CCCCC-CCCCCCCCCCCCCCCCCC-CCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCC NH:i:1 HI:i:1 AS:i:88 nM:i:0 RG:Z:results:0:1:AAAYLMTHV:1 TX:Z:ENST00000306562,+2809,90M GX:Z:ENSG00000171161 GN:Z:ZNF672 fx:Z:ENSG00000171161 RE:A:E xf:i:17 CR:Z:TTGCAAGGTCATGCCC CY:Z:CCCCCCCCCC-CCCCC CB:Z:TTGCAAGGTCATGCCC-1UR:Z:ATGCATTCTATA UY:Z:CCCCCCCCCCCC UB:Z:ATGCATTCTATA
VH00411:140:AAAYLMTHV:2:1507:10562:34983 1024 chr1 248849282 255 90M * 0 0 CCAGGGCAGATCAAGGGGCCTCTCAGAACCATGTTCCCCAGCCAGGTGAGGACCATTTTCACTGGGACCCAGGCCAAAACCATGTGGGTG CCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC NH:i:1 HI:i:1 AS:i:88 nM:i:0 RG:Z:results:0:1:AAAYLMTHV:2 TX:Z:ENST00000306562,+2809,90M GX:Z:ENSG00000171161 GN:Z:ZNF672 fx:Z:ENSG00000171161 RE:A:E xf:i:17 CR:Z:GTAAAGCCAATTTGGT CY:Z:CCCCCCCCCCCCCCCC CB:Z:GTAAAGCCAATTTGGT-1UR:Z:TGCAAATAAACA UY:Z:CCCCCCCCC;CC UB:Z:TGCAAATAAACA
- When working with pysam, we usually work with 0-based coordinates. So could you look for the coordinates 248849283 or 138480?
Yes for 248849283, but not 138480. Similarly, 138480 only appeared as a substring.
VH00411:140:AAAYLMTHV:1:1603:33058:30666 1024 chr1 248849283 255 90M * 0 0 CAGGGCAGATCAAGGGGCCTCTCAGAACCATGTTCCCCAGCCAGGTGATGACCATTTTCACTGGGACCCAGGCCAAAACCATGTGGGTGC CCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC-CCCCCCCCCCCCCCCCCCCC NH:i:1 HI:i:1 AS:i:86 nM:i:1 RG:Z:results:0:1:AAAYLMTHV:1 TX:Z:ENST00000306562,+2810,90M GX:Z:ENSG00000171161 GN:Z:ZNF672 fx:Z:ENSG00000171161 RE:A:E xf:i:17 CR:Z:ATGTAACGTCGTTATC CY:Z:;CCCC;CCCCCCC;CC CB:Z:ATGTAACGTCGTTATC-1UR:Z:ACCACTTTTTTG UY:Z:CCCCCCCCCCCC UB:Z:ACCACTTTTTTG
Could you please provide some more info about the next points?
- Could you send the command you used to run SitesPerCell.py? As well as the complete error that you get.
- Could you check if the problematic coordinates are found in the input file you passed to the parameter
--infile
?
- Could you send the command you used to run SitesPerCell.py? As well as the complete error that you get.
The command is as suggested in the tutorial:
python $SCOMATIC/scripts/SitesPerCell/SitesPerCell.py --bam $bam \
--infile $infile/${sample}.calling.step1.tsv \
--ref $ref/genome.fa \
--out_folder $outfile --tmp_dir $temp --nprocs 1
The complete error:
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
Traceback (most recent call last):
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
main()
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
- Could you check if the problematic coordinates are found in the input file you passed to the parameter
--infile
?
Here are the matching results for problematic start and stop coordinates in ${sample}.calling.step1.tsv
. There are other matches of 138479 as substrings in this file.
chr1 248849282 248849282 C . . . TGTCC CAGGG . . . 0;39;1 0;29;1 . . DP|NC|CC|BC|BQ|BCf|BCr NA 39|29|0:29:0:0:0:0|0:39:0:0:0:0|0:1326:0:0:0:0|0:28:0:0:0:0|0:11:0:0:0:0 NA NA NA NA NA NA NA NA NA NA
KI270733.1 138479 138479 C . . . CCTCC TCCCT . . . . 0;7;1 0;5;1 . . DP|NC|CC|BC|BQ|BCf|BCr NA 7|5|0:5:0:0:0:0|0:7:0:0:0:0|0:238:0:0:0:0|0:1:0:0:0:0|0:6:0:0:0:0 NA NA NA NA NA NA NA NA NA NA
I have also faced this issue, strangely enough, I have been able to avoid this by increasing the number of cell types in my metadata.
In my first approach I had only 3 groups, immune, stroma and tumor cells, now I have subdivided the barcodes metadata to the actual cell types in the immune and stroma population as well as subdivide the tumor cells by custom-defined transcriptional programs and the script seems to work perfectly fine. In some of the cell types (extremely small number of cells) it doesn't give results (no barcode or sites) but I suppose I can work without those cell types.
It's interesting that it only fails with a low number of cell types. Actually, it is a complaint from the pysam package. I will take a deeper look at this error.
Very low-represented cell types are expected not to provide results with SComatic. That occurs cause they need to reach the minimum required coverage per site to be considered as callable (5 cells per site by default).
Thanks,
Fran