friend1ws/nanomonsv

IndexError: list index out of range

Hello,

When running the "nanomonsv get" command I get the following error:

Traceback (most recent call last):
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/bin/nanomonsv", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/nanomonsv/__init__.py", line 13, in main
    args.func(args)
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/nanomonsv/run.py", line 290, in get_main
    gather_support_read_seq(args.tumor_prefix + ".rearrangement.sorted.clustered.bedpe",
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/nanomonsv/gather_support_read_seq.py", line 98, in gather_support_read_seq
    set_readid2alignment(readid2alignment, rearrangement_file, 'r', alignment_margin)
  File "/home/rranallo-benavidez/mambaforge/envs/snakemake/lib/python3.11/site-packages/nanomonsv/gather_support_read_seq.py", line 27, in set_readid2alignment
    tinfo1 = info1[i].split(',')
             ~~~~~^^^
IndexError: list index out of range

I looked into the issue a bit and have a theory about the cause. I am running on duplex reads produced by Oxford Nanopore sequencing (the error above did not occur when running on simplex reads). The read IDs of duplex reads contain a semicolon (;) character. For example, "1ab6068c-8bac-4159-aff7-e3f309edd11f;2d547674-d657-4bf6-a40e-af6f8f695e96" is a duplex read ID. In gather_support_read_seq.py I see several lines that call ".split(';')". Maybe the extra semicolon in the read ID is throwing the indices off? Thanks for taking a look at this.
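
To illustrate the theory, here is a minimal sketch of how a semicolon inside a read ID could produce this kind of IndexError. It is not the actual nanomonsv code: the function `parse_support`, the two-field layout (read IDs joined by ';' in one field, comma-separated info entries joined by ';' in another), and the sample info values are all assumptions made for the example.

```python
def parse_support(read_ids_field, info_field):
    """Hypothetical parser (not nanomonsv's): pair each read ID with its info entry.

    Assumes both fields use ';' to separate per-read entries.
    """
    read_ids = read_ids_field.split(';')
    infos = info_field.split(';')
    pairs = []
    for i in range(len(read_ids)):
        # If a read ID itself contains ';', read_ids has more elements
        # than infos, and infos[i] eventually runs past the end.
        pairs.append((read_ids[i], infos[i].split(',')))
    return pairs


# Two simplex reads: two IDs, two info entries, everything lines up.
print(parse_support("read-a;read-b", "100,200,+;300,400,-"))

# One duplex read: the single ID splits into two pieces, but there is
# only one info entry, so the second lookup fails.
duplex_id = "1ab6068c-8bac-4159-aff7-e3f309edd11f;2d547674-d657-4bf6-a40e-af6f8f695e96"
try:
    parse_support(duplex_id, "100,200,+")
except IndexError as err:
    print("IndexError:", err)
```

If something like this is what happens, the fix would probably need to be in how the read IDs are joined or escaped when the intermediate files are written, not only in how they are read back.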

Thank you very much! That is really good to know. Do you know of any publicly available duplex read data? If so, I will try to fix nanomonsv so that it can handle duplex reads.

Yes, it appears that there is some duplex data here: https://humanpangenome.org/data/.

Great!! I will check soon.

Hi, we have released v0.7.1 (sorry, but for some reason we cannot build the conda package...), which I believe can handle duplex data. I would be happy if you could try it.

Hi @tbenavi1,

Did the update solve your problem?

I am getting the same issue when running on my data (but not getting the same error with the test data):
03/15/2024 14:36:41 - nanomonsv.run - INFO - Clustering rearrangement type supporting reads for putative SVs
03/15/2024 14:36:46 - nanomonsv.run - INFO - Clustering insertion type supporting reads for putative SVs
03/15/2024 14:59:17 - nanomonsv.run - INFO - Clustering deletion type supporting reads for putative SVs
03/15/2024 15:10:42 - nanomonsv.run - INFO - Gathering sequences of supporting reads
Traceback (most recent call last):
  File "/exports/igmm/eddie/semple-lab/eesiribloom/conda/envs/nanomonsv/bin/nanomonsv", line 8, in <module>
    sys.exit(main())
  File "/exports/igmm/eddie/semple-lab/eesiribloom/conda/envs/nanomonsv/lib/python3.10/site-packages/nanomonsv/__init__.py", line 13, in main
    args.func(args)
  File "/exports/igmm/eddie/semple-lab/eesiribloom/conda/envs/nanomonsv/lib/python3.10/site-packages/nanomonsv/run.py", line 315, in get_main
    gather_support_read_seq(args.tumor_prefix + ".rearrangement.sorted.clustered.bedpe",
  File "/exports/igmm/eddie/semple-lab/eesiribloom/conda/envs/nanomonsv/lib/python3.10/site-packages/nanomonsv/gather_support_read_seq.py", line 104, in gather_support_read_seq
    set_readid2alignment(readid2alignment, insertion_file, 'i', alignment_margin)
  File "/exports/igmm/eddie/semple-lab/eesiribloom/conda/envs/nanomonsv/lib/python3.10/site-packages/nanomonsv/gather_support_read_seq.py", line 20, in set_readid2alignment
    key = f"{F[0]},{F[1]},{F[2]},{F[8]},{F[3]},{F[4]},{F[5]},{F[9]},{mode},{cid}"
IndexError: list index out of range

This is with nanomonsv v0.7.1.

My command was:
nanomonsv get \
  ./COLO829_nanomonsv \
  ../../PAO32033/PAO32033.bam \
  ../../../GCA_000001405.15_GRCh38_no_alt_analysis_set.fa \
  --control_prefix ../../../COLO829BL/COLO829BL_nanomonsv \
  --control_bam ../../../COLO829BL/PAO33946/PAO33946.bam \
  --min_indel_size 30 \
  --use_racon \
  --single_bnd
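
The traceback above fails at F[9], which suggests that at least one line of the insertion file has fewer than ten fields. Below is a rough sketch of a check that could help locate such lines. It assumes the file is plain text with tab-separated columns, which I have not confirmed against the nanomonsv source; `find_short_lines` is a hypothetical helper, not part of nanomonsv, and the sample data is made up.

```python
import io


def find_short_lines(handle, min_fields=10, sep="\t"):
    """Return (line_number, field_count) for lines with fewer than min_fields columns.

    Hypothetical diagnostic helper; the tab separator is an assumption.
    """
    short = []
    for lineno, line in enumerate(handle, start=1):
        fields = line.rstrip("\n").split(sep)
        if len(fields) < min_fields:
            short.append((lineno, len(fields)))
    return short


# Made-up sample: the first line has 10 columns, the second only 7.
sample = io.StringIO(
    "\t".join(str(i) for i in range(10)) + "\n"
    + "\t".join(str(i) for i in range(7)) + "\n"
)
print(find_short_lines(sample))  # [(2, 7)]
```

To run it on a real file, open the file in text mode and pass the handle in place of `sample`. Looking at the reported lines may show whether the short lines relate to read IDs, as in the original report, or to something else.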