IndexError: list index out of range
I'm running into an "index out of range" error with the following command:
$ maast end_to_end --min-prev 0.9 --out-dir test_out --in-dir a_few_asms/
[Warning] Total number of genomes (9) < min. number of genomes required for effective SNP calling with MAF 0.01 (100)
[Warning] Skip tag genome selection, all genomes will be used
reference genome path: a_few_asms/DRR090820_contigs_skesa.fasta
[building mash sketch]: start
[calculating mash distance]: start
[clustering] start
[clustering] done
a_few_asms/DRR090793_contigs_skesa.fasta
Running mummer4; start
reference genome path: a_few_asms/DRR090793_contigs_skesa.fasta
[paired alignment]: start
[paired alignment]: done
DRR090793_contigs_skesa.fasta - DRR090809_contigs_skesa.fasta
DRR090793_contigs_skesa.fasta - DRR090807_contigs_skesa.fasta
DRR090793_contigs_skesa.fasta - DRR090793_contigs_skesa.fasta
DRR090793_contigs_skesa.fasta - DRR090820_contigs_skesa.fasta
DRR090793_contigs_skesa.fasta - DRR090805_contigs_skesa.fasta
DRR090793_contigs_skesa.fasta - DRR090795_contigs_skesa.fasta
DRR090793_contigs_skesa.fasta - DRR090797_contigs_skesa.fasta
DRR090793_contigs_skesa.fasta - DRR090801_contigs_skesa.fasta
DRR090793_contigs_skesa.fasta - DRR090810_contigs_skesa.fasta
Reading reference genome
count contigs: 43
count sites: 4529549
Initializing alignments
count genomes: 0
Reading alignment blocks
Reading SNPs
Writing fasta
path: test_out/temp/mummer4/a_few_asms/msa.fa
Done!
Time (s): 1.15
Running mummer4; done!
Elapsed time: 12.280270099639893
Fetching file-type-specific parser; start
Traceback (most recent call last):
File "/home/pj/.conda/envs/maast/bin/bin/maast.py", line 1371, in <module>
main()
File "/home/pj/.conda/envs/maast/bin/bin/maast.py", line 1366, in main
end2end_main(args)
File "/home/pj/.conda/envs/maast/bin/bin/maast.py", line 1310, in end2end_main
call_snps_main(args)
File "/home/pj/.conda/envs/maast/bin/bin/maast.py", line 1186, in call_snps_main
site_assembly = msa.monolithic_parse(args['msa_path'], args['msa_type'], args['max_samples'])
File "/home/pj/.conda/envs/maast/bin/align_io/msa.py", line 17, in monolithic_parse
return parse_control(msa_path, msa_type, max_sample)
File "/home/pj/.conda/envs/maast/bin/align_io/msa.py", line 14, in parse_control
return parse(msa_path, max_sample)
File "/home/pj/.conda/envs/maast/bin/align_io/xmfa_mummer4_io.py", line 30, in parse
cur_aln.ncols = len(cur_aln.seqs[0].seq)
IndexError: list index out of range
Hi James, thanks for reporting the issue. The cause could be a failure or an unexpected result from one of the individual alignments. Is there a way you could share these input sequences? I would like to start by reproducing the issue on my end.
You bet. See attached for the assemblies and thanks for taking a look at the issue.
a_few_asms.zip
Thanks for providing the sample assemblies. With those files, I have identified and fixed the bug causing this issue. It turns out that a couple of places in the scripts did not handle assembly files with a ".fasta" suffix the way they should have. I've pushed a patch to this repository and uploaded a new conda release. I will tentatively close this issue for now. Please feel free to reopen it if you still experience a similar problem.
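For anyone hitting the same class of bug in their own scripts, here is a minimal sketch of suffix-agnostic genome naming. It is not the actual patch: the function name `genome_id` and the suffix lists are assumptions made for illustration and are not taken from the Maast source. The idea is to derive a genome name by stripping whichever known FASTA suffix is present, so that ".fa", ".fna" and ".fasta" inputs are all treated the same way.

```python
from pathlib import Path

# Hypothetical suffix lists for illustration; not taken from the Maast code.
FASTA_SUFFIXES = (".fasta", ".fna", ".fas", ".fa")
COMPRESSION_SUFFIXES = (".gz", ".bz2")


def genome_id(path):
    """Return the file name with any compression and FASTA suffix removed."""
    name = Path(path).name

    # Strip one optional compression suffix first (e.g. "x.fna.gz" -> "x.fna").
    for ext in COMPRESSION_SUFFIXES:
        if name.lower().endswith(ext):
            name = name[: -len(ext)]
            break

    # Then strip one known FASTA suffix, matched case-insensitively.
    for ext in FASTA_SUFFIXES:
        if name.lower().endswith(ext):
            return name[: -len(ext)]

    # Unknown or missing suffix: leave the name unchanged.
    return name


if __name__ == "__main__":
    print(genome_id("a_few_asms/DRR090793_contigs_skesa.fasta"))
```

Using one helper like this everywhere a genome name is derived from a path keeps the names consistent between the alignment step and the parsing step, whichever suffix the input files carry.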