AntonBankevich/LJA

Child process crashed error

peterdfields opened this issue · 14 comments

Hi,

Running LJA with default parameters, I see the following on stdout:

00:00:00 2Mb  INFO: Hello. You are running La Jolla Assembler (LJA), a tool for genome assembly from PacBio HiFi reads
00:00:00 5Mb  INFO: LJA pipeline started
00:00:00 5Mb  INFO: Performing initial correction with k = 501
00:00:00 0Mb  INFO: Reading reads
00:00:00 0Mb  INFO: Extracting minimizers
00:03:11 6.8Gb  INFO: Finished read processing
00:03:11 6.8Gb  INFO: 11276142 hashs collected. Starting sorting.
00:03:12 7Gb  INFO: Finished sorting. Total distinct minimizers: 1028771
00:03:12 7Gb  INFO: Starting construction of sparse de Bruijn graph
00:03:12 7Gb  INFO: Vertex map constructed.
00:03:12 7Gb  INFO: Filling edge sequences.
00:12:13 12.8Gb  INFO: Finished sparse de Bruijn graph construction.
00:12:13 12.8Gb  INFO:  Collecting tips
00:12:14 13.1Gb  INFO: Added 446527 artificial minimizers from tips.
00:12:14 13.1Gb  INFO: Collected 3542273 old edges.
00:12:15 13.1Gb  INFO: New minimizers added to sparse graph.
00:12:15 13.1Gb  INFO: Refilling graph with old edges.
00:14:08 13.1Gb  INFO: Filling graph with new edges.
00:14:24 13.1Gb  INFO: Finished fixing sparse de Bruijn graph.
00:14:24 13.1Gb  INFO: Finished fixing sparse de Bruijn graph.
00:14:24 13.1Gb  INFO: Statistics for sparse de Bruijn graph:
00:14:30 13.1Gb  INFO: Starting to extract disjointigs.
00:14:39 13.1Gb  INFO: Finished extracting 1652070 disjointigs of total size 1968411175
00:15:18 0Mb  INFO: Loading disjointigs from file "lja_base/k501/disjointigs.fasta"
00:15:52 5.1Gb  INFO: Filling bloom filter with k+1-mers.
00:21:05 5.1Gb  INFO: Filled 4935340205 bits out of 36503171360
00:21:05 5.1Gb  INFO: Finished filling bloom filter. Selecting junctions.
00:24:22 5.5Gb  INFO: Collected 3201119 junctions.
00:24:37 5.5Gb  INFO: Starting DBG construction.
00:24:40 5.5Gb  INFO: Vertices created.
00:25:45 5.5Gb  INFO: Filled dbg edges. Adding hanging vertices
00:25:46 5.5Gb  INFO: Added 77 hanging vertices
00:25:46 5.5Gb  INFO: Merging unbranching paths
00:25:48 5.5Gb  INFO: Ended merging edges. Resulting size 2915824
00:26:10 5.5Gb  INFO: Cleaning edge coverages
00:26:11 5.5Gb  INFO: Collecting alignments of sequences to the graph
00:44:12 28.6Gb  INFO: Alignment collection finished. Total length of alignments is 998912618
00:44:12 28.6Gb  INFO: Correcting dinucleotide errors in reads
06:55:07 28.6Gb  INFO: Applying corrections to reads
07:00:17 28.7Gb  INFO: Applied correction to 143090 reads
07:00:17 28.7Gb  INFO: Corrected 143090 dinucleotide sequences
07:00:17 28.7Gb  INFO: Marking reliable edges
07:00:22 28.7Gb  INFO: Marked 98385 edges in 26500 paths as reliable
07:00:22 28.7Gb  INFO: Correcting low covered regions in reads with K = 800
07:09:51 28.9Gb  INFO: Applying corrections to reads
07:18:03 28.9Gb  INFO: Applied correction to 526763 reads
07:18:03 28.9Gb  INFO: Corrected low covered regions in 526763 reads with K = 800
07:18:03 28.9Gb  INFO: Applying changes to the graph
AGATAGAGACGACGCTCATATATAGCAGTATCAGCATCGTCAGTCATGTCGTCTCGCTGCTGCACTGCACGTACTGCTCGCTCTCATGTCAGTCAGATGTCAGAGACAGTGATGAGTGCACGATAGTGTATCTCATCTGTGTGCAGAGATATCATGACGAGATGACAGATCTGTCTGCTCTCGCTCTGTGCTGACATCGCTCAG
AGTACGACACAGACACTCTCGTATACGAGAGTGTCTGTGTCGTACTCTGAGCGATGTCAGCACAGAGCGAGAGCAGACAGATCTGTCATCTCGTCATGATATCTCTGCACACAGATGAGATACACTATCGTGCACTCATCACTGTCTCTGACATCTGACTGACATGAGAGCGAGCAGTACGTGCAGTGCAGCAGCGAGACGACA
TGACTGACGATGCTGATACTGCTATATATGAGCGTCGTCTCTATCTGTGACGCAGATGCGTGATCGCACGCGCAGTCTCGCTAGACGTATCTA
2
=== Stack Trace ===
/home/peter/bioinformatics/LJA/bin/lja(_Z16print_stacktracev+0x58) [0x55e5b63b3ef8]
/home/peter/bioinformatics/LJA/bin/lja(_ZNK3dbg6Vertex11getOutgoingEh+0xf0) [0x55e5b6445710]
/home/peter/bioinformatics/LJA/bin/lja(+0x5889c) [0x55e5b63be89c]
/usr/lib/x86_64-linux-gnu/libgomp.so.1(+0x1696e) [0x7f9a0cf2d96e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f9a0cae76db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f9a0c81071f]
lja: /home/peter/bioinformatics/LJA/src/projects/dbg/sparse_dbg.cpp:301: dbg::Edge& dbg::Vertex::getOutgoing(unsigned char) const: Assertion `false' failed.

The machine is running Ubuntu 18.04 with gcc v.7.5.0, cmake 3.21.3, python v.3.9.7, networkx v.2.6.3, llist v.0.7.1, edlib v.1.2.3, joblib v.1.1.0, biopython v.1.79, and pydot v.1.4.2. The genome size of the focal dataset is ~200 Mbp, and we have ~100X coverage of HiFi reads. Please let me know if additional information would be helpful to troubleshoot the above error.
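
For anyone triaging a similar report, the timestamps and memory figures in a log like the one above can be summarised mechanically. The following is a minimal sketch, assuming the `HH:MM:SS <memory> INFO: <message>` layout seen in this report; the function names are hypothetical and not part of LJA, and the sample lines are copied from the log above.

```python
import re
from datetime import timedelta

# Matches lines such as
# "00:44:12 28.6Gb  INFO: Correcting dinucleotide errors in reads"
LOG_LINE = re.compile(r"^(\d+):(\d{2}):(\d{2})\s+([\d.]+)(Mb|Gb)\s+INFO:\s*(.*)$")


def parse_log(lines):
    """Return (elapsed, memory_gb, message) for every line that matches the layout."""
    records = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # skip sequences, stack traces and blank lines
        h, mi, s, mem, unit, msg = m.groups()
        elapsed = timedelta(hours=int(h), minutes=int(mi), seconds=int(s))
        mem_gb = float(mem) / 1024 if unit == "Mb" else float(mem)
        records.append((elapsed, mem_gb, msg))
    return records


def slowest_steps(records, top=3):
    """Attribute to each message the time that passes until the next message."""
    gaps = [(t1 - t0, msg) for (t0, _, msg), (t1, _, _) in zip(records, records[1:])]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]


sample = """\
00:26:11 5.5Gb  INFO: Collecting alignments of sequences to the graph
00:44:12 28.6Gb  INFO: Alignment collection finished. Total length of alignments is 998912618
00:44:12 28.6Gb  INFO: Correcting dinucleotide errors in reads
06:55:07 28.6Gb  INFO: Applying corrections to reads
07:00:17 28.7Gb  INFO: Applied correction to 143090 reads
""".splitlines()

records = parse_log(sample)
slowest = slowest_steps(records, top=1)
peak_gb = max(mem for _, mem, _ in records)
print(slowest[0], peak_gb)
```

On these sample lines the largest gap follows "Correcting dinucleotide errors in reads", which matches what the timestamps show by eye. A long gap only says where wall-clock time went, not why.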

Hi! Thank you for the bug report. Could you send me the dbg.log file from your lja_base directory? Also, would it be possible for you to share this dataset with me?

Hi @AntonBankevich

You can see the contents of the dbg.log file here

I can share the reads with you. Do you have an email address where I can send you a download link?

Please send the link to this address: anton.bankevich@gmail.com

Thank you for the data. I have confirmed the problem and am working on fixing it. I will let you know when it is done.

Hi @AntonBankevich

Thank you for letting me know. I'll look forward to working with LJA more once the bug is fixed!

tcb72 commented

Just FYI, I am getting the same error (I think) on two separate genomes (one haploid, one diploid). Looking forward to the fixed version :) Here is the output:

00:28:19 2.1Gb  INFO: Applying changes to the graph
CGACTACACATCTCACGACACGAGCTGACGACAGCATGCACACTGTAGAGCGATAGATCTCAGACACACTCTAGCACTGTAGTCTCGCGTGCATCGATACACATCTCACGCTGTGCGCGTCATCTGAGTCACTCTGCGAGCGTACTCAGCGATACTACGCGTAGCTAGATACTGACGTGATACGTCACATCGAGTATCATCGTACGCTAGACTACTGTATCTATCATGCTCTAGCTCGTCTCTGAGTGTCAGTATAGCAGTAGAGTGCTCGCATCGTGTCTCATATCTACGCATCACGCTCACTGATCTCTACTACTATACTCAGTCTCTAGTCTACTGCAGATGAGTGAGCTCAGATACAGTAGACTAGACACTGCAGACGCTACGCAGTATCGATACACTGCATCGTCTACGCGCTGCTGCACGAGTAGCGATGCTCACTAGTACGTCAGATCTCTCTAGTACGAGTACACTCAGTAGTCTCATCTCACGCGTATGCTCTGTCAGCTCGCATGCAGATCTCACTGCTGCTCGTAGAGTCTGACGTGTCTCAGTCAGTGTGCTGATCGTCTCTCAGACAGCTACTGATCGTCGCTGTAGCTACTACACTAGCTATCAGCGCGAGCTCTCTCAGCAGAGTAGAGTCAGTATACTATCTCTCTGTCACAGCATATGTAGCGATGTCATCGTATCTCTAGCGATCTCACGTGTACTCACGTCGCACTATACGAGTATCGTCGACTGCATGTGTAGCATACGCAGCGTCATCTGAGCAGATCACTCTCGTAGTATCTAGTCTCTATCACTAGTAGACTATAGTACTAGCTGTCTGTAGTATGAGACTCACAGTCTACATACTGTGAGTATGTGTCAGTGTATATCTATGTCATCTACGTGACACTAGTATCTGATATATCACAGTAGATCTATATGTCACTACTGACATATATACACTATGCATATATCATCGACGACTCTCTAGTGTCTAGTAGCTCTATGTATAGTGTACTATCAGATGTACATACACATCTATATCACTATACTACTGATATGCTGCTGATGAGTGTATAGTAGACTGATCTATATCAGATCAGAGATACGATCTACTATACAGTGCTGTATAGTAGTGACGCTCGATGTGCTGAGTGTCGCATGTGACAGTCTACTGCACTACTACACGTGATCAGCTGACGTCTCATAGACAGATAGTATCATATATGTCAGACGTGCTAGCTGTGTAGCATCTACGACGTATGCTGCGCTACGATCAGACTATAGAGCTGCTGTGCACAGTGTAGACAGAGTGACAGCATAGCTGAGATATGTCTCAGCGTATCGTAGACATGTCTGAGTATATGACACACTGTGCGCGCTATAGATCTCTCAGTATGTGATCGAGAGCATACTGACGCTGTCGAGTCGACTACTGTATGTACTGATGATGAGCTCATCAGACGTATATATACGTCTGATGAGCTCATCATCAGTACATACAGTAGTCGACTCGACAGCGTCAGTATGCTCTCGATCACATACTGAGAGATCTATAGCGCGCACAGTGTGTCATATACTCAGACATGTCTACGATACGCTGAGACATATCTCAGCTATGCTGTCACTCTGTCTACACTGTGCACAGCAGCTCTATAGTCTGATCGTAGCGCAGCATACGTCGTAGATGCTACACAGCTAGCACGTCTGACATATATGATACTATCTGTCTATGAGACGTCAGCTGATCACGTGTAGTAGTGCAGTAGACTGTCACATGCGACACTCAGCACATCGAGCGTCACTACTATACAGCACTGTATAGTAGATCGTATCTCTGATCTGATATAGATCAGTCTACTATACACTCATCAGCAGCATATCAGTAGTATAGTGATATAGATGTGTATGTACATCTGATAGTACACTATACATAGAGCTACTAGACACTAGAGAGTCGTCGATGATATATGCATAGTGTATATATGTCAGTAGTGACATATAGATCT
ACTGTGATATATCAGATACTAGTGTCACGTAGATGACATAGATATACACTGACACATACTCACAGTATGTAGACTGTGAGTCTCATACTACAGACAGCTAGTACTATAGTCTACTAGTGATAGAGACTAGATACTACGAGAGTGATCTGCTCAGATGACGCTGCGTATGCTACACATGCAGTCGACGATACTCGTATAGTGCGACGTGAGTACACGTGAGATCGCTAGAGATACGATGACATCGCTACATATGCTGTGACAGAGAGATAGTATACTGACTCTACTCTGCTGAGAGAGCTCGCGCTGATAGCTAGTGTAGTAGCTACAGCGACGATCAGTAGCTGTCTGAGAGACGATCAGCACACTGACTGAGACACGTCAGACTCTACGAGCAGCAGTGAGATCTGCATGCGAGCTGACAGAGCATACGCGTGAGATGAGACTACTGAGTGTACTCGTACTAGAGAGATCTGACGTACTAGTGAGCATCGCTACTCGTGCAGCAGCGCGTAGACGATGCAGTGTATCGATACTGCGTAGCGTCTGCAGTGTCTAGTCTACTGTATCTGAGCTCACTCATCTGCAGTAGACTAGAGACTGAGTATAGTAGTAGAGATCAGTGAGCGTGATGCGTAGATATGAGACACGATGCGAGCACTCTACTGCTATACTGACACTCAGAGACGAGCTAGAGCATGATAGATACAGTAGTCTAGCGTACGATGATACTCGATGTGACGTATCACGTCAGTATCTAGCTACGCGTAGTATCGCTGAGTACGCTCGCAGAGTGACTCAGATGACGCGCACAGCGTGAGATGTGTATCGATGCACGCGAGACTACAGTGCTAGAGTGTGTCTGAGATCTATCGCTCTACAGTGTGCATGCTGTCGTCAGCTCGTGTCGTGAGATGTGTAGTCGCACGAGCGCACTATAGTCTATGTCTAGAGACTGCGTGACACGAGAGTGAGACGACGTCAGTCATCATGCTACACTGCTACACACGTCTACATGTAGACATAGTGCATCGCGAGATAGCTATCTGACTATCAGTACAGATGCAGCTGCACTCGCTGCATGAGTGATCGCTAGTATCGCTGTCAGCTACACAGCGTGATCGTCGCTGTACACACGCGTCACACATGAGCTGTGTACGAGTCGTATCTACGTAGAGAGATGCGAGTATAGTGACGAGTGAGTCGTACAGTAGCGTACGAGTGCGCTGATGACTCTACAGATAGTACTGTAGTGATGACTATCGCTAGCGATATATACAGAGCTATAGCTCAGTGTAGAGCACACTGATAGTGAGTCTCTGTCAGTCAGATAGCTGCTGTATAGCTCAGCTGTAGAGCGCTGCTGCACGCAGATGTCAGCGTCGAGTCGCTACTCACACTATATCTGCTACATATGTCTCGATCAGAGAGCTGATACTGCATCAGAGCGATGAGACGCGATGACGCGATACGCTCGAGCGCAGCTGATCGAGTCGATGAGCACTCATAGACTCTCTGATACATAGAGAGAGCACTGACTGACATCTAGTACAGAGAGTACGATCTAGTAGCGCGAGCGACTGACTGCTACTACTCTGTAGTGTGACGATAGTAGCTATATAGAGATACGTGACGATAGTGATCTATACTAGAGTGATAGTCTGTAGTCGAGTACTCTATATAGTGCTAGCTACGTAGCACTCTCGCTCGAGTAGCATGAGCACGTGATCGTGTGATCGCAGACACTGCAGCTATATCTGATGACGATAGTGACAGTACGTGAGAGTGAGACGAGAGTGAGACATGACAGCTACAGCAGCAGAGCTACGAGTGTCAGCTACTGCGTGCTGTGAGATGAGCGCGACTGTATCTATGCTGTAGATACGAGCATAGTGAGCGAGCTGATACGAGCGTATGTCATAGTACAGACGACGATGATCTACATGCAGATGAGCTAGTACTAGTGAGTCGACTGACTGATGTGATCAGCGATGAGTGTGTAG
TGATGCATCGATCGAGCTAGCTGTCTCGATGTGTAGCGCAGCGTGACAGACTAGTAGCACTGTCGTGCGCTGTATCGTACATCGAGCACTCTGATACTAGTGCTCGTGAGAGCGTCACTAGTAGACTCTGATAGCTCAGTGTCGAGAGACAGCAGACACGCTAGCATGATGCTAGTGACAGAGTGAGCTGACACAGAGTGCTAGAGCAGCATCTGAGAGTGCGTATAGCTCACTGTCGAGCATCTGCGCGATAGCGACTAGTCATCTGCGAGCGTGATAGCTAGAGTATCGTAGAGCGTCTGCAGTAGAGTGAGATAGCGAGAGCGAGTGACGACAGAGTGAGATGTCGCTGAGTAGCGACATGTGAGATCATGCGACTAGATCTCGAGCTCGTCGCGAGTAGTCAGACTAGCGAGCGAGAGCGTAGTCGATGACACAGTATATCTGTACAGCTATGTGATATCAGACGAGAGCTAGACTAGCAGAGTGTACTGTCATGTCGAGATGTGAGAGTAGAGATACTGAGTAGCATGAGTACAGATCGATCAGATCAGTAGTGATGTCACGCTCAGAGCTGCACTGTCATACATAGATGCTGTACAGACGACACAGTAGTGTAGAGTATACTAGCGCGAGAGACTCTCTCTAGACTCGCATAGCGTACTCGAGAGTGCTAGAGTGCATACAGTCAGCGACTGTACACACAGTCTCGCTAGTCGTAGACGATGTATGCTGACGCTGCAGTGCGAGTAGACGTGTCGCGTAGCAGCTATACTGAGCGTGACGCGCGTACTATACGTCTAGTAGCGATCTGTCGTAGTCGACGCACGAGCGTACGATGACACTGTCTCGAGAGAGCTCGTGATAGACTGTCTGTGAGATGCGACTCTGCACTGACAGAGACTATGAGCTCACTGCATCTGATGATCTGTCTACTGCGCAGATAGTGAGCATGAGATCTGTGATGAGCATCGTGAGATACACTCTAGCTCGATCTCACT
3
=== Stack Trace ===
LJA/bin/lja(_Z16print_stacktracev+0x38) [0x45b018]
LJA/bin/lja(_ZNK3dbg6Vertex11getOutgoingEh+0xc0) [0x4de1d0]
LJA/bin/lja(_Z11realignReadRKN3dbg14GraphAlignmentERKSt13unordered_mapIPNS_4EdgeESt6vectorINS_16PerfectAlignmentIS4_S4_EESaIS8_EESt4hashIS5_ESt8equal_toIS5_ESaISt4pairIKS5_SA_EEE+0x3ea) [0x46144a]
LJA/bin/lja() [0x46424f]
/cm/local/apps/gcc/8.2.0/lib64/libgomp.so.1(+0x162de) [0x2aaaab5812de]
/lib64/libpthread.so.0(+0x7e25) [0x2aaaab9b8e25]
/lib64/libc.so.6(clone+0x6d) [0x2aaaabccbbad]
lja: /[redacted]/LJA/src/projects/dbg/sparse_dbg.cpp:301: dbg::Edge& dbg::Vertex::getOutgoing(unsigned char) const: Assertion `false' failed.
Child process crashed

Not to sound like a broken record, but I am also chiming in here with the same issue on a 20X human genome dataset. Looking forward to a resolution :)

Hi @AntonBankevich

I gave the new release a go on the same dataset that produced the error which initiated this issue. With the new version it does seem that the program progresses further, but the new error I'm seeing looks like the following:

61:26:01 2Gb  INFO: Export to GFA and compressed contigs
terminate called after throwing an instance of 'std::out_of_range'
  what():  map::at
Child process crashed

Please let me know if any additional information would be helpful.

Hi, this is very unfortunate. I will get back to you about it in several days.

Hi!
I could not reproduce the problem you described. The assembly finished successfully for me. I need to know several things to move forward on this issue.

  1. Is the dataset you are using the same one you sent me? The dataset I have contains 1965419 reads and the file name contains BAK8A_OA.
  2. Please send me the dbg.log file from the output directory.
  3. Please send me your cmake and compiler versions.
  4. Could you tell me what kind of data this is? I did not look carefully, but it looks like a diploid dataset, in which case the --diploid parameter should be used.
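
To make the `--diploid` suggestion concrete, here is a minimal sketch that only builds the command line and does not run it. Only `--diploid` is confirmed by this thread. The `-o`, `--reads` and `-t` flags are written from memory of the LJA README and should be checked against `lja --help`. The helper name and all file and directory names are hypothetical.

```python
import shlex


def build_lja_command(reads, output_dir, diploid=False, threads=None, binary="lja"):
    """Assemble an argv list for an LJA run without executing anything.

    Assumption: -o, --reads and -t are the flags of the installed lja binary.
    """
    cmd = [binary, "-o", output_dir]
    for path in reads:
        cmd += ["--reads", path]  # one --reads per input file
    if threads is not None:
        cmd += ["-t", str(threads)]
    if diploid:
        cmd.append("--diploid")  # the parameter named in this thread
    return cmd


# Hypothetical file and directory names, for illustration only.
cmd = build_lja_command(["reads.fastq.gz"], "lja_diploid_out", diploid=True, threads=28)
print(shlex.join(cmd))
```

Building the list first and printing it with `shlex.join` makes it easy to paste the exact command into a bug report alongside dbg.log.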

Hi @AntonBankevich

I'm glad to hear at least things worked on your side. This certainly leads me to think the problem is somewhere on my setup's side. To answer your specific questions:

  1. This is the same dataset.
  2. I have emailed you a link to access the log file.
  3. cmake version = 3.21.3 and GCC version = 11.2.0 (let me know if other compiler information is useful)
  4. I also included a description of the genotype that was sequenced in the email I sent you.

Do you by chance have a static binary of the LJA pipeline that I could try?

Hi @AntonBankevich

Running LJA built from the source release code and in diploid mode allowed the assembly to complete successfully! It is not quite as contiguous as hifiasm, but I'm sure I need to try additional parameter sweeps to see if I can improve things. Thank you again for your assistance.

I am glad the crash problem is resolved. As for the results, I mentioned in the readme and changelog that diploid assembly is still in an experimental state. For example, our final output is currently a phased assembly (where haplomes are in separate contigs) rather than a consensus assembly as in hifiasm (where haplomes are glued together). So it is more appropriate to compare our results with their p_utg.gfa file (where they keep haplotypes separate) rather than their p_ctg.gfa file. Anyway, reporting all appropriate contigs for diploid genomes, including a consensus assembly, and better assembly in general are the main focus of our next release.
Also, I wanted to ask whether your latest run also took 60 hours. For me, the assembly of this dataset took only about 14 hours.
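
A like-for-like contiguity comparison between two GFA files (for instance an LJA graph and the p_utg.gfa file mentioned above) can be done by computing N50 over segment lengths. This is a minimal sketch, assuming standard GFA 1 `S` records with an optional `LN:i:` tag when the sequence is given as `*`. The function names are hypothetical and the example records are made up.

```python
def segment_lengths(gfa_lines):
    """Yield the length of every segment (S record) in GFA 1 text."""
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "S":
            continue  # headers, links, paths, malformed lines
        seq = fields[2]
        if seq != "*":
            yield len(seq)
            continue
        # Sequence omitted: fall back to the optional LN:i:<length> tag.
        for tag in fields[3:]:
            if tag.startswith("LN:i:"):
                yield int(tag[5:])
                break


def n50(lengths):
    """Largest L such that segments of length >= L cover at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Made-up records, for illustration only.
example = [
    "H\tVN:Z:1.0",
    "S\tutg1\tACGTACGT",
    "S\tutg2\t*\tLN:i:6",
    "S\tutg3\tACGT",
    "S\tutg4\tAC",
    "L\tutg1\t+\tutg2\t+\t0M",
]
lengths = list(segment_lengths(example))
print(lengths, n50(lengths))
```

For real files, pass an open file handle instead of the list, for example `n50(segment_lengths(open(path)))`. N50 alone does not say whether haplotypes were kept separate or merged, which is the reason for comparing against p_utg.gfa rather than p_ctg.gfa.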

@AntonBankevich Thank you for the explanation of the results. I'll definitely keep an eye on the repo and retry the assembly as updates come along. As for run time, the assembly completed in just over 27 hours using 28 vCPUs (Xeon 5120).