ekg/seqwish

(core dumped) seqwish

HaploKit opened this issue · 19 comments

Hi, I am trying to use seqwish to construct a variation graph from raw long reads. It ran successfully with 10,000 reads but failed with 100,000 reads taken from the same FASTQ file. Here are the commands and the error output:
prefix=test; minimap2 -x ava-ont -t 48 -c -X $prefix.fq $prefix.fq >$prefix.paf; seqwish -t 32 -s $prefix.fq -p $prefix.paf -g $prefix.gfa
length for 9b3413df-02a4-423e-8bd0-bb1e69cf7a93, expected 450 but got 0
seqwish: /home/software/seqwish-0.1/src/gfa.cpp:133: void seqwish::emit_gfa(std::ostream&, size_t, const string&, mmmulti::iitree<long unsigned int, long unsigned int>&, mmmulti::iitree<long unsigned int, long unsigned int>&, const sdsl::sd_vector<>&, const rank_1_type&, const select_1_type&, seqwish::seqindex_t&, mmmulti::set<std::pair<long unsigned int, long unsigned int> >&): Assertion `false' failed.
work.sh: line 4: 7666 Aborted (core dumped) seqwish -t 32 -s $prefix.fq -p $prefix.paf -g $prefix.gfa
Command exited with non-zero status 134
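For readers decoding the last line: shells report a process killed by signal N as exit status 128 + N, so 134 corresponds to signal 6 (SIGABRT), which is what a failed `assert` raises. A minimal sketch of that arithmetic follows; `describe_status` is a hypothetical helper name, not part of seqwish.

```shell
# Hypothetical helper: turn a shell exit status into a readable description.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # Statuses above 128 mean "terminated by signal (status - 128)".
        echo "killed by SIG$(kill -l $((status - 128)))"
    else
        echo "exited with code $status"
    fi
}

describe_status 134
```

This only confirms that the process aborted on its own assertion rather than being killed externally (for example by an out-of-memory handler).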

Maybe there is something wrong with the read '9b3413df-02a4-423e-8bd0-bb1e69cf7a93' in test.fq. Here are the sequence and quality scores for this read:
@9b3413df-02a4-423e-8bd0-bb1e69cf7a93 CGTACTTCGTTCAGTTACGTATTGCTAAGGGTTAAAATCAAGACTCGCTGTGCCTAGTTCAGCACCTGTTTCCCACTGGAGGATAATGGGACGCCAGTTTCGAAGGAAACGTTGTTGGGATACTACCCCCACTTGTGTTATAGCCTCTAACCCGGGTAGGTGATCCCTATCGGAGACAGTGTCTGACAGGCAGTTTGACTGGGGCAGTCGCCTCTAAAAGGTAACAGGAGAGCCCAAAGGTTCCCTCAGAATGGTTGGAAGTAATTGCAGTGTAAAGATGTCAGGGCTTGACTGAGCTACAACTCGAGCAGGGACGAAAAGAGATCGAACAGTGATCCGGTGGTTCCGTATGGAAGACCGTACTCAGCAGGATCAAATGCCGTGTCTCAGTTGGAAGCAGGTGCTGAACTGAAGCTGGCTTGAGTCTTGGTTTAACATGGCAATACGTAA + &%,$(<9?BI7E1?02.))-++.,784*'543/35053?>0=9EB>:79.(.9.6,0-;8.8=>?BBB1E?<><1;<@:2:;324477?C?;=AK5D?H<6&%'&1#%&),0:=;809>?96))-12,&$##&&2/':&,((&##$%%)&()$+('&))'55-338())%/)022744<@96;?;($$/4114CAD>=ACA=5.((8>>?>;)?1C?;)(62B-''3($*-*)7%18*/?KAA>A,C:34CGI>G?5&)*)**%%(###%.$+3-%'$##%$&&(())*)%$##&)//-9<7?6?820++@A:B>?CA0%%$'-2;(%'$%)0??565:;D>;H;<;0FGIH=64/%43,,)%(0&&)*+/($&'*)##%'%%$&/'0$&#%++*.$$$(&92')8>@=;/-*)%'%2013<=@<>@>),7=6+'&$$#./5+)):977%
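A quick way to rule out a malformed record is to compare the sequence and quality lengths for every read. The sketch below assumes plain four-line FASTQ records (no wrapped sequence lines); `fastq_lengths` is a hypothetical helper name.

```shell
# Hypothetical helper: print read id, sequence length, quality length and a
# status flag for each record. Assumes unwrapped four-line FASTQ records.
fastq_lengths() {
    awk '
        NR % 4 == 1 { id = substr($1, 2) }      # header line, leading @ stripped
        NR % 4 == 2 { seqlen = length($0) }     # sequence line
        NR % 4 == 0 {                           # quality line
            status = (seqlen == length($0)) ? "ok" : "MISMATCH"
            print id "\t" seqlen "\t" length($0) "\t" status
        }
    ' "$1"
}
```

Usage would look like `fastq_lengths test.fq | grep 9b3413df`. If the read reports matching lengths, the input file is probably fine and the problem lies downstream.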

Any help would be appreciated.

ekg commented

You've found an error that I'm aware of. Sorry for this. You can downgrade and get a much lower-performance version of seqwish that should induce the same graph.

I have a test case and I need to use it to fix this. If you can share yours in some way then I'd be very appreciative as well.

Thank you for your quick response. But I am using the v0.1 release of seqwish from here:
https://github.com/ekg/seqwish/releases/tag/v0.1
Should I try v0.2?

ekg commented

Try v0.2. If not that, then try the current master.

ekg commented

And please let me know which ones work and which don't.

Hi, I tried both v0.2 and the current master, but both failed to build and I could not figure out why. Could you please help me? Here is the error output:

cmake -H. -Bbuild && cmake --build build -- -j3
-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /export/scratch1/home/vincent/software/miniconda3/bin/x86_64-conda_cos6-linux-gnu-cc
-- Check for working C compiler: /export/scratch1/home/vincent/software/miniconda3/bin/x86_64-conda_cos6-linux-gnu-cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /export/scratch1/home/vincent/software/miniconda3/bin/x86_64-conda_cos6-linux-gnu-c++
-- Check for working CXX compiler: /export/scratch1/home/vincent/software/miniconda3/bin/x86_64-conda_cos6-linux-gnu-c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'sdsl-lite' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/sdsl-lite

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:53 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'tayweeargs' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/args

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:65 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'gzipreader' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/gzip_reader

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:73 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'mmmultimap' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/mmmultimap

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:81 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'iitii' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/iitii

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:90 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'mmap_allocator' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/mmap_allocator

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:97 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'ips4o' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/ips4o

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:106 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'bbhash' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/BBHash

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:115 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'atomicbitvector' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/atomicbitvector

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:125 (ExternalProject_Add)

CMake Error at /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:2525 (message):
No download info given for 'atomicqueue' and its source directory:

/export/scratch3/vincent/software/seqwish-0.2/deps/atomic_queue

is not an existing non-empty directory. Please specify one of:

  • SOURCE_DIR with an existing non-empty directory
  • DOWNLOAD_COMMAND
  • URL
  • GIT_REPOSITORY
  • SVN_REPOSITORY
  • HG_REPOSITORY
  • CVS_REPOSITORY and CVS_MODULE
    Call Stack (most recent call first):
    /export/scratch3/vincent/software/miniconda3/share/cmake-3.12/Modules/ExternalProject.cmake:3100 (_ep_add_download_command)
    CMakeLists.txt:135 (ExternalProject_Add)

-- Configuring incomplete, errors occurred!
See also "/export/scratch3/vincent/software/seqwish-0.2/build/CMakeFiles/CMakeOutput.log".
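Every error above points at a missing or empty directory under `deps/`. That is the pattern you would expect if those directories are git submodules that were never fetched. This is an assumption based on the log, though GitHub's auto-generated release archives do not include submodule contents. A minimal sketch for confirming which dependencies are missing; `missing_deps` is a hypothetical helper name.

```shell
# Hypothetical helper: list dependencies whose directory under <src>/deps
# is missing or empty. Usage: missing_deps <src_dir> <dep> [<dep> ...]
missing_deps() {
    src_dir=$1
    shift
    for dep in "$@"; do
        d="$src_dir/deps/$dep"
        # Report the dependency if the directory is absent or has no entries.
        if [ ! -d "$d" ] || [ -z "$(ls -A "$d" 2>/dev/null)" ]; then
            echo "$dep"
        fi
    done
}
```

For example, `missing_deps /path/to/seqwish sdsl-lite args mmmultimap`. If the list is non-empty and the assumption about submodules holds, a fresh `git clone --recursive https://github.com/ekg/seqwish.git`, or `git submodule update --init --recursive` inside an existing clone, should populate the directories before rerunning cmake.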

ekg commented

Thank you very much!
It works now. But there is still a 'core dumped' error when running this test data: https://drive.google.com/file/d/1yvTEAYTZJnCabr3J9nGCaV6uizJcCaj7/view?usp=sharing
It would be great if you could download it and test.

The commands I used:
prefix=test
minimap2 -x ava-ont -t 48 -c -X $prefix.fq $prefix.fq >$prefix.paf
seqwish -t 32 -s $prefix.fq -p $prefix.paf -g $prefix.gfa
Here is the error output:

length for 9b3413df-02a4-423e-8bd0-bb1e69cf7a93, expected 450 but got 451
seqwish: /export/scratch3/vincent/software/seqwish/src/gfa.cpp:137: void seqwish::emit_gfa(std::ostream&, size_t, const string&, mmmulti::iitree<long unsigned int, long unsigned int>&, mmmulti::iitree<long unsigned int, long unsigned int>&, const sdsl::sd_vector<>&, const rank_1_type&, const select_1_type&, seqwish::seqindex_t&, mmmulti::set<std::pair<long unsigned int, long unsigned int> >&): Assertion `false' failed.
work.sh: line 4: 14107 Aborted (core dumped) seqwish -t 32 -s $prefix.fq -p $prefix.paf -g $prefix.gfa
Command exited with non-zero status 134

ekg commented

Yeah, that's the bug. Sorry about this; it will take me a few days to get to it.


OK, thanks.

Hi, Erik, is there any update on this issue? Thank you.

ekg commented

Please let me know if this resolves the problem you were running into!

I had a range extension check that was not implemented correctly for both strands. Somehow, this was only a problem in certain kinds of graphs, typically those with extremely complex tangled regions.
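Judging by the assertion message ("expected 450 but got 451"), the failing check compares the length of each path in the graph with the length of its input sequence. One way to check a fixed build on your own data is to recompute path lengths from the output GFA. This is a sketch only: it assumes tab-delimited GFA1 with the sequence in column 3 of `S` lines and comma-separated oriented steps in column 3 of `P` lines, and `gfa_path_lengths` is a hypothetical helper name.

```shell
# Hypothetical helper: print "<path name> TAB <total length>" for each P line.
gfa_path_lengths() {
    awk -F'\t' '
        $1 == "S" { seglen[$2] = length($3) }            # segment id -> length
        $1 == "P" { paths[++n] = $2; steps[n] = $3 }     # defer until all S lines are read
        END {
            for (i = 1; i <= n; i++) {
                total = 0
                m = split(steps[i], s, ",")
                for (j = 1; j <= m; j++) {
                    # Drop the trailing orientation character (+ or -).
                    id = substr(s[j], 1, length(s[j]) - 1)
                    total += seglen[id]
                }
                print paths[i] "\t" total
            }
        }
    ' "$1"
}
```

Comparing this output against the read lengths in the FASTQ should show no differences when the graph is consistent.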

ekg commented

Also, it might make sense to use the -X option when mapping.

I got this to make what seems like a reasonable graph:

minimap2 -cx asm20 -t 48 -X test.fastq test.fastq >test.paf

ekg commented

@vincentluo91 I tried this:

minimap2 -cx asm20 -t 48 -X test.fastq test.fastq >test.paf
seqwish -t 48 -s test.fastq -p test.paf -g test.gfa
odgi build -g test.gfa -o - -p | odgi sort -p bSn -A -i - -t 48 -o test.odgi
odgi viz -i test.odgi -o test.odgi.png -x 4000 -y 400 -P 1

[image: odgi viz rendering of test.odgi (test.odgi.png)]

This shows that a bit of the graph is covered by a lot of the reads, but there are a ton of "tips" (part to the right, mostly empty) where the read ends aren't aligning to each other.

Not sure if this matches what you'd expect, but I thought I'd share the process I would use initially.

In odgi or another related tool, I plan to work out assembly graph steps like tip pruning and bubble popping. They both amount to a kind of read correction. I'm thinking about the best way to implement these. My current thought is to use MEMs found in the GBWT to structure the error correction.
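Until such tooling exists, the "tips" described above can be roughly located from the GFA itself by listing segments that have no link on at least one side. This is a sketch under stated assumptions, not how odgi does it: it assumes tab-delimited GFA1 `S` and `L` lines, treats any dangling segment end as a tip candidate, and `gfa_dead_ends` is a hypothetical helper name.

```shell
# Hypothetical helper: print ids of segments with at least one unlinked side.
gfa_dead_ends() {
    awk -F'\t' '
        $1 == "S" { order[++n] = $2 }
        $1 == "L" {
            # Leaving a forward segment uses its end; leaving a reversed one uses its start.
            if ($3 == "+") end_used[$2] = 1; else start_used[$2] = 1
            # Entering a forward segment uses its start; entering a reversed one uses its end.
            if ($5 == "+") start_used[$4] = 1; else end_used[$4] = 1
        }
        END {
            for (i = 1; i <= n; i++) {
                id = order[i]
                if (!(id in start_used) || !(id in end_used)) print id
            }
        }
    ' "$1"
}
```

Running `gfa_dead_ends test.gfa | wc -l` gives a crude count to compare before and after any pruning step. Note that the first and last segment of a linear graph are legitimately dead ends too.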
