medvedevgroup/TwoPaCo

Large k value

Closed this issue · 8 comments

Hello!
I would be very interested in using TwoPaCo with large k-mers.
It works with k = 281 but not with k = 291 on the E. coli reference genome.
./twopaco -f 30 -k 291 ../../../../data/ecoli.fa
This gives a segfault.

Would it be possible for TwoPaCo to work with an arbitrary value of k?

Yes, it is possible. I updated the doc: https://github.com/medvedevgroup/TwoPaCo#k-mer-size

In the next release I will make it a CMake parameter, so that the user can specify the maximum K directly without editing the code.

That would be the perfect solution.

Also, do you think it would be possible for graphdump to have a regular FASTA output option?
It would be really convenient.

Excellent piece of work, by the way!

Thank you. Do you mean a FASTA file with compressed paths? If so, it is not hard to do; I will add it in the next release.

Yes, that would be great. Thank you for your answers.

Added in 0.9.2.

I just tested it:

graphdump -f fasta test.bin -s seq51.fa
PARSE ERROR: Argument: -f (--format)
Value 'fasta' does not meet constraint: seq|group|dot|gfa1|gfa2

That is interesting. Are you sure you pulled the latest revision? What is the output of --help, does fasta appear in the formats list?
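
A minimal sketch of that check, assuming the formats list has the same `a|b|c` shape as the constraint printed in the parse error above. The helper name `has_format` is hypothetical and is not part of TwoPaCo; the list itself would be copied by hand from the error message or from the --help output.

```shell
#!/bin/sh
# has_format LIST NAME
# Exit status 0 if NAME is one of the '|'-separated entries in LIST, 1 otherwise.
# Hypothetical helper for illustration only; not shipped with TwoPaCo.
has_format() {
    # Pad the list with '|' on both sides so only whole entries match.
    case "|$1|" in
        *"|$2|"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Constraint list copied from the error message quoted in this thread.
formats='seq|group|dot|gfa1|gfa2'

if has_format "$formats" fasta; then
    echo "fasta is listed"
else
    echo "fasta is not listed by this build"
fi
```

If the list from the installed binary does not include fasta, one possible explanation is that an older build is still the one on the PATH.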