Incremental update of the graph
huguesrichard opened this issue · 8 comments
Hello,
Thank you very much for abPOA, it is a nice tool, installation was fast and easy.
I would like to be able to incrementally update the POA graph. This would make it practical to use the graph as a compressed, aligned representation of the sequences and to add new sequences to it as they accumulate over time.
A typical use case would be to first generate a graph (for instance in GFA format) and then add sequences to it with additional commands, for instance with an --increment option:
abpoa -r 3 seqs.fa > graph.gfa
abpoa --increment newseqs.fa graph.gfa > newgraph.gfa
Best regards,
Hugues
This is theoretically doable; I will give it a try.
I will post the updates here when it's ready.
Yan
Hello again,
As a complement to my previous request: the GFA graphs produced by abPOA are usually quite large, and it was easy to transform one into unitigs using Heng Li's gfatools, e.g.
gfatools asm -u graph.gfa > graph_unitig.gfa
It would be great if graph_unitig.gfa could be provided to abPOA as input. The graph could then be used in practice as a database for short sequences.
Hugues
Thanks for the suggestion!
I will try to add this feature in the next version.
Yan
@huguesrichard Please try out the latest abPOA v1.1.0.
It can now incrementally align sequences to an existing GFA or MSA.
Let me know if this works for you.
Yan
Hello @yangao07,
I tried adding sequences to a GFA produced by abPOA and this worked directly.
That's really a great feature, thank you!
I will try it out a little more over the next few days and let you know if I see anything strange in the resulting MSAs.
I also tried with a gfa simplified to unitigs (using Heng Li's gfatools) but in this case abPOA did not recognise the gfa file.
Also, it would be great to have a few informational messages printed to stderr as abPOA runs. I am running it on a few thousand sequences now and I am never sure how far along it is in processing the files.
Anyway, thanks again for adding the feature.
Also, I could not get access to the release, I guess it was not published yet.
Also, I could not get access to the release, I guess it was not published yet.
I haven't pushed it to the release yet.
I also tried with a gfa simplified to unitigs (using Heng Li's gfatools) but in this case abPOA did not recognise the gfa file.
The unitigs produced by gfatools have no P lines, which are required for incremental graph alignment; that is why they are not supported.
On the other hand, I think it is not hard for abPOA to output GFA with unitigs. I will try to add this feature.
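For anyone who wants to check a GFA file before passing it to abPOA, the P-line requirement mentioned above can be tested with a few lines of Python. This is only a minimal sketch: it assumes tab-separated GFA 1 records whose first field is the record type (H, S, L, P), and the helper names gfa_record_counts and has_path_lines are made up for illustration; they are not part of abPOA or gfatools.

```python
def gfa_record_counts(lines):
    """Count GFA records by their leading record-type field (H, S, L, P, ...)."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        rec_type = line.split("\t", 1)[0]
        counts[rec_type] = counts.get(rec_type, 0) + 1
    return counts


def has_path_lines(lines):
    """Return True if the GFA contains at least one P (path) record."""
    return gfa_record_counts(lines).get("P", 0) > 0


# Tiny hand-written GFA used only for illustration.
example_gfa = [
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tTT",
    "L\t1\t+\t2\t+\t0M",
    "P\tseq1\t1+,2+\t*",
]

with_paths = has_path_lines(example_gfa)          # graph with a P line
without_paths = has_path_lines(example_gfa[:-1])  # same graph, P line dropped
```

To run it on a real file, pass an open file handle, e.g. has_path_lines(open("graph_unitig.gfa")).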
A graph output with unitigs would be really helpful. In my small tests on viral genomes, I got around 50-fold compression when generating the unitig version.
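One way to put a number on this kind of reduction is to compare the number of S (segment) lines before and after unitig compaction. The sketch below assumes that the fold reduction is measured as a ratio of segment counts, which may not be how the figure above was obtained; the function names and the toy graphs are hypothetical.

```python
def count_segments(lines):
    """Count S (segment) records in an iterable of GFA lines."""
    return sum(1 for line in lines if line.startswith("S\t"))


def fold_reduction(full_gfa_lines, unitig_gfa_lines):
    """Ratio of segment counts: full graph over unitig graph."""
    full = count_segments(full_gfa_lines)
    compact = count_segments(unitig_gfa_lines)
    if compact == 0:
        raise ValueError("unitig graph has no S lines")
    return full / compact


# Toy example: a linear chain of four single-base nodes
# collapses into one unitig.
full_graph = [
    "S\t1\tA", "S\t2\tC", "S\t3\tG", "S\t4\tT",
    "L\t1\t+\t2\t+\t0M", "L\t2\t+\t3\t+\t0M", "L\t3\t+\t4\t+\t0M",
]
unitig_graph = ["S\tutg1\tACGT"]

ratio = fold_reduction(full_graph, unitig_graph)
```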