Error correction with multiple datasets
Dear Tamas,
I am planning a de novo assembly of E. coli using the Canu assembler. While setting up the pipeline, I thought of combining Canu with poreseq to error correct the assembly.
I am taking 4 datasets, combining them into one high-quality 2D dataset, and running the assembly on that. Could you tell me whether I should error correct the assembly with each dataset individually, using the corrected output of one round as the input for the next, for 4 rounds in total? Or is it possible to give the paths to all 4 fast5 folders and do it in one shot?
I hope my question is clear.
Thanks in advance.
Athul
Hi Athul,
You should have no problem generating the assembly using all of the data and then using all 4 datasets to error correct in one pass. If you have problems, however, you may wish to try nanopolish instead - I believe they have done a better job keeping up with the changes to Oxford's format and files than I have.
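If the tool you end up using is easier to point at a single folder, one way to present all 4 datasets in one pass is to symlink the fast5 files into a combined directory. This is only a minimal sketch: the function name `gather_fast5`, the `run<N>_` prefix, and the folder names in the usage note are hypothetical, and it assumes file names are unique within each run folder.

```python
import os
from pathlib import Path


def gather_fast5(dataset_dirs, combined_dir):
    """Symlink every .fast5 file from several run folders into one folder.

    Hypothetical helper for illustration; not part of poreseq or nanopolish.
    Returns the list of link paths in the combined folder.
    """
    combined = Path(combined_dir)
    combined.mkdir(parents=True, exist_ok=True)
    linked = []
    for index, dataset in enumerate(dataset_dirs):
        # rglob also picks up files in nested subfolders of each run.
        for src in sorted(Path(dataset).rglob("*.fast5")):
            # Prefix with the run index so identical file names coming
            # from different runs cannot overwrite each other.
            dest = combined / f"run{index}_{src.name}"
            if not (dest.exists() or dest.is_symlink()):
                os.symlink(src.resolve(), dest)
            linked.append(dest)
    return linked
```

For example, `gather_fast5(["run1", "run2", "run3", "run4"], "all_fast5")` would leave one folder, `all_fast5`, holding links to every read. How that folder is then passed to the error correction step depends on the tool, so check its documentation for the exact input options.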
-Tamas
Hi Tamas,
Thank you for your suggestion. I will try both tools.
Athul