bioinfologics/w2rap-contigger

Read lengths and large K

andreaswallberg opened this issue · 1 comments

Dear developers,

I have ~40x coverage of 2x150 Illumina reads produced using 10x Chromium libraries for an organism with a complex genome (we also have long reads) and would like to try w2rap-contigger. However, I don't really understand how to select a value for the parameter large K=n.

Is this value bounded by the read-length, i.e. should it be below 150 (or 300)? In other words, what are the important factors that govern the value specified for K?

Hi Andreas, sorry about the delay on this, we've not been keeping a close eye on issues, obviously.

Yes, you want large_K to be smaller than your read size. The important factors are: it has to be smaller than the read size, you want smaller values if the graph gets disconnected, and larger values if it is too tightly connected. Both of these depend on genome composition, but the details are a bit complicated to get into here.
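
To make the read-length bound concrete, here is a minimal sketch of the general k-mer arithmetic behind it. It is not taken from w2rap-contigger's code: the function names are hypothetical, the K values are purely illustrative (they are not necessarily values the tool accepts), and the coverage formula is the usual approximation C * (L - K + 1) / L. It uses the 150 bp read length and ~40x coverage from the question.

```python
def kmers_per_read(read_length: int, k: int) -> int:
    """Number of k-mers a single read of this length contributes (0 if k > read length)."""
    return max(0, read_length - k + 1)


def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """Approximate k-mer coverage implied by a given base coverage: C * (L - k + 1) / L."""
    return base_coverage * kmers_per_read(read_length, k) / read_length


if __name__ == "__main__":
    read_length = 150      # from the question: 2x150 Illumina reads
    base_coverage = 40.0   # from the question: ~40x
    # Illustrative K values only; check which values the tool actually supports.
    for k in (60, 100, 140, 150, 160):
        n = kmers_per_read(read_length, k)
        cov = kmer_coverage(base_coverage, read_length, k)
        print(f"K={k}: {n} k-mers per read, ~{cov:.1f}x k-mer coverage")
```

As K approaches the read length, each read contributes fewer k-mers and the effective k-mer coverage drops, which is one way the graph can end up disconnected. Once K exceeds the read length, a single read contributes no k-mers at all.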