mummer4/mummer

What is meant by the number of "sampled suffix positions"?

krinsman opened this issue · 0 comments

Referring specifically to the -k option of the mummer command.

I searched on Google, as well as in the complete Mummer documentation at
https://github.com/mummer4/mummer/blob/master/docs/maxmat3src.pdf

The only place I could find the term "sampled suffix position" (or even "sampled position" or "suffix position") used was in the help output for the -k option.

<< "-k sampled suffix positions (one by default)" << '\n'

Apparently the -k option is valid only with -maxmatch, but it is not clear why:

if(K != 1 && type != MEM) { std::cerr << "-k option valid only for -maxmatch" << std::endl; exit(1); }

The -k option also apparently has something to do with "sparseness", but again it is not clear why:

<< "these include: sparseness (-k), suffix links (-suflink), child array (-child) and kmer table size (-kmer)." << std::endl;

It is also not clear why the -threads option is valid only when k > 1. This in turn makes it even harder to understand why there are both a -threads and a -qthreads option.

I would have simply ignored the option, except that both examples given in the help output use a non-default value for -k:

<< "./mummer -maxmatch -l 20 -b -n -k 3 -threads 3 ref.fa query.fa" << '\n'

<< "./mummer -maxmatch -l 20 -b -n -k 3 -qthreads 3 ref.fa query.fa" << '\n'

The most basic question, which the documentation does not answer for me, is:

  • does a higher value of k lead to increased computational expense but better/more accurate results?
  • or does a higher value of k lead to decreased computational expense but worse/less accurate results?

The fact that multi-threading is only an option when k > 1 suggests the former. At the same time, since it apparently has something to do with "sparseness", it also seems plausible that k > 1 would lead to worse output.
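To make the "sparseness" reading concrete, here is a minimal sketch of what "sampled suffix positions" could mean, assuming -k controls a sparse suffix array that indexes only every k-th suffix of the reference. That interpretation is an assumption on my part, not something the help text confirms, and the function name sampled_suffix_positions is hypothetical rather than taken from the MUMmer source.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper, NOT MUMmer's code: keep only the suffix start
// positions 0, k, 2k, ... of `text`, then sort those positions by the
// lexicographic order of the suffixes that begin there.
// With k == 1 this is an ordinary (full) suffix array.
std::vector<std::size_t> sampled_suffix_positions(const std::string& text,
                                                  std::size_t k) {
    std::vector<std::size_t> positions;
    if (k == 0) {
        return positions;  // a step of 0 is meaningless; return nothing
    }
    for (std::size_t i = 0; i < text.size(); i += k) {
        positions.push_back(i);
    }
    const std::string_view view(text);
    std::sort(positions.begin(), positions.end(),
              [view](std::size_t a, std::size_t b) {
                  return view.substr(a) < view.substr(b);
              });
    return positions;
}
```

Under this assumption, the index for "banana" holds 6 positions with k = 1 but only 2 with k = 3, so a larger k would mean a smaller index. Whether the matching step compensates for the skipped positions, and at what cost in run time or output quality, is exactly what I cannot determine from the documentation.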

Any improvements to the documentation would be greatly appreciated and would make me feel more confident recommending Mummer to colleagues.