mclow/Boost.Algorithm

Document the complexity/storage requirements of the search algorithms

Opened this issue · 0 comments

However in the documentation I have not found anything (no benchmarks, measurements, etc) that would help the user of the library to understand how fast each algorithm is and in which situations it is advisable to use it. (I believe Phil Endecott has mentioned this issue as well) This is especially important because of the following reasons:

  • Boyer-Moore and Boyer-Moore-Horspool have identical average time complexity, hence it is not easy for the user to choose among them based on their complexities.
  • Although KMP has much better worst-case complexity than a typical std::search implementation, their average complexities are practically the same.
  • The straightforward (brute-force) sequence searching is known to frequently outperform some KMP implementations.

For all these reasons I believe that some performance analysis and performance guidelines should be added to the documentation, before this library becomes a part of Boost. For example it would be very nice to have a graphical performance comparison of these three algorithms together with the std::search, in different pattern and alphabet sizes.