PacktPublishing/Cpp17-STL-Cookbook

Segfault on Chapter2 -> Recipe12

jvillasante opened this issue · 2 comments

In the recipe "Implement a writing style helper tool for finding very long sentences in text with std::multimap", the implementation segfaults if the input ends in anything other than ".". On most systems the input will end in "\n" or "\r\n".

The issue can be fixed if the update to "it2" checks for failure before continuing the loop:

    it1 = next(it2, 1);
    if ((it2 = find(it1, end_it, '.')) == end_it) {
      break;
    }

If we don't "break" out of the loop when updating "it2" then the program segfaults.

tfc commented

Hi Julio,

Thanks for finding this, I did indeed only test it with "nice" input!

If the condition it2 == end_it holds, then the it1 = next(it2, 1); step lets it1 point past the end, indeed.

I would fix this by putting this check before the increment of both iterators. The patch that you suggest would let the program miss the last sentence.
Look, for example, at input like "this is a sentence. this is another sentence without a dot at its end". Here, we would break out of the loop too early: it1 points to the space after the only '.' character, and it2 already equals end_it because no further '.' was found. We need to iterate once more, but must not increment again after that.
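To make the missed sentence concrete, here is a minimal sketch of the loop with the break placed directly after the find. The loop skeleton, `split_early_break`, and `trim_ws` (a stand-in for the recipe's filter_ws) are assumptions for illustration, not the book's exact code; sentences are collected into a vector instead of a multimap to keep it short.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the recipe's filter_ws: trims surrounding whitespace.
std::string trim_ws(const std::string& s) {
    const char* ws = " \r\n\t";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return {};
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Assumed loop shape, with the break placed right after the find.
// Input that contains no '.' at all is still unsafe here, so it is not exercised.
std::vector<std::string> split_early_break(const std::string& content) {
    std::vector<std::string> sentences;
    const auto end_it = content.end();
    auto it1 = content.begin();
    auto it2 = std::find(it1, end_it, '.');

    while (it1 != end_it && std::distance(it1, it2) > 0) {
        std::string s = trim_ws(std::string(it1, it2));
        if (!s.empty()) sentences.push_back(std::move(s));

        it1 = std::next(it2, 1);
        if ((it2 = std::find(it1, end_it, '.')) == end_it) {
            break;  // leaves the text in [it1, end_it) unprocessed
        }
    }
    return sentences;
}
```

With the example input, this variant returns only the first sentence; the trailing text without a dot is never looked at.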

So my fix looks like:

        if (it2 == end_it) {
            break;
        }

        it1 = next(it2, 1);
        it2 = find(it1, end_it, '.');
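For context, this is roughly how the check could sit inside the whole loop. It is only a sketch: the surrounding loop, `split_sentences`, and `trim_ws` (standing in for filter_ws) are assumptions, and a vector replaces the multimap for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the recipe's filter_ws: trims surrounding whitespace.
std::string trim_ws(const std::string& s) {
    const char* ws = " \r\n\t";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return {};
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Assumed loop shape with the end check placed before both iterator updates.
std::vector<std::string> split_sentences(const std::string& content) {
    std::vector<std::string> sentences;
    const auto end_it = content.end();
    auto it1 = content.begin();
    auto it2 = std::find(it1, end_it, '.');

    while (it1 != end_it && std::distance(it1, it2) > 0) {
        std::string s = trim_ws(std::string(it1, it2));
        if (!s.empty()) sentences.push_back(std::move(s));

        if (it2 == end_it) {
            break;  // last chunk handled; never step past the end
        }

        it1 = std::next(it2, 1);
        it2 = std::find(it1, end_it, '.');
    }
    return sentences;
}
```

In this sketch, input ending in "\n" terminates cleanly, and a final sentence without a dot is still collected.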

Do you agree?

Hello Jacek

Yes, I totally agree. You're right that my patch would miss the last sentence. Your solution is right!

Another thing I noticed was that the implementation of filter_ws would truncate the last character of every sentence but the last. I would suggest something like this:

    string filter_ws(const string& s) {
      const char* ws{" \r\n\t"};
      auto first{s.find_first_not_of(ws)};
      auto last{s.find_last_not_of(ws)};

      if (first == string::npos) {
        return {};
      }
      return s.substr(first, (last - first + 1));
    }

Of course, this will only matter if we have a sentence of only 1 character!!!
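A quick way to check the suggestion is to run it on a few short inputs. The block below repeats the function with `std::` qualification so that it compiles on its own; the sample strings are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Trims leading and trailing whitespace; returns an empty string
// when the input contains nothing but whitespace.
std::string filter_ws(const std::string& s) {
    const char* ws{" \r\n\t"};
    const auto first{s.find_first_not_of(ws)};
    if (first == std::string::npos) {
        return {};
    }
    const auto last{s.find_last_not_of(ws)};
    // substr takes (position, count), hence last - first + 1.
    return s.substr(first, last - first + 1);
}
```

The second argument of substr is a character count rather than an end index, which is why the length is computed as last - first + 1.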

Great book BTW. Thanks for sharing with the community!!!