Segfault on Chapter2 -> Recipe12
jvillasante opened this issue · 2 comments
In the recipe "Implement a writing style helper tool for finding very long sentences in text with std::multimap", the implementation segfaults if the input ends in something other than ".". On most systems the input will end in "\n" or "\r\n".
The issue can be fixed if the update to "it2" checks for failure before continuing the loop:
it1 = next(it2, 1);
if ((it2 = find(it1, end_it, '.')) == end_it) {
    break;
}
If we don't "break" out of the loop when updating "it2" then the program segfaults.
Hi Julio,
Thanks for finding this, I did indeed only test it with "nice" input!
If the condition it2 == end_it holds, then the it1 = next(it2, 1); step lets it1 point past the end, indeed.
I would fix this by putting this check before the increment of both iterators. The patch that you suggest would let the program miss the last sentence.
Look, for example, at input like "this is a sentence. this is another sentence without a dot at its end". Here, we would break out of the loop too early, because it1 points to the space after the only '.' character, and it2 points to the end of the string (it2 == end_it). We need to iterate once more, but must not increment again after that.
So my fix looks like:
if (it2 == end_it) {
    break;
}
it1 = next(it2, 1);
it2 = find(it1, end_it, '.');
Do you agree?
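For anyone who wants to try the fix in isolation, here is a minimal sketch of the loop with the check placed before the increment. The function name split_sentences, the loop condition, and returning a std::vector instead of filling the recipe's std::multimap are my assumptions for illustration, not the book's code. Sentences are left untrimmed on purpose.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: splits `content` at '.' characters.
// A trailing piece without a closing '.' is still returned.
std::vector<std::string> split_sentences(const std::string& content) {
    std::vector<std::string> sentences;
    const auto end_it = content.end();
    auto it1 = content.begin();
    auto it2 = std::find(it1, end_it, '.');

    // Assumed loop condition: stop on an empty range or exhausted input.
    while (it1 != end_it && std::distance(it1, it2) > 0) {
        sentences.emplace_back(it1, it2);

        // The fix: check BEFORE incrementing, so it1 never moves past end_it.
        if (it2 == end_it) {
            break;
        }
        it1 = std::next(it2, 1);
        it2 = std::find(it1, end_it, '.');
    }
    return sentences;
}
```

With the example input from above, this yields two entries: "this is a sentence" and " this is another sentence without a dot at its end" (leading space included, since no whitespace filtering is applied here).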
Hello Jacek
Yes, I totally agree. You're right that my patch would miss the last sentence. Your solution is right!
Another thing I noticed was that the implementation of filter_ws would truncate the last character of every sentence but the last. I would suggest something like this:
string filter_ws(const string& s) {
    const char* ws{" \r\n\t"};
    auto first{s.find_first_not_of(ws)};
    auto last{s.find_last_not_of(ws)};
    if (first == string::npos) {
        return {};
    }
    return s.substr(first, (last - first + 1));
}
Of course, this will only matter if we have a sentence of only 1 character!!!
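To make the edge cases easy to check, here is a self-contained sketch of the same idea with the needed include and std:: qualifiers. The only change from the suggestion above is that last is computed after the early return, which should not affect the result.

```cpp
#include <string>

// Sketch of the suggested filter_ws: strips leading and trailing
// whitespace and keeps every non-whitespace character in between.
std::string filter_ws(const std::string& s) {
    const char* ws{" \r\n\t"};
    const auto first{s.find_first_not_of(ws)};
    if (first == std::string::npos) {
        return {};  // empty or whitespace-only input
    }
    const auto last{s.find_last_not_of(ws)};
    // +1 because `last` is the index of the final kept character.
    return s.substr(first, last - first + 1);
}
```

For example, filter_ws(" a ") returns "a", filter_ws("hi there\r\n") returns "hi there", and filter_ws(" \n") returns an empty string.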
Great book BTW. Thanks for sharing with the community!!!