Add adjacent_remove_if (like) adaptor
tcbrindle opened this issue · 5 comments
From @codereport on Twitter "X":
I was looking at that solution the other day and have been meaning to open an issue requesting the equivalent of range-v3's adjacent_remove_if
So here it is!
Range-V3 has both adjacent_remove_if
and adjacent_filter
, which seem like one should just call the other with an inverted predicate, but which actually do slightly different things:
views::adjacent_filter
: For each pair of adjacent elements in a source range, evaluate the specified binary predicate. If the predicate evaluates to false, the second element of the pair is removed from the result range; otherwise, it is included. The first element in the source range is always included. (For instance, adjacent_filter with std::not_equal_to filters out all the non-unique elements.)
views::adjacent_remove_if
: or each pair of adjacent elements in a source range, evaluate the specified binary predicate. If the predicate evaluates to true, the first element of the pair is removed from the result range; otherwise, it is included. The last element in the source range is always included.
There's also views::unique
, which actually does call adjacent_filter
with an inverted predicate.
The end result is that unique
retains the first of a run of elements that satisfy the predicate, whereas adjacent_remove_if
retains the last such element (example).
So the question is, which behaviour do we want in Flux? Is it worth trying to provide both, like Range-V3 does, or would that just be confusing? (Given that it's just taken me an hour to work out why Rv3 provides both unique
and adjacent_remove_if
when one appears redundant, I'm voting for "confusing".)
After we've sorted that out, there's also the question of naming:
adjacent_remove_if
is very long, and we don't use the termremove_if
anywhere else in Flux, but it does at least describes what it doesadjacent_filter
is a little bit shorter, but retaining elements that match the predicate is probably not what users would want most of the timeunique
is consistent with the STL, but (to me) has strong connotations of unique-ifying the whole sequence rather than just adjacent elements (unless the sequence was sorted to begin with)
The Rust itertools crate has a pair of functions, dedup()
and dedup_by()
, the latter of which takes a predicate and the former specialised for equality*. In C++ we could just say dedup()
with a default parameter, which I quite like -- although again, it doesn't necessarily make clear that it only removes adjacent duplicates.
*: The example for dedup_by
suggests that the Rust version has the same behaviour as Rv3's views::unique
(retain the first element), but this isn't actually made clear in the docs
@codereport Any thoughts on what behaviour and/or name(s) Flux should use?
Ping @brevzin as well if case you have any suggestions!
I actually didn't know adjacent_filter
existed. Furthermore, this slight different difference seems like an incredible gotcha. It breaks the "symmetry" of filter
and remove_if
.
In C++ we could just say dedup() with a default parameter, which I quite like
I would veer away from default parameters. My rule of thumb is except for very common function names like sum
(which I would assume is just a specialization), avoid default params and force the user to choose default operation. This is was got us into trouble with std::mismatch
and std::adjacent_difference
. The zip_find
and adjacent_transform
versions in Flux and C++23 Ranges (respectively) are much better names and designs imo.
@codereport Any thoughts on what behaviour and/or name(s) Flux should use?
If this was C++26 ranges, I would definitely recommend the name adjacent_filter
and keep the semantics of range-v3. However, Flux breaks tradition with things like using map
instead of transform
. That being said, it does have both adjacent_map
and filter
so I would probably still recommend adjacent_filter
.
In general, I think Rust choosing dedup
was a bad choice. [ at this point, I remember that I actually gave a ⚡ talk on this a couple years ago and went and found it https://www.youtube.com/watch?v=tsfaE-eDusg only to realize that it did not support what I was about to say 😂 ] .
..
So I guess dedup
is the more common name for std::unique
🤦 lol.
I still like adjacent_filter
, but if you like dedup
adding a specialization similar to sum
for fold
sounds find to me.
Thanks for the feedback Conor!
So here's a proposal:
- I'll add
flux::adjacent_filter()
taking a binary predicate, with the same semantics as Range-V3'sviews::adjacent_filter
- I'll also add
flux::dedup()
which callsadjacent_filter(std::not_equal_to{})
, as shorthand for the most common use-case
If we ever decide we want versions with similar semantics to Rv3's views::adjacent_remove_if
, we can call them something like adjacent_filter_last
/dedup_last
.
Does that sound reasonable?
Sounds great to me 👍