Suggestion to add Array#partition as well
MaxGraey opened this issue ยท 36 comments
Examples
Basics:
const a = 1, b = 2, c = 3, d = 4
const [ab, cd] = [a,b,c,d].partition((_, index) => index >= 2)
const [ac, bd] = [a,b,c,d].partition(value => value % 2 === 0)
Immutable (out of place) Quick Sort:
function quickSort([p, ...tail]) {
if (!tail.length) return [p, ...tail]
const [l, r] = tail.partition(x => x > p)
return [...quickSort(l), p, ...quickSort(r)]
}
quickSort([5, 2, 3, -1])
Efficient and correct implementation usually not trivial:
// not trivial
function partition(arr, callback) {
return arr.reduce((result, value, index, array) => {
result[callback(value, index, array) ? 1 : 0].push(value);
return result;
}, [[], []]);
}
or
// not efficient (twice linear scan)
function partition(arr, callback) {
return [
arr.filter((...args) => !callback(...args)),
arr.filter(callback)
];
}
or based this proposal:
function partition(arr, callback) {
return [
arr.filterOut(callback),
arr.filter(callback)
];
}
Related
OCaml: List.partition
Haskell: Data.List.partition
Lodash: _.partition
RxJS: RxJS/partition
Kotlin: Array.partition
For the note, underscore _.partition
and lodash _.partition
are also exist (16,309 downloads weekly for lodash version)
Thanks added to Related
Hi @MaxGraey, I think partition needs to be its own proposal. It's certainly related to select
/reject
, but it's different enough that we would have to argue over semantics delaying the whole process.
Reopening to discuss. @MaxGraey, do you know of any libraries/projects/etc that are using partition?
/cc @michaelficarra
- Underscore: https://underscorejs.org/#partition
- Lodash: https://lodash.com/docs/latest#partition
- Ramda: https://ramdajs.com/docs/#partition
edit: Sorry, I didn't see that these were mentioned in earlier comments.
@michaelficarra yup. Are you aware of projects using them, though?
@jridgewell we use partition at my work rather extensively, typically around usage of https://github.com/graphql/dataloader, which can often provide arrays of the type (T | Error)[]
. Being able to partition errors separately from non-errors is a typical pattern for us.
The source is not available for this usage, but here are some real-life examples that I've copied:
const [rejectedReviews, approvedReviews] = partition(allReviews, (review) => review.rejectionReasonCode != null);
// Type-safety of this function ensures well-typed `errors` and `items` below
function isError(candidate: unknown): candidate is Error {
return candidate instanceof Error;
}
const itemsAndErrors = await dataLoader.loadMany(ids);
const [errors, items] = partition(itemsAndErrors, isError);
btw lodash.partition has33k downloads/week
I guess query exactly_.partition
in db isn't fully represent all cases like:
const partition = require('lodash.partition');
const { partition } = require('lodash');
and etc
As a radical proposal, what about implementing Array#partition
instead of Array#filterOut
? Users who only desire the falsy partition can destructure the output, and only consume the second value.
const [, cd] = [a,b,c,d].partition((_, index) => index >= 2)
// is equivalent to...
const cd = [a,b,c,d].filterOut((_, index) => index >= 2)
A sufficiently-clever implementation should ideally be able to recognize this situation, and internally fall back to a more-efficient filter
or filterOut
algorithm to inhibit any major performance penalties over a standalone filterOut
method.
@schmod Good point!
Btw I propose reversed order result as [filterOutResult, filterResult] which mean it will even more simple:
const [ab] = [a,b,c,d].partition((_, index) => index >= 2);
// which equivalent to
const ab = [a,b,c,d].filterOut((_, index) => index >= 2);
As a radical proposal, what about implementing Array#partition instead of Array#filterOut? Users who only desire the falsy partition can destructure the output, and only consume the second value.
I find partition
to be particularly clunky. Having to grab the either the first or second array, depending on which you care about, seems much less ergonomic than just doing filter
or filterOut
to get the result.
Then, is it [falsey, truthy]
, or [truthy, falsey]
, because it's not self explanatory. Lodash has a similar (but more generic) form with _.groupBy
, which would return a object with keys returned by the cb
and arrays as values. That seems more useful than partition
.
Then there's also the query data. Use of partition
is relatively small at ~2%, while reject
10-25%, and filter
at 70-80%.
While partition
is still an option for the proposal, I don't think it's the best one.
I think groupBy
is a great alternative. A partition
implementation that returns a { true, false }
record is simply a special case that calls ToBoolean
on the result. I think either one would satisfactorily solve the problem this proposal is trying to address.
That's true. partition
is a specialized case of groupBy
. But filterOut
also specialized case of filter
. Let's compare.
This:
const lessThan2 = arr.filter(val => !(val >= 2));
// or const [lessThan2] = arr.groupBy(val => Number(val >= 2));
much less expressive than
const lessThan2 = arr.filterOut(val => val >= 2);
same as:
const [lessThan2, greaterOrEqThan2] = arr.groupBy(val => Number(val >= 2));
less expressive than
const [lessThan2, greaterOrEqThan2] = arr.partition(val => val >= 2);
You have to sort of intricately "know" what order the tuple returns the values in, it might be better to return a struct:
{
in,
out
}
ex:
const {
in: greaterThanOrEq2,
out: lessThan2
} = arr.partition(val => val >= 2);
Although, thinking about it, in
and out
could be ambiguous...
Thinking about it,
in
andout
could be ambiguous...
Oh, then how about included
and excluded
?
Thinking about it, in and out could be ambiguous...
it also contradicts the generally accepted conventions already in JS:
Array#entries
, Map#entries
, Set#entries
. All this return array like tuples.
Main advantages of array tuple you could always skip unnecessary part:
const [, cd] = [a,b,c,d].partition((_, index) => index >= 2)
const [ab] = [a,b,c,d].partition((_, index) => index < 2)
but also you could this:
const { 1: cd } = [a,b,c,d].partition((_, index) => index >= 2)
const { 0: ab } = [a,b,c,d].partition((_, index) => index < 2)
with objects:
const { out: cd } = [a,b,c,d].partition((_, index) => index >= 2)
const { in: ab } = [a,b,c,d].partition((_, index) => index < 2)
Main advantages of array tuple you could always skip unnecessary part:
You can do the same with the struct, just don't use the other property from the struct.
const {
in: greaterOrEqThan2,
out: lessThan2
} = arr.partition(val => val >= 2);
vs
const { out: lessThan2 } = arr.partition(val => val >= 2);
Array#entries, Map#entries, Set#entries. All this return array like tuples.
They return arrays of key-value pairs, in contrast, here would you want to say that there is an association between what was filtered in and out? I would say "no," because we are intentionally trying to separate them into two totally separate categories, therefore you wouldn't want to pair them.
But actually, returning a tuple seems more like the opposite of Array#flat
, and it shows that the array has been partitioned into two separate sections, while still being an array. So I guess it can be seen either way.
Regardless, I personally would support Array#partition
over Array#filterOut
, it is more reusable and offers a utility that neither Array#filterOut
, nor Array#filter
provide. I have made a function that explicitly looped over arrays a few times in order to obtain the behavior of the Array#partiton
before.
How about getting a TypedArray#partition
too?
I think partition could be generalize more and include slide, stride and chunk features. See this talk:
https://www.youtube.com/watch?v=-_lqZJK2vjI
Well, I finally needed a partition today: ampproject/amphtml@dc3db0e#diff-d06fcad39d9e58f050ba123f5f506cbc1a8865fde8a6016c4dc32bbfc71d06c9R256-R264
I've several times done stuff like this, so it may be worth doublechecking for that kind of pattern:
const foos = items.filter(item => item.type === "foo")
const bars = items.filter(item => item.type !== "foo")
And of course there's several ways this could be expressed, each of which I've seen personally in the wild:
filter(f)
+reject(f)
filter(f)
+filter(i => !f(i))
filter(x => x.type === "a")
+filter(x => x.type === "b")
where the two are mutually exclusive
@isiahmeadows I'm not sure what you're trying to point out, can you clarify?
Your code example is equivalent to the following:
const partitioner = item => item.type === "foo";
const foos = items.filter(partitioner);
const bars = items.filter(item => !partitioner(item));
Which is the current problem that we would like to fix.
filter(f) + reject(f)
I don't get this right away, can you give an example?
where the two are mutually exclusive
As long as they are mutually exclusive, the original proposal should work just fine.
As for your example, would you want a method that takes two comparer functions?
const [
foos,
bars
] = items.partition(
item => item.type === "foo",
item => item.type === "bar"
);
I feel like that could easily be extended to accept N functions, and return an array of N, storing all values that the function returned a truthy value for, but really seems like it should be a user-implemented library solution.
I don't get this right away, can you give an example?
@isiahmeadows is demonstrating that they needed both the "filtered out" and the "filtered in" at the same time, which partition
solves. It's a +1 that partition
is useful.
I feel like that could easily be extended to accept N functions
This seems a bit more complicated than a generic groupBy
. The ordering of the functions would affect the output, where groupBy
could easily handle the cases.
@jridgewell hey, maybe already makes sense to add it to the proposal? Seems it's really required and the community wanna it.
Seems it's really required and the community wanna it.
Absolutely not. It overcomplicates the algorithm, the understanding of purpose, and overall documentation of this hypothetical method.
partition
, like the other Array.prototype methods, should be as simple as possible, but no simpler.
Please, let's not overcomplicate things. It will be difficult to know when to teach, when to use, and whether it's efficient enough to bother with. If you want to worry about those things, then use an NPM package. I don't want to see that canonicalized within ECMAScript.
@ckknight for the previous 5 months .partition
logic was required for me at least 5-10 times. NPM packages for everything that should be in the standard library is bullshit - I don't wanna depend on a separate non-standard script for a simple action (hi, left-pad
hell).
@ckknight I remember how this approach is "useful" on "leftpad" example
I am in favor of having a simple, useful partition
method, on the official Array.prototype
within the ECMAScript standard. I am opposed to overcomplicating the algorithm and signature for it.
@ckknight The algorithm would require about 3 more lines of ES spec text, compared to filter/filterOut, one line to call an abstract operation to get the array species constructor, an extra variable declaration using that constructor, the line with "let k be k + 1" can be removed, as it is now always pushing to an array, and the creation of the structure to return.
Regardless, the current Array#filter discards useful data that developers intend to use. Array#filter feels broken, there's no question about it.
I also needed a groupBy
recently: https://github.com/babel/babel/pull/13021/files#diff-3c2e3e5a9cc9bdb04de11117cfaedee2a2da03910480e9186c252a62fff7ce27R31-R42
hey, maybe already makes sense to add it to the proposal?
Yah, I can bring up both at the next meeting.
I also prefer groupBy
to partition
, even in cases where I just want to split into two buckets, for the reason @jridgewell gives above: groupBy
means not having to remember which entry is which. Given that the genesis of this whole proposal is that people get the behavior for filter
backwards, it seems better to go with the API without this issue, especially given that it's strictly more powerful.
The spec text currently includes groupBy
even though the README doesn't mention it, is it in or out?
groupBy
was welcomed by the committee, and is being moved to https://github.com/tc39/proposal-array-grouping.