rust-lang/rust

Tracking Issue for `array_zip`

Closed this issue ยท 35 comments

Feature gate: #![feature(array_zip)]

This is a tracking issue for the zip method on arrays.

Just like Iterator::zip combines two iterators into one, [T; N]::zip() returns a new array where every element is a tuple where the first element comes from the first array, and the second element comes from the second array. In other words, it zips two arrays together, into a single one.

let x = [1, 2, 3];
let y = [4, 5, 6];
let z = x.zip(y);
assert_eq!(z, [(1, 4), (2, 5), (3, 6)]);

Public API

pub fn zip<U>(self, rhs: [U; N]) -> [(T, U); N]

Steps / History

  • Implementation: #79451
  • Stabilization PR

Unresolved Questions

@cramertj comment

[...] This seems like a pretty niche feature to me personally, but I can understand that there are situations when you might want it. I'm not sure if the libs team has a standard for "fill in the missing methods" PRs like this.

@yoshuawuyts comment

overlap with potential std::iter::Zip trait [...]

Could you show two use cases?

Could you show two use cases?

What made me open the PR was for doing element wise operations on vector-like objects without the need for any heap allocations or unnecessary zeroing.

struct Vector([T; N]);

impl Add<Vector> for Vector {
    fn add(self, rhs: Vector) -> Vector {
        self.0.zip(rhs.0).map(|(lhs, rhs)| lhs + rhs)
    }
}

As for a second use case, well I will have to think about that one...

Maybe bit far-fetched but here goes:

const RES: usize = WIDTH * HEIGHT;
fn compute_pixels_for_screen_on_embedded_thing(
    background: [ImgColor; RES],
    object: [ImgColor; RES]
) -> [ScreenColor; RES] {
    background.zip(object).map(magic_color_mix)
}

(which I suppose is pretty much the same use case but with different names)

Wouldn't this become obsolete after #65798?

CryZe commented

No because iterators don't encode the length in the type, so if you intend to go back to an array you'd have to unwrap.

No because iterators don't encode the length in the type,

Can't we give arrays a different iterator that encodes the length too? :-)

No because iterators don't encode the length in the type,

Can't we give arrays a different iterator that encodes the length too? :-)

Wouldn't that be problematic in combination with methods like filter or skip etc?

Yeah. We can't have a "filter" there. So only some iterators are possible. But "skip" is doable as long as the number of items to skip is given as a const generics value (and in my code the argument of "skip" is often a value known at compile-time).

No because iterators don't encode the length in the type,

Can't we give arrays a different iterator that encodes the length too? :-)

Would that be/work with #80470 or something completely different like new trait? Just asking since it seems like array IntoIter might be getting stabilized

@leonardo-m

Yeah. We can't have a "filter" there. So only some iterators are possible. But "skip" is doable as long as the number of items to skip is given as a const generics value (and in my code the argument of "skip" is often a value known at compile-time).

I have started doing some experimenting here. Atleast to me this feels like a much more reusable and flexible solution than adding all of those of methods directly to [T; N].

As somebody who has stumbled upon this unstable method while implementing some const generic vector operations, I would like to point out something regarding the presented use case. To actually efficiently lift traits like Add as component-wise operations, one would rather need a zip_with operation combining zip and map, maybe with some other name more consistent with rust conventions. Using the unstable array_map from #75243 on array_zip would potentially write the array twice, at least given the current implementation.

@MDoerner any thoughts on something like iter_fixed?

Edit:

There is an example here for vector addition

let x = [1, 2, 3];
let y = [4, 5, 6];

let z: [_; 3] = x
   .into_iter_fixed()
   .zip(y)
   .map(|(a, b)| a + b)
   .collect();

This method has ended up being incredibly useful in my fixed-array audio DSP library, mainly in frame addition/multiplication. It allows me to do an element-wise binary operation without needing to use iterators (non-const? pah!) or having to create an incrementally-updated output array. The latter is especially nice in cases where the output type is expensive to instantiate: having to create N instances for a temporary output buffer that are immediately going to be overwritten is wasteful.

Something I'd love to suggest would be a corresponding unzip method, the inverse operation of zip. I imagine its signature looking like something along the lines of:

impl<T, U, const N: usize> [(T, U); N] {
    fn unzip(self) -> ([T; N], [U; N]) {
        // impl
    }
}

I'd be happy to work on an impl of unzip if others think it would be a useful method. I'd personally find quite a few good spots to use this in that DSP library.

@usbalbin can we get rid of into_iter_fixed and make into_iter return a fixed iterator (which is an iterator as well)?

The problem is that regular Iterators shrink every time you call next() on them (unless they are infinite). So as far as I can tell, that makes it problematic (impossible?) to make a useful type that is both Iterator and IteratorFixed(I would be happy to be proven wrong :) ).

Note however that IteratorFixed does implement IntoIterator.

hmm, yes it looks impossible. I was dreaming of an iterator with a Option<usize>, which next and similar methods are only for None (unknown sizes) but map and zip and ... can handle sized ones as well as unknowns. But it still needs two methods, one with None size for using next, and one with known size (or it can be generic but it will need explicit type annonations which is worse)

The problem is that regular Iterators shrink every time you call next() on them (unless they are infinite). So as far as I can tell, that makes it problematic (impossible?) to make a useful type that is both Iterator and IteratorFixed(I would be happy to be proven wrong :) ).

Note however that IteratorFixed does implement IntoIterator.

I think such a fixed iterator would have to implement a separate iterator trait, but that trait could have a method that converts it into a normal iterator. For instance, the fixed iterator could have methods that operate on it that transform it from one fixed length to another, and it could lazily evaluate to an array (no next() at all). However, the fixed iterator could have a method that converts it into a regular iterator. For instance:

// Note that an array would implement the fixed iterator trait (output equals the array itself).
let arr = [1, 2, 3];
// You could use the map method to change the array values.
let mapped = arr.map(|v| v*2);
// Then you could either turn the mapped fixed iterator into an array
let double_arr: [i32; 3] = mapped.array();
// or you could turn it into an iterator (note that the iterator values are of type `i32`, not `&i32`).
let sum: i32 = mapped.iter().sum();
// Note that it was used a second time here because the mapped array implements `Copy`.
// Then you could either turn the mapped fixed iterator into an array
let double_arr: [i32; 3] = mapped.array();

Now that method needs to be named something like "lazy_map()"

@leonardo-m Yes, I suppose it is too late to create a lazy array trait with a map() method at this point.

In #103555 .zip().map() has been reported to be slow. #80094 (comment) mentions that kind of pattern as motivation.

Does anyone actually need the [(T, U); N] output? Or would a [T; N]::map_pairwise(other: [U; N]) -> [V; N] (or the iter_fixed crate) cover most needs?

Potential concern: overlap with potential std::iter::Zip trait

I'm currently working on the futures-concurrency library as part of the Async WG. We're currently adding array::zip API which conflicts with the proposed API here. The API we're working on would work as a counterpart to our Merge trait, but providing a different output ordering:

//! Intended use if `Zip` were part of the stdlib.

use std::iter::{self, Zip};

let s1 = iter::repeat(0).take(5);
let s2 = iter::repeat(1).take(5);
let s3 = iter::repeat(2).take(5);

// zips N iterators together.
for [n1, n2, n3] in [s1, s2, s3].zip() {
    assert_eq!([n1, n2, n3], [0, 1, 2]);
}

// it works on tuples too!
for (n1, n2, n3) in (s1, s2, s3).zip() {
    assert_eq!((n1, n2, n3), (0, 1, 2));
}

In particular because I believe the plan is still to e.g. enable collect to work for arrays - which means what's provided by this API may be achievable through other means as well. In comparison: being able to zip arrays, tuples, and vecs through a shared API may be a lot harder, if not impossible to achieve through other means.

Conclusion

I'm not trying to say that just because we're experimenting with something, this API should be abandoned. But I'm raising this so that when a discussing pops up about potentially stabilizing this API, the fact that it would shut out (parts of 1) the other design should be weighed into consideration.

Footnotes

  1. It would certainly be possible to implement Zip just on tuples and vecs, but tbh that would be asymmetrical with how we're looking at implementing Merge and potentially Chain as well. array::zip as defined in this unstable tracking issue is the only API which would present a conflict, hence I'm raising this issue here. โ†ฉ

(s1, s2, s3).zip() does, in my opinion, look a lot nicer than s1.zip(s2).zip(s3) :)

Oh wait (s1, s2, s3).zip() is for iterators which is not really the same thing as this issue... Sorry for the confusion.

Guess the question/concern still stands...

Rua commented

Does anyone actually need the [(T, U); N] output? Or would a [T; N]::map_pairwise(other: [U; N]) -> [V; N] (or the iter_fixed crate) cover most needs?

The Nalgebra crate has a zip_map method for its vector types. That would be more useful than the currently proposed zip, because you can use zip_map to map to an array of tuples anyway. It may also be worth investigating whether it's possible to extend this to any number of arrays in a single "zip", for example the ability zip 3 arrays, 4 arrays, or any number of arrays const M: usize. Nalgebra has zip_map for two arrays and zip_zip_map for three, but doesn't go any further than that and has no generic solution.

There should also be a try_zip_with function that can be used with functions such as i16::checked_add.

The usefulness of tools like zip was beautifully expressed in this issue about [T; N]::map:

"It regularly happens to me that I need to create an array not because that's the end product of my computation, but because that's a useful intermediate object when going from dynamic-sized entities like iterators, slices, or arbitrary generator functions, to static-sized entities like SIMD types or loops that must be unrolled."

So I agree with the other remarks suggesting that we don't really want this function, especially if it optimizes poorly. It's natural for iterators which have much more dynamic qualities, but we are building up a kind of "static iteration" API here. That has different needs, even if it shares some idioms (e.g. map). While I obviously have a bias towards explicit SIMD due to my project work, I still would hope that whatever we picked very naturally autovectorized in some cases, since that's what people expect. And the problem with this function is that it actually changes the shape, eagerly, whereas iterators, being lazy, instead change what will be yielded.

As the original author of this sentence, I fully agree that many of the situations where I use arrays are really workarounds for lack of iterators whose size is known at compile time, which is what I would really want.

But as someone else pointed out, static-sized iterators as we know Iterator cannot be made to work, since Iterator::next(&mut self) changes the length and thus invalidates the trait's contract, without changing the type since it only takes &mut self.

I have spent some time in the past thinking about how to make a dedicated StaticIterator<N> trait for this use case, assuming infinite type system features, and reached the conclusion that making it work with a next() method as we know and love it would require massive code bloat and very poor API ergonomics. You would want StaticIterator<N>::next() to consume self and return impl StaticIterator<N.saturating_sub(1)> alongside the output item, and then would need to use an extra-weird kind of specialization that changes a function's output type in order to handle StaticIterator<0> specially (other StaticIterator<N> impls must return an item on next(), whereas this one must not do so).

So most likely we would not want to have a next() API, but instead use a different iteration design, where the core iteration primitive is not "consume one value", but "consume all remaining values at once". Perhaps something like for_each() with an unsafe contract (in the unsafe trait sense) that the underlying callable will be called exactly N times could be a better basic building block? Or perhaps people would be happier with something that switches back to the dynamic-sized world at the specific time where iteration starts, along the lines of into_iter(self) -> impl Iterator + TrustedLen? Though in my experience it can be harder to make LLVM optimize the latter well.

Whatever it is, in order to be considered successful, I think that core primitive should not require collecting into an array on every iterator adapter (which LLVM has often trouble optimizing out) nor require instantiating N different types during use. Though of course, collect() into an array should be an option for end-user convenience.

Perhaps something like for_each() with an unsafe contract (in the unsafe trait sense) that the underlying callable will be called exactly N times could be a better basic building block?

The problem with that is that things can panic in the middle and then you need a partial drop. We have the same issue with TrustedRandomAccess on owning iterators. This is solvable by adding a post-iteration cleanup method that tells the adapter chain how many elements have actually been consumed but it means that step-by-step execution is indeed unsafe.

If we relax that to "at most N" we can use something like UncheckedIterator instead and the source would have to keep track of how many items have been consumed. This is also unsafe but less onerous on the caller because they don't need a panic-safe drop guard. The downside is that it doesn't optimize well for more complex things like Zip where now multiple sources would have to keep track of how many items have been consumed.

The 3rd alternative is to instead implement something like Java's Streams which are implemented by building up something like a stream-evaluator (called StreamPipeline internally) instead of adapters. The evaluator is then in charge of executing final operation. Since it's all owned by the same module it can have internal methods that can interrogate the built-up structure to see what the optimal execution strategy is and choose between different ones depending on its overall shape. It can also do optimizing transformations before evaluation (e.g. eliminating a sort step if a later step isn't order-preserving).

The downside of that approach is that it makes it more difficult for 3rd parties to extend the interface.

We do something similar for std-owned Iterator types by using specialization but since that has to happen cooperatively between many separate structs it's more complicated than having a central evaluator.

Anyway, that's a bit of a tangent...

So, I'll repeat my previous question:

Is there anyone who's in favor of keeping zip as it is today or can we rip it out and then consider replacements separately?

@the8472 see my earlier comment, do you have any thoughts on that?

Edit: Just saw you mentioned it earlier.

Nominating this for removal since [(T, U); N] optimizes poorly and seems to be rarely what's really wanted.

@rfcbot close

Team member @m-ou-se has proposed to close this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

FWIW, this now at least codegens decently well for basic cases. For example, x.zip(y).map(|(x, y)| x - y) gives a simd subtraction, with no extra ceremony https://rust.godbolt.org/z/63fEGnKd6:

pub fn short_integer_zip_map(x: [u32; 8], y: [u32; 8]) -> [u32; 8] {
// CHECK: %[[A:.+]] = load <8 x i32>
// CHECK: %[[B:.+]] = load <8 x i32>
// CHECK: sub <8 x i32> %[[A]], %[[B]]
// CHECK: store <8 x i32>
x.zip(y).map(|(x, y)| x - y)
}

But I do also think that a zip_with is probably what I'd always actually want here, because eagerly getting tuples isn't particularly useful.

rfcbot commented

๐Ÿ”” This is now entering its final comment period, as per the review above. ๐Ÿ””

rfcbot commented

The final comment period, with a disposition to close, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.