c-cube/ocaml-containers

Add a zip function for lists as a monoidal product

grayswandyr opened this issue · 5 comments

Currently, List.combine fails for lists of different lengths. A zip function works similarly, except that it just returns [] in the latter case. This can be leveraged to implement the "zip list" monoid product:

utop # let f xs ys zs = 
let+ x = xs 
and+ y = ys 
and+ z = zs in
x + y + z;;
val f : int list -> int list -> int list -> int list = <fun>
utop # f [1;2] [5;6;7] [3;3;3;3];;
- : int list =
[9; 9; 9; 9; 10; 10; 10; 10; 11; 11; 11; 11; 10; 10; 10; 10; 11; 11; 11; 11;
 12; 12; 12; 12]

Now with zip:

utop # let rec zip a b = match a, b with 
| [], _ | _, [] -> [] 
| x::a, y::b -> (x,y):: zip a b ;;
val zip : 'a list -> 'b list -> ('a * 'b) list = <fun>
utop # let ( and& ) = zip;;
val ( and& ) : 'a list -> 'b list -> ('a * 'b) list = <fun>
utop # let f xs ys zs = 
let+ x = xs 
and& y = ys 
and& z = zs in
x + y + z;;
val f : int list -> int list -> int list -> int list = <fun>
utop # f [1;2] [5;6;7] [3;3;3;3];;;;
- : int list = [9; 11]

The same and& can also be used with the monad (let*).

Would you welcome a PR? If so, is and& OK (with & evoking "lock step synchronization")?

I was dubious at first, but I think I'm open to the idea, if you can convince me further 🙂

  • I'd rather have it named combine_<something> (like combine_truncating or something like that) so that it's clear that the functions are related but not just aliases of one another (i.e. it's clear that zip is not just a sassier name for combine). Of course, a proper comment would also help. combine must stay the same by compatibility with the stdlib.
  • I know nothing about the mathematical interpretation of monoid_product but is this zip function mathematically sound? As in, does it compose well (I guess so) and is it going to be unsurprising to a haskeller?

I'm not a specialist of applicatives and monads, but this zip (or parallel) list is well-known in the Haskell world. The zip name I'm proposing actually comes from them. Sometimes, you don't want to use lists to represent non-determistic computations, but also "parallel" ones. This is useful for (again) comprehensions and apparently sound considering the two following links: https://db.inf.uni-tuebingen.de/staticfiles/publications/haskell2011.pdf and https://caml.inria.fr/pub/docs/manual-ocaml/bindingops.html .

Regarding the name, I personally prefer zip as it's a classic in Haskell. I find it cumbersome that Haskell and OCaml choose different names most of the time...

Regarding the name, I personally prefer zip as it's a classic in Haskell. I find it cumbersome that Haskell and OCaml choose different names most of the time...

yes but consistency within the library also matters! I wouldn't necessarily mind zip for a lazy list type, if it were a new module, but for lists in OCaml it's combine. Let's not introduce more confusing terminology.

Otherwise, I'm ok with and& (or maybe and| ? it is more "parallel"? 😅) as a feature. You may want to add the same things to Seq (or OSeq) for lazier computations.

yes but consistency within the library also matters! I wouldn't necessarily mind zip for a lazy list type, if it were a new module, but for lists in OCaml it's combine. Let's not introduce more confusing terminology.

No problem. Here are some suggestions: combine_truncating, combine_truncate, combine_shortest, combine_chop... Tell me your preference.

Yeah, I also hesitated with and| but was a bit reluctant as the pipe also reminds of disjunction. Once again: as you wish.

I'll include Seq too.

So: just waiting for your choice of names.

I think combine_shortest could be a good choice. I agree with and| being a bit ambiguous.