ocaml/v2.ocaml.org

Mistake in OCaml Manual / Feature request: Matching multiple `when` cases

m9xiuz opened this issue · 5 comments

There's probably a mistake in https://ocaml.org/manual/expr.html#pattern-matching

pattern-matching  ::= [ | ] pattern  [when expr] ->  expr  { | pattern  [when expr] ->  expr }

because e.g. this code is valid although it doesn't satisfy pattern-matching rule:

match 1 with
| 0 
| 1 -> true
| _ -> false

But this code is invalid:

match 1 with
| 0 when true
| 1 when true -> true
| _ -> false

which is very very sad, btw, because I badly need to match multiple cases that each has a guard.

The manual is not maintained in this repository, but rather in https://github.com/ocaml/ocaml/

The description is correct: a branch of a pattern matching is a sequence of a pattern and an optional when guards. In your first case 0 | 1 -> .. the pattern is 0 | 1. Indeed, patterns include or-pattern (for instance let f ([ ([]| _ :: _)] | _ :: _ | []) = () is a valid function). In other words, or-patterns are first class patterns, whereas when guards are not (because they make patterns matching less static and thus less optimisable).

@Octachron

In your first case 0 | 1 -> .. the pattern is 0 | 1.

Aha, thanks, you're right, I didn't notice that patterns might contain | https://ocaml.org/manual/patterns.html#pattern

because they make patterns matching less static and thus less optimisable

I don't understand why. For example, we can always convert this

match x with
| p1 when cond1
| p2 when cond2 -> expr1
| p3 -> expr2

into

let case_num = match x with 
| p1 when cond1 -> 1
| p2 when cond2 -> 2
| p3 -> 3
in
match case_num with
| 1 | 2 -> expr1
| 3 -> expr2

And what happens with nested when guards:

| (A (x,y,_) when f y) | (A (y,x,_) when f y)  when g x -> y

How many times should g be called?
Similarly, what it is supposed to happen when an inner when guard mutates the beginning of a pattern:

type ab = A |B
type t = { x:ab; y:ab; z:ab }
let x = ref (A, A, A)
let f x = match x with 
| { x = A; 
    y = ( A when x.x <- B; x.y <- B; false | B when x.y <- A; false  | A when x.z <- B; false | A ); 
    z = B 
   } -> ...

But even without those difficulties, exploding all or-pattern leads to a combinatoric explosion and inefficient pattern matching.
Consider:

type t = A of t | B of t | C of t | D of t | E

let f = function
  | (A _ | B _ | C _ | D _ ),
    (A (A _ | B _ | C _ | D _ )
    | B (A _ | B _ | C _ | D _ )
    | C (A _ | B _ | C _ | D _ ) |
    D (A _ | B _ | C _ | D _ ) ) -> 0
  | A _, (A (A _ | B _ | C _ | D _ ) | B _ | C _ | D _ | E) -> 1
  | _ -> 2

Exploding all the or-patterns lead to 72 cases, whereas the actual compiled lambda code for the pattern:

 (let
    (f/87 =
       (function param/89 : int
         (catch
           (let (*match*/92 =a (field 0 param/89))
             (catch
               (if (isint *match*/92) (exit 1)
                 (let (*match*/101 =a (field 1 param/89))
                   (catch
                     (switch* *match*/101
                      case int 0: (exit 2)
                      case tag 0:
                       (if (isint (field 0 *match*/101)) (exit 1) (exit 4))
                      case tag 1:
                       (if (isint (field 0 *match*/101)) (exit 2) (exit 4))
                      case tag 2:
                       (if (isint (field 0 *match*/101)) (exit 2) (exit 4))
                      case tag 3:
                       (if (isint (field 0 *match*/101)) (exit 2) (exit 4)))
                    with (4) 0)))
              with (2) (switch *match*/92 case tag 0: 1
                                          default: (exit 1))))
          with (1) 2)))

requires at most 6 nested tests. Similarly when patterns inhibits optimization because they forbids to move around rows of the pattern matrix and make it impossible to share information between submatrix (because a function call could have rendered invalid of this information).

And what happens with nested when guards:

We can just forbid nested when guards. They don't seem useful.
I'm not proposing to extend pattern with when. I'm proposing to make -> expr optional in pattern-matching for all cases except the last one https://ocaml.org/manual/expr.html#pattern-matching
So it will look like this:

pattern-matching ::= [`|`] pattern [`when` expr] `->` expr
                 |   [`|`] pattern [`when` expr] [`->` expr] { `|` pattern [`when` expr] [`->` expr] } `|` pattern [`when` expr] `->` expr

But even without those difficulties, exploding all or-pattern leads to a combinatoric explosion and inefficient pattern matching.

We don't need to explode all of them, only top-level ones, in your example there're only 3 of them.