Mistake in OCaml Manual / Feature request: Matching multiple `when` cases
m9xiuz opened this issue · 5 comments
There's probably a mistake in https://ocaml.org/manual/expr.html#pattern-matching
pattern-matching ::= [ | ] pattern [when expr] -> expr { | pattern [when expr] -> expr }
because e.g. this code is valid although it doesn't satisfy pattern-matching
rule:
match 1 with
| 0
| 1 -> true
| _ -> false
But this code is invalid:
match 1 with
| 0 when true
| 1 when true -> true
| _ -> false
which is very very sad, btw, because I badly need to match multiple cases that each has a guard.
The manual is not maintained in this repository, but rather in https://github.com/ocaml/ocaml/
The description is correct: a branch of a pattern matching is a sequence of a pattern and an optional when
guards. In your first case 0 | 1 -> ..
the pattern is 0 | 1
. Indeed, patterns include or-pattern (for instance let f ([ ([]| _ :: _)] | _ :: _ | []) = ()
is a valid function). In other words, or-patterns are first class patterns, whereas when
guards are not (because they make patterns matching less static and thus less optimisable).
In your first case 0 | 1 -> .. the pattern is 0 | 1.
Aha, thanks, you're right, I didn't notice that patterns might contain |
https://ocaml.org/manual/patterns.html#pattern
because they make patterns matching less static and thus less optimisable
I don't understand why. For example, we can always convert this
match x with
| p1 when cond1
| p2 when cond2 -> expr1
| p3 -> expr2
into
let case_num = match x with
| p1 when cond1 -> 1
| p2 when cond2 -> 2
| p3 -> 3
in
match case_num with
| 1 | 2 -> expr1
| 3 -> expr2
And what happens with nested when guards:
| (A (x,y,_) when f y) | (A (y,x,_) when f y) when g x -> y
How many times should g
be called?
Similarly, what it is supposed to happen when an inner when
guard mutates the beginning of a pattern:
type ab = A |B
type t = { x:ab; y:ab; z:ab }
let x = ref (A, A, A)
let f x = match x with
| { x = A;
y = ( A when x.x <- B; x.y <- B; false | B when x.y <- A; false | A when x.z <- B; false | A );
z = B
} -> ...
But even without those difficulties, exploding all or-pattern leads to a combinatoric explosion and inefficient pattern matching.
Consider:
type t = A of t | B of t | C of t | D of t | E
let f = function
| (A _ | B _ | C _ | D _ ),
(A (A _ | B _ | C _ | D _ )
| B (A _ | B _ | C _ | D _ )
| C (A _ | B _ | C _ | D _ ) |
D (A _ | B _ | C _ | D _ ) ) -> 0
| A _, (A (A _ | B _ | C _ | D _ ) | B _ | C _ | D _ | E) -> 1
| _ -> 2
Exploding all the or-patterns lead to 72 cases, whereas the actual compiled lambda code for the pattern:
(let
(f/87 =
(function param/89 : int
(catch
(let (*match*/92 =a (field 0 param/89))
(catch
(if (isint *match*/92) (exit 1)
(let (*match*/101 =a (field 1 param/89))
(catch
(switch* *match*/101
case int 0: (exit 2)
case tag 0:
(if (isint (field 0 *match*/101)) (exit 1) (exit 4))
case tag 1:
(if (isint (field 0 *match*/101)) (exit 2) (exit 4))
case tag 2:
(if (isint (field 0 *match*/101)) (exit 2) (exit 4))
case tag 3:
(if (isint (field 0 *match*/101)) (exit 2) (exit 4)))
with (4) 0)))
with (2) (switch *match*/92 case tag 0: 1
default: (exit 1))))
with (1) 2)))
requires at most 6 nested tests. Similarly when
patterns inhibits optimization because they forbids to move around rows of the pattern matrix and make it impossible to share information between submatrix (because a function call could have rendered invalid of this information).
And what happens with nested when guards:
We can just forbid nested when
guards. They don't seem useful.
I'm not proposing to extend pattern
with when
. I'm proposing to make -> expr
optional in pattern-matching
for all cases except the last one https://ocaml.org/manual/expr.html#pattern-matching
So it will look like this:
pattern-matching ::= [`|`] pattern [`when` expr] `->` expr
| [`|`] pattern [`when` expr] [`->` expr] { `|` pattern [`when` expr] [`->` expr] } `|` pattern [`when` expr] `->` expr
But even without those difficulties, exploding all or-pattern leads to a combinatoric explosion and inefficient pattern matching.
We don't need to explode all of them, only top-level ones, in your example there're only 3 of them.