Implement aggregates
hydrolarus opened this issue · 2 comments
It would be nice to have aggregates in crepe. I am trying to do Advent of Code this year with as much crepe as I can and I ran into some situation where having them would have been nice.
The small set of aggregates that Soufflé has seems already really nice and powerful.
I am still learning to use Datalog, so I am very green when it comes to the implementation that crepe uses, but in general I would still be up to contribute this with some guidance. 🙃
That sounds interesting. What do you think would be a good syntax for aggregates?
crepe! {
@input
struct A(i32, i32);
@output
struct B(i32, i32);
B(x, $syntax_for_max$ A(x, ?)) <- A(x, _);
}
Maybe something like the above? I think the only limitation is that it cannot be a valid expression syntax.
My initial idea was to allow them in the same place as relations as clauses but with lowercase identifiers.
A(x, y)
would be a relation(...)
would be a guard expressionlet ...
would be a bindingsum(x) { <clause>,* }
would be an aggregate
Special binding syntax
One issue with that would be though that it's hard to bind the result of that to a variable. However, since all those aggregates return numeric values (ignoring the range
aggregate from Soufflé) there is no need for more complex pattern matching, so maybe n = sum(x) { <clause>,* }
could work nicely?
Your example then could look like this:
crepe! {
@input
struct A(i32, i32);
@output
struct B(i32, i32);
B(x, n) <- n = max(x) { A(x, _) };
}
Pro:
- easy to fit into current syntax
Con:
- Not immediately clear when to use
n = ..
andlet n = ..
Out parameters
I am not super convinced about starting with an assignment because that might be confusing when to use let n =
and when to use n =
, so maybe a Prolog style "out parameter" could work too. The n = max(x) { A(x, _) }
would become max(x, n) { A(x, _) }
. That would be clear to parse but maybe not as clear to read, because max(x, n)
almost looks like the max of x
and n
will be taken.
crepe! {
@input
struct A(i32, i32);
@output
struct B(i32, i32);
B(x, n) <- max(x, n) { A(x, _) };
}
Pro:
- easy to fit into current syntax
Con:
- it might be unintuitive to have to use out parameters, especially when the name of the aggregate is the same for known mathematical functions
Special casing let
rhs expressions
Maybe to keep the syntax "clean" the right hand side expression for let
bindings could be special cased for aggregates.
If the {
}
are required then this could be used to still allow user functions called like aggregates.
// aggregate
let n = max(x) { A(x, _) },
// user defined functions `max`
let n = max(x),
Pro:
- Same binding syntax for aggregate results and regular variable/pattern binding
Con:
- more magic ✨
- harder to parse, will introduce a special case to otherwise well-known syntax