How to calculate and produce count of empty aggregations
Handling aggregations incrementally is tricky. For example, the simple query
MATCH (n)
RETURN count(n)
should return a single row (containing 0) for an empty database. So the philosophical question is: for an empty aggregation set, do we return a 0 or nothing?
The issue can be demonstrated with a PostgreSQL console:
postgres=# select count(a)
from (select 1 as a) as subq
where a = 2;
count
-------
0
(1 row)
However, if we group by a:
postgres=# select count(a)
from (select 1 as a) as subq
where a = 2
group by a;
count
-------
(0 rows)
Or, if you think grouping by a is ugly, we can group by b:
postgres=# select count(a)
from (select 1 as a, 1 as b) as subq
where a = 2
group by b;
count
-------
(0 rows)
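The same contrast can be reproduced without a PostgreSQL server. Below is a minimal sketch that assumes SQLite (through Python's built-in sqlite3 module) follows the same rule as PostgreSQL for these two queries; the run helper is just a name made up for this example.

```python
import sqlite3

def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Assumption: an in-memory SQLite database stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")

# Global aggregation: no GROUP BY, so there is exactly one (possibly empty) group.
global_rows = run(
    conn,
    "SELECT count(a) FROM (SELECT 1 AS a) AS subq WHERE a = 2",
)

# Grouped aggregation: no input rows means no groups, hence no output rows.
grouped_rows = run(
    conn,
    "SELECT count(a) FROM (SELECT 1 AS a, 1 AS b) AS subq WHERE a = 2 GROUP BY b",
)

print(global_rows)   # [(0,)]
print(grouped_rows)  # []
```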
The issue of handling nulls for OPTIONAL MATCH clauses is also related...
It's worth checking the Postgres docs:
If a query contains aggregate function calls, but no GROUP BY clause, grouping still occurs: the result is a single group row (or perhaps no rows at all, if the single row is then eliminated by HAVING). The same is true if it contains a HAVING clause, even without any aggregate function calls or GROUP BY clause.
end of 7.2.3: https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-GROUP
via @jmarton
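The rule quoted from the docs can be sketched as a tiny batch aggregator. This is only an illustration of the semantics, not code from ingraph or PostgreSQL; count_by and its parameters are hypothetical names. The key point is that an empty group_by tuple creates its single implicit group up front, while a non-empty one creates groups only from input rows.

```python
def count_by(rows, group_by, column):
    """Count non-null values of `column` per group.

    `rows` is a list of dicts. `group_by` is a tuple of column names;
    the empty tuple means a global aggregation, which always has
    exactly one group, even when there are no input rows.
    """
    counts = {}
    if not group_by:
        counts[()] = 0  # the single implicit group exists without any input
    for row in rows:
        key = tuple(row[c] for c in group_by)
        counts.setdefault(key, 0)  # a group exists as soon as one row falls into it
        if row[column] is not None:
            counts[key] += 1
    # Each output row is the group key followed by the count.
    return [key + (n,) for key, n in counts.items()]

print(count_by([], (), "a"))      # [(0,)]
print(count_by([], ("a",), "a"))  # []
```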
Related literature (thanks to @bergmanngabor): https://dl.acm.org/citation.cfm?id=137852
This causes BI Q7 to break.
A simplified version shows the issue:
MATCH (message2:Message)
OPTIONAL MATCH (message2:Message)<-[like:LIKES]-(p3:Person)
RETURN message2.id AS m, count(like) AS likes
ingraph results: List(ArrayBuffer((likes,1), (m,44)), ArrayBuffer((likes,1), (m,88)))
neo4j results: List(ArrayBuffer((likes,1), (m,44)), ArrayBuffer((likes,1), (m,88)), ArrayBuffer((likes,0), (m,99)))
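To make the expected (neo4j) result concrete, here is a small sketch that treats OPTIONAL MATCH as a left outer join followed by a count of non-null values. The message ids come from the results above; the person ids and the function names are made up for the example.

```python
messages = [44, 88, 99]
likes = [(1, 44), (2, 88)]  # (person id, message id); hypothetical data

def optional_match_likes(messages, likes):
    """Left outer join: every message survives, padded with None if unliked."""
    rows = []
    for m in messages:
        matched = [p for (p, liked) in likes if liked == m]
        if matched:
            rows.extend((m, p) for p in matched)
        else:
            rows.append((m, None))  # the null row produced by OPTIONAL MATCH
    return rows

def count_likes(rows):
    """count(like) grouped by message: nulls keep the group alive but are not counted."""
    counts = {}
    for m, like in rows:
        counts.setdefault(m, 0)
        if like is not None:
            counts[m] += 1
    return sorted(counts.items())

result = count_likes(optional_match_likes(messages, likes))
print(result)  # [(44, 1), (88, 1), (99, 0)]
```

Message 99 reaches the aggregation only as a null row, so its group exists but its count stays 0.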
The problem can be pinpointed to the following condition in AggregationNodes:
if (oldValues != newValues)
Of course, this line is there for a reason...
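One possible direction, sketched in Python rather than in ingraph's own code: track group existence separately from the aggregate value, and emit a change when a group appears or disappears even if its value did not change. This assumes (it is a guess, not something the thread confirms) that the value-only comparison sees 0 both before and after the first null row of a group. IncrementalCount and its fields are hypothetical names.

```python
class IncrementalCount:
    """Toy incremental count(value) grouped by key."""

    def __init__(self):
        self.support = {}  # group key -> number of input rows in the group
        self.counts = {}   # group key -> number of non-null values counted

    def update(self, key, value, sign=+1):
        """Apply one inserted (sign=+1) or deleted (sign=-1) input row.

        Returns a (removed, added) pair of lists of output rows.
        """
        existed = key in self.support
        old = self.counts.get(key, 0)

        self.support[key] = self.support.get(key, 0) + sign
        new = old + (sign if value is not None else 0)

        exists = self.support[key] > 0
        if exists:
            self.counts[key] = new
        else:
            # The last row of the group is gone, so the group itself disappears.
            del self.support[key]
            self.counts.pop(key, None)

        # Emit on value changes, and also when the group is created or dropped.
        removed = [(key, old)] if existed and (not exists or old != new) else []
        added = [(key, new)] if exists and (not existed or old != new) else []
        return removed, added

node = IncrementalCount()
print(node.update(99, None))  # ([], [(99, 0)])
print(node.update(44, 1))     # ([], [(44, 1)])
```

The first call is the case from the simplified query: the old and new counts are both 0, yet a row for message 99 still has to be emitted because the group is new.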