eiffel-community/eiffel-intelligence

Limit EI aggregations on event type


Description

Today, when EI receives an event that is defined as the start event of an aggregation, it queries the event repository upstream to find all events linked from that event. The query sets no limit. If a start event has a very long list of linked events in its upstream event graph, this could put a heavy load on the event repository. Depending on how the aggregation rules are written, it could also result in very large aggregated objects within EI. We need a way to restrict both the queries towards the event repository (to reduce the load there) and the size of the aggregated object (to reduce the object storage and compute resources EI needs).
I propose that an aggregation definition in EI should be able to state not only the start event, but also the event types at which to stop following links further up the event graph. Ideally, such 'stop event' types could also be propagated to the event repository when performing upstream queries there.
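
To illustrate the idea, here is a minimal sketch of an upstream traversal that honours stop event types. It is not EI or event repository code: the function name `collect_upstream`, the in-memory `events` dictionary, and the simplified `type`/`links` fields are all hypothetical. It also assumes that a stop event itself is still included in the result and only its links are not followed, and that the start event is always expanded even if its type is listed as a stop type.

```python
from collections import deque


def collect_upstream(start_id, events, stop_types=frozenset()):
    """Return the ids reachable upstream from start_id, breadth-first.

    events is a hypothetical in-memory stand-in for the event repository:
    it maps an event id to {"type": str, "links": [upstream event ids]}.
    Links of events whose type is in stop_types are not followed.
    """
    collected = [start_id]
    seen = {start_id}
    queue = deque([start_id])
    while queue:
        current_id = queue.popleft()
        event = events[current_id]
        # Assumption: a stop event is kept in the result, but the traversal
        # does not continue past it. The start event is always expanded.
        if current_id != start_id and event["type"] in stop_types:
            continue
        for linked_id in event["links"]:
            if linked_id not in seen:
                seen.add(linked_id)
                collected.append(linked_id)
                queue.append(linked_id)
    return collected


# Simplified, made-up upstream graph for an artifact event.
EVENTS = {
    "artc2": {"type": "ArtC", "links": ["cd1"]},
    "cd1": {"type": "CD", "links": ["artc1", "scs1"]},
    "artc1": {"type": "ArtC", "links": ["cd0"]},
    "cd0": {"type": "CD", "links": []},
    "scs1": {"type": "SCS", "links": ["scc1"]},
    "scc1": {"type": "SCC", "links": []},
}

unlimited = collect_upstream("artc2", EVENTS)
limited = collect_upstream("artc2", EVENTS, stop_types={"ArtC", "SCS"})
```

With no stop types the whole upstream graph is collected, which matches today's behaviour. With `ArtC` and `SCS` as stop types, the traversal ends at the first upstream `ArtC` and `SCS` events, so the events behind them are neither queried nor aggregated.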

Motivation

To decrease the load on both the event repository and EI itself (including its connected object store).

Exemplification

In the Artifacts example aggregation, for example, it could be suitable to stop the upstream query at SCC or SCS. If you are interested in aggregations further up the event graph, you could use the SourceChange example aggregation instead. As it is now, we potentially aggregate the same data multiple times in the Artifacts aggregation, since the upstream graph often contains additional ArtC events that have their own aggregations defined for them.

Benefits

Less load on the event repository and EI.

Possible Drawbacks

Unless the stop events are defined carefully, the aggregations might miss important upstream events. A user might also need to query EI for further information instead of getting it all in one subscription callback. To mitigate such scenarios, it should of course still be possible to define aggregations with unlimited upstream queries.