pedroassumpcao/incident

Question: EventStore Adapter loading all past aggregate events during command execution?


byu commented

Question:

I'm reading the code to understand what's going on under the hood, following the blog post https://pedroassumpcao.ghost.io/event-sourcing-and-cqrs-using-incident-part-1/

Here, depending on the aggregate state, an error may be returned, or an event (created from the aggregate logic, e.g. Bank.BankAccount.execute) may be emitted to the event handler (by the command handler). I was curious about how the current BankAccountState is determined: is it loaded from a projection in the projection store, is the state materialized in the event store, or is the state recalculated on every command execution? TL;DR: I believe it's recalculated, and would like your confirmation.

  def execute(%OpenAccount{aggregate_id: aggregate_id}) do
    case BankAccountState.get(aggregate_id) do
      %{account_number: nil} = state ->
        new_event = %AccountOpened{
          aggregate_id: aggregate_id,
          account_number: aggregate_id,
          version: 1
        }

        {:ok, new_event, state}

      _state ->
        {:error, :account_already_opened}
    end
  end

Based on my reading of the code, I believe Incident.AggregateState uses the Postgres adapter to load every event, in order, for the given aggregate, then reduces/applies all of those events to arrive at the current state; this state is what the aggregate's execute function (e.g. Bank.BankAccount) uses to decide between success and error. Is this what's going on, or am I misreading? Thanks!

In the Postgres event store adapter:

  def get(aggregate_id) do
    # fetch every event for this aggregate, oldest first
    query =
      from(
        e in Event,
        where: e.aggregate_id == ^aggregate_id,
        order_by: [asc: e.id]
      )

    repo().all(query)
  end
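
Conceptually, I picture the state reconstruction as a fold over those ordered events, something like the sketch below (initial_state/1 and apply_event/2 are names I made up to illustrate, not necessarily Incident's internals):

  # Hypothetical sketch of how I imagine the state is rebuilt on every
  # command execution; initial_state/1 and apply_event/2 are made-up names.
  def rebuild_state(aggregate_id) do
    aggregate_id
    |> EventStore.get()
    |> Enum.reduce(initial_state(aggregate_id), fn event, state ->
      apply_event(event, state)
    end)
  end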

If that's the case, is there a plan to address performance for aggregates that would have a long event history?

pedroassumpcao commented

Hi @byu, thanks for the question; you got it all correct.

During command execution, all events for that aggregate are loaded and reduced. Since the events are the source of truth, data consistency is the most important part. While this may look like a performance concern, it only becomes a problem once an aggregate has accumulated enough events to compromise command execution time.

To address that, there is the concept of a Snapshot, which is explained at https://cqrs.nu/Faq/event-sourcing:

An optimization where a snapshot of the aggregate's state is also saved (conceptually) in the event queue every so often, so that event application can start from the snapshot instead of from scratch. This can speed things up. Snapshots can always be discarded or re-created as needed, since they represent computed information from the event stream.

Typically, a background process, separate from the regular task of persisting events, takes care of creating snapshots.

Snapshotting has a number of drawbacks related to re-introducing current state in the database. Rather than assume you will need it, start without snapshotting, and add it only after profiling shows you that it will help.

So, for now Incident does not have Snapshots, as I consider them an enhancement and I want to focus on other major things first. But it is definitely in the plan, along with benchmarking the current capacity to better understand when a bottleneck is reached.
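
To give a rough idea, a snapshot-aware state load could look something like the sketch below; Snapshot, get_after_version/2, initial_state/1, and apply_event/2 are illustrative names only, not part of Incident's API today:

  # Hypothetical sketch only, not Incident code.
  def get(aggregate_id) do
    case Snapshot.latest(aggregate_id) do
      nil ->
        # no snapshot yet: fold over the full event history
        events = EventStore.get(aggregate_id)
        Enum.reduce(events, initial_state(aggregate_id), &apply_event/2)

      snapshot ->
        # start from the snapshot state and replay only the newer events
        events = EventStore.get_after_version(aggregate_id, snapshot.version)
        Enum.reduce(events, snapshot.state, &apply_event/2)
    end
  end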

byu commented

Thanks for the quick reply! All this looks great so far. I've never had to use Event Sourcing, but I'm going to try this out in a toy project!