prooph/pdo-event-store

Running projections causes constant full table scans due incorrect forced index

atymic opened this issue · 15 comments

We're developing a pretty new app using prooph, with only ~20k events in a single stream. We have ~30 projections, and when running these all at once the database gets completely destroyed.

I dived in with a debugger and figured out this is because we are doing 30x full table scans per 100ms, as every loop of each projection is doing a full table scan.

Here's an example query:

SELECT * FROM `_4228e4a00331b5d5e751db0481828e22a2c3c8ef` USE INDEX(ix_query_aggregate)
WHERE `no` >= 19067
ORDER BY `no` ASC
LIMIT 1000;

image

Note the USE INDEX(ix_query_aggregate). This forces mysql to not use the the primary index on no.

Removing the USE INDEX means that the index is used properly, and the query is orders of magnitude faster.

image

Apologies if this is a misconfiguration from our end, but I think that the ix_query_aggregate index should only be using when doing queries based on metadata, not when scanning the event stream.

Running:
prooph/pdo-event-store: v1.12.0
prooph/event-store: v7.5.6

Thanks in advanced!

Can you post your configuration please?

Config
return [
    /*
    |--------------------------------------------------------------------------
    | Plugins
    |
    | List of all global event store plugin service IDs
    |
    |--------------------------------------------------------------------------
    */
    'plugins' => [],

    /*
    |--------------------------------------------------------------------------
    | Metadata Enrichers
    |
    | List of all global event store metadata enrichers
    |
    |--------------------------------------------------------------------------
    */
    'metadata_enrichers' => [],

    /*
    |--------------------------------------------------------------------------
    | Event Stores
    |--------------------------------------------------------------------------
    |
    | Each event store will be configured here. Currently only the mysql,
    | maria_db and postgres stores are supported. Each store will be bound to
    | event_store.stores.<key> as well as to the EventStore FQCN. Available
    | settings are:
    |
    | - persistence_strategy: The class name or service ID of the persistence strategy
    | - load_batch_size: The number of events a query should return in a single batch. Default is 1000
    | - event_streams_table: The event stream table to use. Default is event_streams
    | - message_factory: The message factory to use. Default is FCQNMessageFactory
    | - disable_transaction_handling: Boolean to turn off transaction handling. Default is false
    | - action_event_emitter: Defaults to ProophEventActionEmitter
    | - wrap_action_event_emitter: Defaults to true
    | - metadata_enrichers: A list of metadata enrichers to add to the store.
    | - plugins: A list of plugins to add to the store.
    |
    */
    'stores' => [
        'default' => 'mysql',
        'mysql' => [
            'persistence_strategy' => \Prooph\EventStore\Pdo\PersistenceStrategy\MySqlSingleStreamStrategy::class,
        ],
    ],

    /*
    |--------------------------------------------------------------------------
    | Repositories
    |--------------------------------------------------------------------------
    |
    | Each aggregate repository is configured in this structure. Each key
    | represents a different aggregate repository. Each repository is configured
    | with:
    | - store: The key of the store to use. Valid values are any key in the `stores` array above.
    | - repository_interface: An optional interface to alias the repository with.
    | - repository_class: The FQCN or service ID for the repository class.
    | - aggregate_type: The FQCN for the aggregate this store maintains.
    | - aggregate_translator: The translator for the aggregate. Defaults to \Prooph\EventSourcing\EventStoreIntegration\AggregateTranslator.
    | - stream_name: The stream name.
    | - one_stream_per_aggregate: Set this to true for an aggregate stream strategy. Default is false
    |
    */
    'repositories' => [
        'user_collection' => [
            'store'                => 'default',
            'repository_interface' => UserCollection::class,
            'repository_class'     => EventStoreUserCollection::class,
            'aggregate_type'       => User::class,
        ],
        // other repos
     ],

    /*
    |--------------------------------------------------------------------------
    | Projections
    |--------------------------------------------------------------------------
    |
    | The necessary definitions for creating projections
    | - store: The name of the store. One of mysql, maria_db or postgres
    | - event_streams_table: Defaults to event_streams
    | - projections_table: Defaults to projections
    | - projections
    |   - connection: The name of the connection to use. Defaults to the same connection as the store.
    |   - read_model: The FQCN of the projection read model.
    |   - projection: The FQCN of the projection.
    |
    */
    'projection_managers' => [
        'todo_projection_manager' => [
            'store' => 'default',
            'projections' => [
                'user_projection' => [
                    'read_model' => UserReadModel::class,
                    'projection' => UserProjection::class,
                ],
                // other projections
            ]
        ]
    ]
];

I think this can be solved by using a different projection strategy when running the projections. I am currently not using a v7 instance in production, so I can't verify easily right now.

ping @codeliner can you provide some info? I think you are running v7 in production right now.

yes, but we're using postgres. I remember that someone mentioned this or a similar issue when using MySql. Maybe it was @fritz-gerneth ? Unfortunately, I cannot find the discussion in an older issue, maybe it was in the chat :(

Anyway, you could implement your own stream strategy to change the index the way you need it: https://github.com/prooph/pdo-event-store/blob/master/src/PersistenceStrategy/MySqlSingleStreamStrategy.php#L55

and change your configuration so that your custom strategy is used instead of the one provided by prooph.

Prolic provided a bit of background in #191 .

Never ran into this issue myself due to a custom strategy (for larger IDs).

Ah thanks @fritz-gerneth that's the issue I was looking for. @atymic does it help you?

Anyway, you could implement your own stream strategy to change the index the way you need it: /src/PersistenceStrategy/MySqlSingleStreamStrategy.php@master#L55

and change your configuration so that your custom strategy is used instead of the one provided by prooph.

The issue is that this index is useful when looking up aggregates in the application - it's just not when running projections. Ideally, since it just requires removing that specification of the index it would be able to be disabled just from projections

Prolic provided a bit of background in #191 .

Yep, that's the exact issue. Will try to figure out how to use a different strategy just for projections. We're running laravel so hopefully we can sub it out in the container.

Ok, subbing out the strategy for projections only worked fine. I just added another event store to the config, and set the projections to use that instead.

Thanks for all the help everyone. Could it be worth documenting the performance issues for projections in the docs, since it seems like others have come accross this as well?

@atymic yes a pull request for documentation is always welcome! I'm closing the issue now.

@ahmed-alaa I think this is important for your set up as well ☝️

The following index hint solved this problem for us, in https://github.com/prooph/pdo-event-store/blob/master/src/PersistenceStrategy/MySqlSingleStreamStrategy.php#L95

public function indexName(): string
{
    return 'ix_query_aggregate, PRIMARY';
}

It works for both, running projections and loading aggregates. Might be worth trying before adding an additional event store configuration.

Ahh, interesting. @codeliner is this a reasonable solution to accept into the library?

No worries, i'll test shortly :)