prooph/pdo-event-store

MariaDB does not use INDEX

kochen opened this issue · 12 comments

When querying for aggregates this query is being used:

SELECT * FROM `{MY_STREAM}` USE INDEX(ix_query_aggregate)
WHERE json_value(metadata, '$._aggregate_type') = '{MY_TYPE}'  AND json_value(metadata, '$._aggregate_id') = '{MY_AGGR}'  AND `no` >= 1
ORDER BY `no` ASC
LIMIT 10000

Unlike MySQL, MariaDB doesn't use the index for these json_value() expressions, therefore the actual columns (which already exist) need to be used.
This causes the query to run in ~1s, and running it multiple times for various checks has a huge performance impact.
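
For context, here is a minimal sketch of the kind of schema involved (column and index names are taken from the queries in this issue; the real table is created by the MariaDB persistence strategy and may differ): aggregate_type and aggregate_id are PERSISTENT generated columns over the metadata JSON, and ix_query_aggregate indexes them.

-- Minimal sketch, assuming generated columns backing ix_query_aggregate;
-- the actual schema created by the persistence strategy may differ.
CREATE TABLE `{MY_STREAM}` (
    `no` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `payload` LONGTEXT NOT NULL,
    `metadata` LONGTEXT NOT NULL,
    `aggregate_type` VARCHAR(150)
        GENERATED ALWAYS AS (json_value(`metadata`, '$._aggregate_type')) PERSISTENT,
    `aggregate_id` VARCHAR(150)
        GENERATED ALWAYS AS (json_value(`metadata`, '$._aggregate_id')) PERSISTENT,
    PRIMARY KEY (`no`),
    KEY `ix_query_aggregate` (`aggregate_type`, `aggregate_id`, `no`)
) ENGINE = InnoDB;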

Manually changing the query to:

SELECT * FROM `{MY_STREAM}` USE INDEX(ix_query_aggregate)
WHERE `aggregate_type` = '{MY_TYPE}'  AND `aggregate_id` = '{MY_AGGR}'  AND `no` >= 1
ORDER BY `no` ASC
LIMIT 10000

works as expected and runs in ~0.01s!
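
A quick way to see the difference (assuming a schema like the sketch above) is to compare the EXPLAIN output of the two forms:

-- json_value() in the WHERE clause: MariaDB does not map the expression to the
-- generated column, so the index is ignored and the table is scanned.
EXPLAIN SELECT * FROM `{MY_STREAM}` USE INDEX(ix_query_aggregate)
WHERE json_value(metadata, '$._aggregate_type') = '{MY_TYPE}' AND json_value(metadata, '$._aggregate_id') = '{MY_AGGR}' AND `no` >= 1;

-- Generated columns in the WHERE clause: ix_query_aggregate is used.
EXPLAIN SELECT * FROM `{MY_STREAM}` USE INDEX(ix_query_aggregate)
WHERE `aggregate_type` = '{MY_TYPE}' AND `aggregate_id` = '{MY_AGGR}' AND `no` >= 1;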

For reference:
https://benjaminlistwon.com/blog/working-with-json-data-in-mysql-part-1-of-3/ (INDEXING AND EFFICIENCY)

This is very much the same issue (just different strategy?): #142

@codeliner or @basz do you have time for that?

basz commented

nope, sorry

@kochen Can you try to provide a PR?
Fortunately, we decided to copy and paste some code for the pdo event store variations (yes, code duplication is sometimes the better approach). You should be able to fix the problem without breaking the MySQL and Postgres implementations. We can assist if you encounter problems, but I don't have much time to work on the issue immediately.

@codeliner I could try, but the problems I see are these:

  1. As you mentioned in #142 (comment), the call to the createWhereClause method is made by the AggregateRepository, which is infrastructure agnostic and uses _aggregate_id, _aggregate_type & _aggregate_version (the metadata keys) instead of aggregate_id, aggregate_type & aggregate_version (the actual columns).
  2. Therefore, since the MySQLEventStore code is identical to the MariaDBEventStore, I am not sure how it would generate a different form of the same query - @prolic maybe you could dig into that on the MySQL side (I don't have it set up locally)?

Any input is more than welcome.

Here is a quick and dirty but simpler solution that works!
It reduced our API call times from >2s to <50ms!

oqq commented

To prevent this type of issue in future major releases, I would suggest always creating queries in "Strategies".

This way we could react to platform issues without breaking other implementations. It would also give prooph component users a simpler way to add micro-optimizations to queries.

oqq commented

I have investigated this issue for 20 minutes and I see no way to fix it without a BC break. Sorry :/

@oqq for the moment it'll be patched for MariaDB only, and for the next major release we might "have to" break BC - even though we could introduce an AbstractPersistencyStrategy and solve it there.

@prolic @codeliner according to this: http://rpbouman.blogspot.nl/2015/11/mysql-few-observations-on-json-type.html we should expect the same issue on MySQL as well.
Could any of you confirm this?

Strike that:
[screenshot]

MySQL is actually smart enough to use indexed generated columns on the json document.

MariaDB is not ;)
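
For anyone following along, this is the MySQL behaviour referred to above (a generic sketch with illustrative names, not the exact prooph schema): when an expression in the WHERE clause is identical to the definition of an indexed generated column, the MySQL optimizer substitutes the column and can use its index; MariaDB does not perform this substitution.

-- Generic MySQL example (illustrative names, not the prooph schema)
CREATE TABLE events (
    no BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    metadata JSON NOT NULL,
    aggregate_id VARCHAR(150)
        GENERATED ALWAYS AS (json_unquote(json_extract(metadata, '$._aggregate_id'))) STORED,
    KEY ix_aggregate_id (aggregate_id)
);

-- MySQL matches this expression against the generated column definition and
-- uses ix_aggregate_id; MariaDB keeps the expression as-is and scans the table.
EXPLAIN SELECT * FROM events
WHERE json_unquote(json_extract(metadata, '$._aggregate_id')) = 'some-id';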