prooph/pdo-event-store

Select on event stream not using index

elpirata15 opened this issue · 16 comments

PHP version

  • 7.2.2

Prooph packages version

  •     "prooph/event-sourcing": "^5.0",
    
  •     "prooph/event-store-symfony-bundle": "dev-master",
    
  •     "prooph/event-store": "^7.0",
    
  •     "prooph/pdo-event-store": "^1.0",
    
  •     "prooph/service-bus-symfony-bundle": "^0.6.0",
    

Table information

Table: _4228e4a00331b5d5e751db0481828e22a2c3c8e

Create Table: CREATE TABLE `_4228e4a00331b5d5e751db0481828e22a2c3c8ef` (
  `no` bigint(20) NOT NULL AUTO_INCREMENT,
  `event_id` char(36) COLLATE utf8_bin NOT NULL,
  `event_name` varchar(100) COLLATE utf8_bin NOT NULL,
  `payload` longtext COLLATE utf8_bin NOT NULL,
  `metadata` longtext COLLATE utf8_bin NOT NULL,
  `created_at` datetime(6) NOT NULL,
  `aggregate_version` int(11) unsigned GENERATED ALWAYS AS (json_extract(`metadata`,'$._aggregate_version')) STORED,
  `aggregate_id` char(36) COLLATE utf8_bin GENERATED ALWAYS AS (json_unquote(json_extract(`metadata`,'$._aggregate_id'))) STORED,
  `aggregate_type` varchar(150) COLLATE utf8_bin GENERATED ALWAYS AS (json_unquote(json_extract(`metadata`,'$._aggregate_type'))) STORED,
  PRIMARY KEY (`no`),
  UNIQUE KEY `ix_event_id` (`event_id`),
  UNIQUE KEY `ix_unique_event` (`aggregate_type`,`aggregate_id`,`aggregate_version`),
  KEY `ix_query_aggregate` (`aggregate_type`,`aggregate_id`,`no`),
  KEY `idx_no` (`no`)
) ENGINE=InnoDB AUTO_INCREMENT=10566693 DEFAULT CHARSET=utf8 COLLATE=utf8_bin

show index from _4228e4a00331b5d5e751db0481828e22a2c3c8ef;

+-------------------------------------------+------------+--------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                                     | Non_unique | Key_name           | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------------------------------+------------+--------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| _4228e4a00331b5d5e751db0481828e22a2c3c8ef |          0 | PRIMARY            |            1 | no                | A         |     9611425 |     NULL | NULL   |      | BTREE      |         |               |
| _4228e4a00331b5d5e751db0481828e22a2c3c8ef |          0 | ix_event_id        |            1 | event_id          | A         |     9450250 |     NULL | NULL   |      | BTREE      |         |               |
| _4228e4a00331b5d5e751db0481828e22a2c3c8ef |          0 | ix_unique_event    |            1 | aggregate_type    | A         |          88 |     NULL | NULL   | YES  | BTREE      |         |               |
| _4228e4a00331b5d5e751db0481828e22a2c3c8ef |          0 | ix_unique_event    |            2 | aggregate_id      | A         |     1448005 |     NULL | NULL   | YES  | BTREE      |         |               |
| _4228e4a00331b5d5e751db0481828e22a2c3c8ef |          0 | ix_unique_event    |            3 | aggregate_version | A         |     9611425 |     NULL | NULL   | YES  | BTREE      |         |               |
| _4228e4a00331b5d5e751db0481828e22a2c3c8ef |          1 | ix_query_aggregate |            1 | aggregate_type    | A         |          93 |     NULL | NULL   | YES  | BTREE      |         |               |
| _4228e4a00331b5d5e751db0481828e22a2c3c8ef |          1 | ix_query_aggregate |            2 | aggregate_id      | A         |     1114339 |     NULL | NULL   | YES  | BTREE      |         |               |
| _4228e4a00331b5d5e751db0481828e22a2c3c8ef |          1 | ix_query_aggregate |            3 | no                | A         |     9611425 |     NULL | NULL   |      | BTREE      |         |               |
+-------------------------------------------+------------+--------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Databases with the occurrence:

  • 10.3.3-MariaDB-10.3.3+maria~jessie-log (dockerized) and 10.2.1 on RDS event stream table with 300k rows.

  • Aurora MySQL compatible 5.7.12 with more than 10mln rows

    start_time: 2018-04-09 17:54:13.266981
     user_host: root[root] @  []
    query_time: 00:18:52.398348
     lock_time: 00:08:10.118294
     rows_sent: 10000
 rows_examined: 10572800
            db: dbName
last_insert_id: 0
     insert_id: 0
     server_id: 2020220249
      sql_text: SELECT * FROM `_4228e4a00331b5d5e751db0481828e22a2c3c8ef` USE INDEX(ix_query_aggregate)
WHERE `no` >= 661983
ORDER BY `no` ASC
LIMIT 10000
     thread_id: 45

MySQL []> EXPLAIN SELECT * FROM _4228e4a00331b5d5e751db0481828e22a2c3c8ef USE INDEX(ix_query_aggregate) WHERE no >= 661983 ORDER BY no ASC LIMIT 10000;

+----+-------------+-------------------------------------------+------------+------+---------------+------+---------+------+---------+----------+-----------------------------+
| id | select_type | table                                     | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra                       |
+----+-------------+-------------------------------------------+------------+------+---------------+------+---------+------+---------+----------+-----------------------------+
|  1 | SIMPLE      | _4228e4a00331b5d5e751db0481828e22a2c3c8ef | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9611425 |    33.33 | Using where; Using filesort |
+----+-------------+-------------------------------------------+------------+------+---------------+------+---------+------+---------+----------+-----------------------------+

@codeliner @basz any ideas what's going on here?

@elpirata15 Where does that query come from? I don't think it's fired from event-sourcing repo.

The index is ix_query_aggregate (aggregate_type,aggregate_id,no) - which means you need to search for all 3 fields in that exact order, otherwise the index is not used.

basz commented

Might be you simply need to PK as index as ix_query_aggregate consists of 3 columns you are not querying... Unless you select all three. But really I'm index noob...

I guess you're using SingleStreamStrategy, did you set oneStreamPerAggregate option on the repository to false?

oqq commented

The index ix_query_aggregate is not applicable for that query, since you are only query the no field here. MySQL cannot use that index in this case.

But since the no field is already an "index", you can simply remove the USE INDEX instruction and MySQL would use it automatically.

This query is not being triggered directly or modified by us, this is coming from somewhere event sourcing prooph libraries.

Temporary I sorted this out adding count($where) > 1 to the line 293 for MariaDbEventStore or Myslq, supposing that when $where is more than 1, the SQL triggered is column no + others, when $where is 1 or less, it's filtering by only no or no WHERE clause so will avoid using any index than the default (primary key by no if this is included in the query).

        if ($this->persistenceStrategy instanceof HasQueryHint && count($where) > 1) {
            $indexName = $this->persistenceStrategy->indexName();
            $queryHint = "USE INDEX($indexName)";
        } else {
            $queryHint = '';
        }

@prolic Yes, I'm using SingleStreamStrategy, could you bring more details about this, and why this can be a solution for this?

@elpirata15 did you set $oneStreamPerAggregate to true on the repository?

@elpirata15 I am asking because I need to know how you came to this query.

@prolic no, this is not set

@elpirata15
This is the place in code where the query is built. In your case the attribute $metadataMatcher should NOT be null and [$where, $values] = $this->createWhereClause($metadataMatcher); should return two where conditions with aggregate_type and aggregate_id.

Can you check that please with a debugger or var_dump?

EventStore::load() is called by the aggregate repository which is part of the prooph/event-sourcing package. Here is the place in code.

Please also use a debugger or var_dump to double check that the repository switches into the else case, meaning $this->oneStreamPerAggregate should be false.

and can you post your symfony yaml conifguration for event store and the aggregate repository?

Closing now. If you still encounter problems reopen again.

@elpirata15 make sure to update to v1.8.2, that addresses the issue directly.

Having the same issue here with SingleStreamStrategy. When the projector is loading events, it uses EventStore::load() without a MetadataMatcher, but still the generated query contains the USE INDEX hint, which prevents MySQL from using the PK index.

Here's the projector query:

SELECT * FROM `event_stream` USE INDEX(ix_query_aggregate)
WHERE `no` >= 503299
ORDER BY `no`;

It normally takes 0.19646025 to run, but without the USE INDEX it takes 0.00045325, which is a huge difference :)

Working on a PR to skip the USE INDEX hint when the MetadataMatcher is null, hope it works.

Actually, looking at the implementation, I'm not sure if such a PR would make sense. So I fixed it by replacing the PersistenceStrategy of my projectors with a custom one that does not implement HasQueryHint.

@prolic if you have any better suggestion to fix that, just ping me and I'll send a PR.