codership/mysql-wsrep

Non deterministic DDL leads to data inconsistency

Opened this issue · 0 comments

Hello Codership Team,

While doing an ALTER, I remembered that some DDL commands are non deterministic, and can lead to data difference between the nodes of the same cluster.

Scenario:

  • Assuming there is a table with existing data
  • When adding a column with a non-deterministic default value (ex: @@hostname or UUID())
  • Then the rows will be populated on each Galera node with different (local) data
  • When the new column is configured as part of the primary key
  • Then running an update on any row of the table will lead to only one node remaining in the cluster

To reproduce:
1) Install and Configure Galera with at least 2 nodes (minimal settings).
2) Create a table and insert a row:

MariaDB [oceandba]> CREATE TABLE nd_ddl (id serial, hits tinyint);
Query OK, 0 rows affected (0.041 sec)
MariaDB [oceandba]> INSERT INTO nd_ddl (hits) values (0);
Query OK, 1 row affected (0.008 sec)

3) Perform non-deterministic DDL:

MariaDB [oceandba]> ALTER TABLE nd_ddl ADD COLUMN hostname varchar(30) DEFAULT @@hostname, ADD PRIMARY KEY (id,hostname);
Query OK, 0 rows affected (0.065 sec)
Records: 0  Duplicates: 0  Warnings: 0

4) Verify data difference between nodes:

MariaDB [oceandba]> select * FROM nd_ddl;
+----+------+--------------+
| id | hits | hostname     |
+----+------+--------------+
|  1 |    0 | eaddd670b38b |
+----+------+--------------+
1 row in set (0.000 sec)
MariaDB [oceandba]> select * FROM nd_ddl;
+----+------+--------------+
| id | hits | hostname     |
+----+------+--------------+
|  1 |    0 | e199f61216be |
+----+------+--------------+
1 row in set (0.000 sec)

5) Update existing row and observe behavior:

MariaDB [oceandba]> UPDATE nd_ddl set hits = hits + 1 where id = 1;
Query OK, 1 row affected (0.006 sec)
Rows matched: 1  Changed: 1  Warnings: 0
MariaDB [oceandba]> select * FROM nd_ddl;
+----+------+--------------+
| id | hits | hostname     |
+----+------+--------------+
|  1 |    1 | eaddd670b38a |
+----+------+--------------+
1 row in set (0.000 sec)
MariaDB [oceandba]> select * FROM nd_ddl;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use
MariaDB [(none)]> show global status like 'wsrep%state_%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | 00000000-0000-0000-0000-000000000000 |
| wsrep_local_state_comment | Inconsistent                         |
| wsrep_cluster_state_uuid  | 87fce3ba-5c9b-11ea-bc33-9b152b8a843a |
+---------------------------+--------------------------------------+
3 rows in set (0.003 sec)

Notes:

  • This was reproduced on MariaDB 10.3 (galera-3) and MariaDB 10.5 (galera-4).
  • This doesn't happen when the "hostname" column is not part of PK
  • I remember Galera has a hook when a new AUTO_INCREMENT column is added (to bypass auto_increment parameters)
  • This should be added to limitations in documentation
  • Possibly a warning could be issued if such DDL is detected ?

Thanks for looking into it
Regards,
Joffrey