mariadb-operator/mariadb-operator

[Question] Database divergence after multiple isolated failures.


I'm not sure where to ask this. Since I'm using the MariaDB operator + MaxScale, I decided to ask here. Please let me know if this question belongs somewhere else.

I am using the MariaDB operator + MaxScale to set up a database cluster with 2 instances. I've set it up and am currently testing it. My testing steps are as follows:

  1. Bring the primary database instance down. MaxScale automatically detects this and promotes the other instance to master.
  2. Write some data to this new master.
  3. Bring this new master down; now the cluster has no active instances.
  4. Bring the original master up. MaxScale detects this and promotes it back to master. However, since the other instance is still down, this node is not aware of the data written in step 2.
  5. Bring the other instance up. MaxScale detects this and adds it to the cluster as a slave.

Now there is data on the slave that is not present on the master. I think this outcome makes sense, but I would like to avoid it at all costs.

For example, at step 5, the primary instance should detect that the slave has data the primary does not. It should then either copy this data over, or MaxScale should instead promote the slave to be the new master.
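To make "detect" concrete, this is roughly the kind of check I have in mind: compare `gtid_current_pos` on both instances before one is (re)joined as a slave. This is just a sketch on my side, using PyMySQL; host names, port and credentials are placeholders for my two instances, and the comparison is deliberately simplified.

```python
"""Sketch: refuse a rejoin if the joining node is ahead of the master."""
import pymysql


def gtid_positions(host: str) -> dict[int, tuple[int, int]]:
    """Return {domain: (server_id, seq)} parsed from @@gtid_current_pos."""
    conn = pymysql.connect(host=host, port=3306, user="root", password="secret")
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT @@gtid_current_pos")
            (pos,) = cur.fetchone()
    finally:
        conn.close()
    positions = {}
    for gtid in filter(None, (pos or "").split(",")):
        domain, server_id, seq = (int(x) for x in gtid.strip().split("-"))
        positions[domain] = (server_id, seq)
    return positions


def would_lose_data(master_host: str, joining_host: str) -> bool:
    """True if the (re)joining node is ahead of the master in any GTID domain,
    i.e. it holds transactions the master has never seen (simplified check)."""
    master = gtid_positions(master_host)
    joining = gtid_positions(joining_host)
    return any(seq > master.get(domain, (0, 0))[1]
               for domain, (_server_id, seq) in joining.items())


if __name__ == "__main__":
    # Placeholder host names for my two instances.
    if would_lose_data("mariadb-0.example", "mariadb-1.example"):
        print("slave is ahead of the master: refuse the rejoin or switch over instead")
```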

How can I avoid this database divergence?

Would it make sense to add logic to properly select a new master after a sequence where all nodes except the master go down, and then that last master also goes down? For example, in my scenario, call the instances database A (master) and database B (slave); I sketch the rule I have in mind right after the list.

  1. A goes down -> B becomes master (only one instance left, and it is the master).
  2. Applications connect to B and write some new data.
  3. B goes down -> no active instances.
  4. A comes up -> add logic here: since the last master was B and it was the only instance left in the cluster, there is a high chance B holds data that only B knows about, so A should not be promoted to master.
  5. B comes up -> B becomes master and A replicates the data from B.
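Expressed as code, this is roughly the rule I'm suggesting. It is pure pseudocode on my side (not anything from the operator or MaxScale) and reuses the `gtid_positions`/`would_lose_data` helpers from the sketch above:

```python
def pick_new_master(nodes_up: list[str], all_nodes: list[str]) -> str | None:
    """Only promote a node when no other node in the cluster can be ahead of it.
    Reuses gtid_positions()/would_lose_data() from the earlier sketch."""
    if set(nodes_up) != set(all_nodes):
        # Step 4 of my scenario: A is up but B (the last master, and the last
        # node standing) is still down, so B may hold data only it knows about
        # -> don't promote A yet; wait for B or require a manual override.
        return None
    for host in nodes_up:
        if not any(would_lose_data(host, other) for other in nodes_up if other != host):
            return host  # no other node is ahead of this one in any GTID domain
    return None  # conflicting histories -> needs manual intervention
```

In my scenario this would mean A simply stays unpromoted until B is back, and B then wins because it is ahead.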

Also, for what it's worth, I've tried setting auto_rejoin = false, expecting that failed nodes would not rejoin the cluster automatically. However, they still join the cluster and my problem persists.
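In case the setting simply isn't reaching MaxScale, this is how I've been checking which parameters the monitor actually ended up with, via the MaxScale REST API (host, port and credentials below are placeholders for my deployment; `maxctrl show monitor <name>` inside the MaxScale pod shows the same information):

```python
"""List the effective monitor parameters to verify whether auto_rejoin=false
was actually applied. Host and credentials are placeholders."""
import requests

MAXSCALE_API = "http://maxscale.default.svc:8989"  # MaxScale REST API
AUTH = ("admin", "mariadb")                        # default admin credentials

resp = requests.get(f"{MAXSCALE_API}/v1/monitors", auth=AUTH, timeout=5)
resp.raise_for_status()

for monitor in resp.json()["data"]:
    params = monitor["attributes"]["parameters"]
    print(monitor["id"],
          "auto_failover =", params.get("auto_failover"),
          "auto_rejoin =", params.get("auto_rejoin"))
```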

This issue is stale because it has been open 30 days with no activity.