redsnapper8t8/mysql-master-ha

Master Failover completed successfully. Error Establishing Database Connection.

What steps will reproduce the problem?
1. Stop the mysql service on the master.

What is the expected output? What do you see instead?
I should see a working website running off the slave. Instead I see an
"Error Establishing Database Connection" message.

What version of the product are you using? On what operating system?
MHA 0.54 on CentOS 7

Please provide any additional information below.

I have set up replication and the MHA configuration based on this tutorial:
http://www.arborisoft.com/how-to-configure-mysql-masterslave-replication-with-mha-automatic-failover/

When I stop the mysql service on the master, failover starts.
I have posted the relevant part of app1.log below, along with the output of
'show slave status\G'.

I have a feeling that the VIP is not being moved to the slave when the master 
dies.

Just to confirm: I shouldn't have to create the VIP on the slave, only on the
master, correct?

Here is what I have from the logs that makes me curious:

The log says eth1:1 is an unknown interface. This is the alias I created to
hold the VIP on the master. I also edited the master_ip_failover script and
changed eth0 to eth1 (my interface).

IN SCRIPT TEST====sudo /sbin/ifconfig eth1:1 down==sudo /sbin/ifconfig eth1:1 10.33.11.145===

Enabling the VIP - 10.33.11.145 on the new master - 10.33.11.142 
SIOCSIFADDR: No such device
eth1:1: unknown interface: No such device
Tue Jun  2 10:00:51 2015 - [info]  OK.
Tue Jun  2 10:00:51 2015 - [info] ** Finished master recovery successfully.
Tue Jun  2 10:00:51 2015 - [info] * Phase 3: Master Recovery Phase completed.
Tue Jun  2 10:00:51 2015 - [info] 
Tue Jun  2 10:00:51 2015 - [info] * Phase 4: Slaves Recovery Phase..
Tue Jun  2 10:00:51 2015 - [info] 
Tue Jun  2 10:00:51 2015 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Tue Jun  2 10:00:51 2015 - [info] 
Tue Jun  2 10:00:51 2015 - [info] Generating relay diff files from the latest slave succeeded.
Tue Jun  2 10:00:51 2015 - [info] 
Tue Jun  2 10:00:51 2015 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Tue Jun  2 10:00:51 2015 - [info] 
Tue Jun  2 10:00:51 2015 - [info] All new slave servers recovered successfully.
Tue Jun  2 10:00:51 2015 - [info] 
Tue Jun  2 10:00:51 2015 - [info] * Phase 5: New master cleanup phase..
Tue Jun  2 10:00:51 2015 - [info] 
Tue Jun  2 10:00:51 2015 - [info] Resetting slave info on the new master..
Tue Jun  2 10:00:51 2015 - [debug]  Clearing slave info..
Tue Jun  2 10:00:51 2015 - [debug]  Stopping slave IO/SQL thread on 10.33.11.142(10.33.11.142:1443)..
Tue Jun  2 10:00:51 2015 - [debug]   done.
Tue Jun  2 10:00:52 2015 - [debug]  SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
Tue Jun  2 10:00:52 2015 - [info]  10.33.11.142: Resetting slave info succeeded.
Tue Jun  2 10:00:52 2015 - [info] Master failover to 10.33.11.142(10.33.11.142:1443) completed successfully.
Tue Jun  2 10:00:52 2015 - [debug]  Disconnected from 10.33.11.142(10.33.11.142:1443)
Tue Jun  2 10:00:52 2015 - [info] 

----- Failover Report -----

app1: MySQL Master failover 10.33.11.141(10.33.11.141:1443) to 10.33.11.142(10.33.11.142:1443) succeeded

Master 10.33.11.141(10.33.11.141:1443) is down!

Check MHA Manager logs at MHAMonitor-01:/var/log/masterha/app1/app1.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 10.33.11.141(10.33.11.141:1443)
The latest slave 10.33.11.142(10.33.11.142:1443) has all relay logs for recovery.
Selected 10.33.11.142(10.33.11.142:1443) as a new master.
10.33.11.142(10.33.11.142:1443): OK: Applying all logs succeeded.
10.33.11.142(10.33.11.142:1443): OK: Activated master IP address.
Generating relay diff files from the latest slave succeeded.
10.33.11.142(10.33.11.142:1443): Resetting slave info succeeded.
Master failover to 10.33.11.142(10.33.11.142:1443) completed successfully.
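
The report above says "Activated master IP address" even though the ifconfig
call printed "No such device", so the failure of the VIP command apparently
never reached MHA. Below is a minimal shell sketch of a wrapper that hands the
command's exit status back to the caller instead of swallowing it. The name
run_vip_cmd is hypothetical, and how master_ip_failover should then report the
status to MHA is not covered here.

```shell
#!/bin/sh
# Hypothetical helper (not part of MHA): run a VIP command and
# propagate its exit status so the caller can react to a failure.
run_vip_cmd() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        # Report the failing command on stderr for the log.
        echo "VIP command failed (exit $status): $*" >&2
    fi
    return "$status"
}

# Example use (needs root and an existing eth1, so it is left commented out):
# run_vip_cmd /sbin/ifconfig eth1:1 10.33.11.145 || exit 1
```

With a check like this, a failed ifconfig would surface as a non-zero exit
rather than being followed by an "OK." line.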




Here is 'Show Slave Status\G' after stopping mysql on master:
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 10.33.11.141
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: 
          Read_Master_Log_Pos: 4
               Relay_Log_File: mysql-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: 
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB: blah, blah, blah, blah
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 125
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
1 row in set (0.00 sec)

Shouldn't Master_Host be updated to '10.33.11.142'?

Any help is always much appreciated!!!

Original issue reported on code.google.com by Airinag...@gmail.com on 2 Jun 2015 at 5:11

I have changed the command in the master_ip_failover script from
'sudo /sbin/ifconfig eth1:$key $VIP' to 'ifup eth1:1'. I have also manually
created the interface 'eth1:1' and keep it turned off until the
master_ip_failover script runs and turns it on.

No matter how I alter the script I always get:

Enabling the VIP - 10.33.11.145 on the new master - 10.33.11.142 
SIOCSIFADDR: No such device
eth1:1: unknown interface: No such device
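
One possible cause of "No such device" for an alias like eth1:1 is that the
parent device (eth1) does not exist under that name on the host where the
command runs. The thread does not confirm that this was the problem here. The
sketch below is a check that could be run on the new master, assuming Linux,
where network devices are listed under /sys/class/net. The function names
parent_iface and iface_exists are hypothetical.

```shell
#!/bin/sh
# Strip an alias suffix such as ":1" to get the parent device name.
parent_iface() {
    printf '%s\n' "${1%%:*}"
}

# Return 0 if the parent device of the given (alias) name exists on this host.
# Assumption: Linux, where network devices appear under /sys/class/net.
iface_exists() {
    [ -e "/sys/class/net/$(parent_iface "$1")" ]
}

# eth1:1 is the alias name used in this issue.
if iface_exists "eth1:1"; then
    echo "parent device for eth1:1 is present"
else
    echo "parent device for eth1:1 is missing on this host"
fi
```

If the parent device is missing, the interface name in the failover script
would need to match whatever the new master actually calls its interface.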

Original comment by Airinag...@gmail.com on 3 Jun 2015 at 5:19

Disregard this issue. I have found the error in the failover script.
Everything works as planned.

Original comment by Airinag...@gmail.com on 8 Jun 2015 at 6:33