mysql_stop does not really kill mysql on Debian

Question

Closed this issue 9 years ago · 17 comments

Hi,
I found a case where mysql_stop does not behave as it should.

The problem is in this part of the code:
pid=`cat ${OCF_RESKEY_pid}.starting 2> /dev/null`
/bin/kill $pid > /dev/null
rc=$?
if [ $rc != 0 ]; then
    ocf_log err "MySQL couldn't be stopped"
    return $OCF_ERR_GENERIC
fi

Debug:

'[' '!' -f /var/run/mysqld/mysqld.pid.starting ']'
++ cat /var/run/mysqld/mysqld.pid.starting
pid=32752
/bin/kill 32752
rc=0
'[' 0 '!=' 0 ']'
'[' 0 -eq 1 ']'
shutdown_timeout=15
'[' -n 900000 ']'
shutdown_timeout=895
count=0
'[' 0 -lt 895 ']'
kill -s 0 32752
rc=0
'[' 0 -ne 0 ']'
++ expr 0 + 1
count=1
sleep 1
ocf_log debug 'MySQL still hasn'\''t stopped yet. Waiting...'
'[' 2 -lt 2 ']'

It does kill the /usr/bin/mysqld_safe process, but its child process /usr/sbin/mysqld stays alive until the timeout expires, and only then is mysqld killed with -KILL. That makes failover unworkable.

OS - Debian 7
MySQL - Percona 5.6.22
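
The stop sequence visible in the trace above (send TERM, poll with kill -s 0 once per second, fall back to KILL at the timeout) can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: the function name stop_with_timeout and its return codes are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: ask a process to stop, wait up to a timeout, then
# escalate. Variables are global because POSIX sh has no "local".
# Returns 0 if the process exited after TERM,
#         1 if it could not be signalled at all,
#         2 if it was still alive at the timeout and KILL was sent.
stop_with_timeout() {
    stop_pid=$1
    stop_timeout=$2

    # Polite request first; give up if the PID cannot be signalled.
    kill "$stop_pid" 2>/dev/null || return 1

    stop_count=0
    while [ "$stop_count" -lt "$stop_timeout" ]; do
        # Signal 0 only tests whether the PID still exists.
        kill -s 0 "$stop_pid" 2>/dev/null || return 0
        sleep 1
        stop_count=$((stop_count + 1))
    done

    # Still alive at the timeout: force it.
    kill -KILL "$stop_pid" 2>/dev/null
    return 2
}
```

The loop only ever watches the PID it was given. If that PID belongs to a wrapper rather than to mysqld itself, the result says nothing about whether mysqld has stopped.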

Answer 1 · 2015-03-05T12:30:55.000Z

I tried changing
pid=`cat ${OCF_RESKEY_pid}.starting 2> /dev/null`
/bin/kill $pid > /dev/null
rc=$?
to
pid=`cat ${OCF_RESKEY_pid} 2> /dev/null`
/bin/kill $pid > /dev/null
rc=$?

and it helped.
I have tested this only on Debian.

Answer 2 · 2015-03-05T14:06:28.000Z

mysqld_safe should not be used with Pacemaker, but I realize the defaults are not correct; I'll modify them. Do you see the same behavior if you call mysqld directly?

Answer 3 · 2015-03-05T14:15:34.000Z

Do you mean starting mysql through the init script?
In my.cnf we have:
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
malloc-lib = /usr/lib/x86_64-linux-gnu/libjemalloc.so.1

Answer 4 · 2015-03-05T14:16:29.000Z

root 8713 0.0 0.0 4180 736 ? S 12:45 0:00 /bin/sh /usr/bin/mysqld_safe --defaults-file=/etc/mysql/my.cnf --enforce_gtid_consistency=1 --gtid_mode=on --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --datadir=/mdata01/mysql/db --user=mysql --skip-slave-start --read-only
mysql 10050 0.3 64.0 4226856 2602048 ? Sl 12:45 0:15 /usr/sbin/mysqld --defaults-file=/etc/mysql/my.cnf --basedir=/usr --datadir=/mdata01/mysql/db --plugin-dir=/usr/lib/mysql/plugin --user=mysql --enforce-gtid-consistency=1 --gtid-mode=on --skip-slave-start --read-only --log-error=/var/log/mysql/mysql-error.log --open-files-limit=65535 --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306

Answer 5 · 2015-03-05T14:37:53.000Z

MySQL is supposed to be started by the agent, not by the init.d script. The init.d script should be disabled with PRM. The "binary" parameter of the agent should point to mysqld. See here for more details:

https://github.com/percona/percona-pacemaker-agents/blob/master/doc/PRM-setup-guide.rst#the-mysql-resource-primitive

Answer 6 · 2015-03-05T14:44:34.000Z

Yes, it was started via the agent and runs this way.

Answer 7 · 2015-03-05T15:00:19.000Z

Just to confirm: in your Pacemaker primitive you are using binary="/usr/sbin/mysqld", and mysql was stopped before you put the node online in Pacemaker? If you did that, there's no way mysqld_safe could have been running. The issue you have is that the agent is recording the pid of the mysqld_safe script instead of mysqld. Can you show your Pacemaker primitive for mysqld?
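
To illustrate the point about the recorded pid, here is a small self-contained sketch. It assumes nothing about the agent: wrapper.sh is a hypothetical stand-in for mysqld_safe and sleep stands in for mysqld. It shows that $! yields the wrapper's pid and that signalling it leaves the child running.

```shell
#!/bin/sh
# Hypothetical stand-in for mysqld_safe: a wrapper script that starts a
# long-running child (sleep stands in for mysqld) and waits for it.
workdir=$(mktemp -d)
cat > "$workdir/wrapper.sh" <<'EOF'
#!/bin/sh
sleep 30 >/dev/null 2>&1 &
echo $! > "$1"
wait
EOF

# Start the wrapper in the background; $! is the wrapper's PID, not the child's.
sh "$workdir/wrapper.sh" "$workdir/child.pid" >/dev/null 2>&1 &
wrapper_pid=$!

# Wait (up to 10 s) for the wrapper to record its child's PID.
tries=0
while [ ! -s "$workdir/child.pid" ] && [ "$tries" -lt 10 ]; do
    sleep 1
    tries=$((tries + 1))
done
child_pid=$(cat "$workdir/child.pid")

# Signal only the recorded wrapper PID, as a stop action would.
kill "$wrapper_pid"
wait "$wrapper_pid" 2>/dev/null || true

# The child was never signalled, so it is still running.
if kill -s 0 "$child_pid" 2>/dev/null; then
    child_survived=yes
else
    child_survived=no
fi
echo "wrapper=$wrapper_pid child=$child_pid child_survived=$child_survived"

# Clean up the orphaned child and the scratch directory.
kill "$child_pid" 2>/dev/null || true
rm -rf "$workdir"
```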

Answer 8 · 2015-03-05T15:16:12.000Z

Yes, now I see it is /usr/bin/mysqld_safe:
configure primitive p_db1_mysql ocf:percona:mysql \
    params config="/etc/mysql/my.cnf" pid="/var/run/mysqld/mysqld.pid" \
    socket="/var/run/mysqld/mysqld.sock" replication_user="someuser" \
    replication_passwd="somepassword" max_slave_lag="60" \
    evict_outdated_slaves="false" binary="/usr/bin/mysqld_safe" \
    test_user="clusteruser" test_passwd="password" \
    reader_attribute="db1_readable" \
    op start interval="0" timeout="900s" \
    op stop interval="0" timeout="900s" \
    op monitor interval="5s" role="Master" OCF_CHECK_LEVEL="1" \
    op monitor interval="2s" role="Slave" OCF_CHECK_LEVEL="1"

Can you name some reasons why it's better not to use mysqld_safe with Pacemaker?

Answer 9 · 2015-03-05T15:23:25.000Z

Pacemaker must know if mysqld crashes; a lot of the agent logic is built around that. With mysqld_safe, a crash of mysqld is masked, and mysqld_safe restarts mysqld (instead of Pacemaker doing so). You simply won't get the right behavior with mysqld_safe. You also noticed another issue... I'll add support for malloc-lib in the agent; that's straightforward.
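
The masking effect described above can be reproduced with a toy restart loop. This is only a sketch under stated assumptions: supervisor.sh is a hypothetical stand-in for mysqld_safe's restart behavior, sleep stands in for mysqld, and wait_for_new_pid is an illustrative helper.

```shell
#!/bin/sh
# Hypothetical stand-in for a restart loop: whenever the child exits
# (sleep stands in for mysqld), start a new one and record its PID.
workdir=$(mktemp -d)
cat > "$workdir/supervisor.sh" <<'EOF'
#!/bin/sh
while :; do
    sleep 30 >/dev/null 2>&1 &
    echo $! > "$1.tmp" && mv "$1.tmp" "$1"
    wait $!
done
EOF

# Print the PID recorded in file $1 once it differs from $2 (up to 10 s).
wait_for_new_pid() {
    tries=0
    while [ "$tries" -lt 10 ]; do
        current=$(cat "$1" 2>/dev/null)
        if [ -n "$current" ] && [ "$current" != "$2" ]; then
            echo "$current"
            return 0
        fi
        sleep 1
        tries=$((tries + 1))
    done
    return 1
}

sh "$workdir/supervisor.sh" "$workdir/child.pid" >/dev/null 2>&1 &
supervisor_pid=$!

first_pid=$(wait_for_new_pid "$workdir/child.pid" "")
# Simulate a crash of the supervised process.
kill -KILL "$first_pid"
second_pid=$(wait_for_new_pid "$workdir/child.pid" "$first_pid")

# The supervisor never exited, so a watcher of its PID alone saw nothing.
if kill -s 0 "$supervisor_pid" 2>/dev/null; then
    supervisor_alive=yes
else
    supervisor_alive=no
fi
echo "first=$first_pid second=$second_pid supervisor_alive=$supervisor_alive"

# Clean up: stop the supervisor first so it cannot restart the child again.
kill "$supervisor_pid" 2>/dev/null || true
wait "$supervisor_pid" 2>/dev/null || true
kill "$second_pid" 2>/dev/null || true
rm -rf "$workdir"
```

The child's pid changes after the simulated crash while the supervisor keeps running, so the restart happens without the cluster manager being involved.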

Answer 10 · 2015-03-06T13:32:53.000Z

OK, it works now without mysqld_safe.

When do you plan to add support for malloc-lib?

Answer 11 · 2015-03-06T14:02:43.000Z

I hope to have time today.

Answer 12 · 2015-03-09T21:48:22.000Z

I have the version with the parameter in the 1.0-beta branch but it is still failing a test. I'll continue debugging tomorrow.

Answer 13 · 2015-03-17T14:52:01.000Z

Any results?

Answer 14 · 2015-03-17T19:24:25.000Z

Hi,
I'll only resume work on it tomorrow.

Regards,

Yves

Answer 15 · 2015-03-26T14:26:06.000Z

Any news?
We are considering moving to 5.6, but without jemalloc that is not possible.

Answer 16 · 2015-03-27T15:32:31.000Z

Hi,
sorry for the delay, I had some perturbations in my personal life. For now, a quick fix would be to add LD_PRELOAD to the script directly, in the mysql_start_low function, like:

LD_PRELOAD=/usr/lib/libjemalloc.so ${OCF_RESKEY_binary} --defaults-file=$OCF_RESKEY_config \
--pid-file=$OCF_RESKEY_pid \
--socket=$OCF_RESKEY_socket \
--datadir=$OCF_RESKEY_datadir \
--user=$OCF_RESKEY_user $OCF_RESKEY_additional_parameters \
$mysql_extra_params >/dev/null 2>&1 &

Regards,

Yves
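
After applying an LD_PRELOAD change like the one above, one way to check that the library was actually picked up is to look at the process's memory map. The sketch below is Linux-only (it reads /proc/PID/maps), and the helper name has_mapped_lib is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (Linux only): succeed if the running process $1 has
# a mapping whose line in /proc/PID/maps contains the fixed string $2.
has_mapped_lib() {
    grep -qF -- "$2" "/proc/$1/maps" 2>/dev/null
}

# Example usage, assuming the pid file path used elsewhere in this thread:
#   has_mapped_lib "$(cat /var/run/mysqld/mysqld.pid)" jemalloc \
#       && echo "jemalloc is loaded"
```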

Answer 17 · 2015-04-01T14:16:59.000Z

Fixed in 1.0.0