Azure_ARM fence agent - pcmk_delay_max and priority-fencing-delay
db6thomas opened this issue · 4 comments
To avoid a fence race, the parameters named in the title can be used. References can be found here:
https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/high-availability-guide-rhel-pacemaker
https://www.suse.com/support/kb/doc/?id=000019110
https://access.redhat.com/solutions/5110521
The options do not work with the Azure_ARM fence agent, versions 4.7.1 and 4.9.1.
Both are ignored and a fence race occurs.
Is this on purpose, or just not implemented yet?
These parameters depend on which version of Pacemaker you're running.
Can you post the output of rpm -qa | grep pacemaker?
Hello,
azr-sd01:~ # rpm -qa |grep pacemaker
libpacemaker3-2.0.4+20200616.2deceaa3a-1.2.db2pcmk.x86_64
pacemaker-remote-2.0.4+20200616.2deceaa3a-1.2.db2pcmk.x86_64
pacemaker-cli-2.0.4+20200616.2deceaa3a-1.2.db2pcmk.x86_64
libpacemaker-devel-2.0.4+20200616.2deceaa3a-1.2.db2pcmk.x86_64
pacemaker-cts-2.0.4+20200616.2deceaa3a-1.2.db2pcmk.noarch
pacemaker-2.0.4+20200616.2deceaa3a-1.2.db2pcmk.x86_64
This is the Pacemaker that comes integrated with Db2, which is why you see db2pcmk in the package names. No changes were made to Pacemaker or Corosync besides the new packaging.
We tested the same Pacemaker version on AWS; there the parameters work and the fence race can be avoided.
pcmk_delay_max, pcmk_delay_base and priority-fencing-delay are applied by Pacemaker itself (fenced) before it executes the action on the fence agent, while other delay parameters are passed through to the fence agent.
The reason why fence_azure_arm is behaving differently might be due to code in fence_aws to avoid race conditions:
https://github.com/ClusterLabs/fence-agents/pull/323/files
Maybe you should use pcmk_delay_base instead? It is used as the base of the base + random delay.
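To illustrate why a per-target delay breaks a fence race, here is a minimal Python sketch. It is a simplified model, not Pacemaker's code: the function names fencing_delay and first_fenced are hypothetical, and it assumes the delay is a static base plus a random part, with the sum kept at or below the maximum.

```python
import random


def fencing_delay(base=0.0, max_delay=0.0, rng=random):
    """Hypothetical model of a fencing delay in seconds.

    Assumption: the delay is the static base plus a random part,
    and the total never exceeds max_delay when max_delay > base.
    """
    if max_delay <= base:
        # No room for a random part: only the static base applies.
        return base
    return base + rng.uniform(0.0, max_delay - base)


def first_fenced(delays_by_target):
    """Return the target whose fencing action runs first.

    delays_by_target maps a node name to the delay applied before
    that node is fenced. Returns None when the two shortest delays
    are equal, i.e. the race is undecided and both nodes may be fenced.
    """
    ordered = sorted(delays_by_target.items(), key=lambda item: item[1])
    if len(ordered) > 1 and ordered[0][1] == ordered[1][1]:
        return None
    return ordered[0][0]


if __name__ == "__main__":
    # Without any delay both fencing actions fire at once: a fence race.
    print(first_fenced({"node1": 0.0, "node2": 0.0}))
    # A delay on fencing node1 lets node1 win; node2 is fenced first.
    print(first_fenced({"node1": fencing_delay(base=5, max_delay=15),
                        "node2": 0.0}))
```

In this model the node whose fencing is delayed longer survives, which is the effect the parameters are meant to have when Pacemaker honours them.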
If you have further issues you can try mailing: http://oss.clusterlabs.org/mailman/listinfo/users where users/devs of all the ClusterLabs projects can answer.