ClusterLabs/pcs

[Question] Error: Agent `ocf:heartbeat:IPaddr2` is not installed or does not provide valid metadata

Closed this issue · 8 comments

9265zl commented

Hi All

I have set up a pcs cluster on CentOS Stream 9. I want to add a VIP to it, but I encountered an error.
pcs version: 0.11.4

[root@node1 pcs]# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=10.42.204.100 cidr_netmask=32 nic=ens3 op monitor interval=30s
Error: Agent 'ocf:heartbeat:IPaddr2' is not installed or does not provide valid metadata: crm_resource: Metadata query for ocf:heartbeat:IPaddr2 failed: No such device or address, Error performing operation: No such object, use --force to override
Error: Errors have occurred, therefore pcs is unable to continue

I couldn't find a solution online, so I tried different versions of pcs, but I still got the above error.

Here is the output of pcs status:

[root@node1 pcs]# pcs status
Cluster name: mycluster
Status of pacemakerd: 'Pacemaker is running' (last updated 2023-03-01 14:51:20 +08:00)
Cluster Summary:
  * Stack: corosync
  * Current DC: node1 (version 2.1.5-a3f44794f94) - partition with quorum
  * Last updated: Wed Mar  1 14:51:21 2023
  * Last change:  Wed Mar  1 11:37:30 2023 by root via cibadmin on node1
  * 1 node configured
  * 0 resource instances configured

Node List:
  * Online: [ node1 ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: inactive/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Could you please provide some ideas for solving this problem?

Best Regards,
zl.

Hi @9265zl,
Thanks for reaching out. Can you run your resource create command with --debug and post the output? That is: pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=10.42.204.100 cidr_netmask=32 nic=ens3 op monitor interval=30s --debug. Thanks.

9265zl commented

Thanks for your reply!

[root@node1 pcs]# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=10.42.204.100 cidr_netmask=32 nic=ens3 op monitor interval=30s --debug
Running: /usr/sbin/crm_resource --show-metadata ocf:heartbeat:IPaddr2
Environment:
  LC_ALL=C
  PATH=/usr/sbin:/bin:/usr/bin

Finished running: /usr/sbin/crm_resource --show-metadata ocf:heartbeat:IPaddr2
Return value: 105
--Debug Stdout Start--

--Debug Stdout End--
--Debug Stderr Start--
crm_resource: Metadata query for ocf:heartbeat:IPaddr2 failed: No such device or address
Error performing operation: No such object

--Debug Stderr End--

Error: Agent 'ocf:heartbeat:IPaddr2' is not installed or does not provide valid metadata: crm_resource: Metadata query for ocf:heartbeat:IPaddr2 failed: No such device or address, Error performing operation: No such object, use --force to override
Error: Errors have occurred, therefore pcs is unable to continue
[root@node1 pcs]#

Sorry to bother you again, but could you also take a look at a new issue?
When I run pcs status --debug, corosync always shows as inactive.

[root@node1 pcs]# pcs status --debug
Running: /usr/sbin/crm_mon --one-shot --inactive
Environment:
  LC_ALL=C

Finished running: /usr/sbin/crm_mon --one-shot --inactive
Return value: 0
--Debug Stdout Start--
Status of pacemakerd: 'Pacemaker is running' (last updated 2023-03-01 20:00:32 +08:00)
Cluster Summary:
  * Stack: corosync
  * Current DC: node1 (version 2.1.5-a3f44794f94) - partition with quorum
  * Last updated: Wed Mar  1 20:00:32 2023
  * Last change:  Wed Mar  1 16:11:21 2023 by root via cibadmin on node1
  * 1 node configured
  * 1 resource instance configured

Node List:
  * Online: [ node1 ]

Full List of Resources:
  * http        (systemd:httpd):         Started node1

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/sbin/cibadmin --local --query
Environment:
  LC_ALL=C

Finished running: /usr/sbin/cibadmin --local --query
Return value: 0
--Debug Stdout Start--
<cib crm_feature_set="3.16.1" validate-with="pacemaker-3.9" epoch="7" num_updates="4" admin_epoch="0" cib-last-written="Wed Mar  1 16:11:21 2023" update-origin="node1" update-client="cibadmin" update-user="root" have-quorum="1" dc-uuid="1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/>
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.5-a3f44794f94"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
        <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="mycluster"/>
        <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1" uname="node1"/>
    </nodes>
    <resources>
      <primitive id="http" class="systemd" type="httpd">
        <operations>
          <op name="monitor" interval="60" timeout="100" id="http-monitor-interval-60"/>
          <op name="start" interval="0s" timeout="100" id="http-start-interval-0s"/>
          <op name="stop" interval="0s" timeout="100" id="http-stop-interval-0s"/>
        </operations>
      </primitive>
    </resources>
    <constraints/>
  </configuration>
  <status>
    <node_state id="1" uname="node1" in_ccm="true" crmd="online" crm-debug-origin="do_update_resource" join="member" expected="member">
      <transient_attributes id="1">
        <instance_attributes id="status-1">
          <nvpair id="status-1-.feature-set" name="#feature-set" value="3.16.1"/>
        </instance_attributes>
      </transient_attributes>
      <lrm id="1">
        <lrm_resources>
          <lrm_resource id="http" type="httpd" class="systemd">
            <lrm_rsc_op id="http_last_0" operation_key="http_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.16.1" transition-key="1:6:7:8eafa9de-77e6-45ef-955b-54862b5a1b6d" transition-magic="0:0;1:6:7:8eafa9de-77e6-45ef-955b-54862b5a1b6d" exit-reason="" on_node="node1" call-id="5" rc-code="0" op-status="0" interval="0" last-rc-change="1677658281" exec-time="12" queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
            <lrm_rsc_op id="http_last_failure_0" operation_key="http_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.16.1" transition-key="1:6:7:8eafa9de-77e6-45ef-955b-54862b5a1b6d" transition-magic="0:0;1:6:7:8eafa9de-77e6-45ef-955b-54862b5a1b6d" exit-reason="" on_node="node1" call-id="5" rc-code="0" op-status="0" interval="0" last-rc-change="1677658281" exec-time="12" queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
            <lrm_rsc_op id="http_monitor_60000" operation_key="http_monitor_60000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.16.1" transition-key="3:7:0:8eafa9de-77e6-45ef-955b-54862b5a1b6d" transition-magic="0:0;3:7:0:8eafa9de-77e6-45ef-955b-54862b5a1b6d" exit-reason="" on_node="node1" call-id="6" rc-code="0" op-status="0" interval="60000" last-rc-change="1677658282" exec-time="10" queue-time="0" op-digest="2d296eeac3e5f7d1cfdb1557b8eb3457"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
    </node_state>
  </status>
</cib>

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-active sbd.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-active sbd.service
Return value: 3
--Debug Stdout Start--
inactive

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-enabled corosync.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-enabled corosync.service
Return value: 0
--Debug Stdout Start--
enabled

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-active corosync.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-active corosync.service
Return value: 3
--Debug Stdout Start--
inactive

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-enabled pacemaker.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-enabled pacemaker.service
Return value: 0
--Debug Stdout Start--
enabled

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-active pacemaker.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-active pacemaker.service
Return value: 0
--Debug Stdout Start--
active

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-enabled pacemaker_remote.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-enabled pacemaker_remote.service
Return value: 1
--Debug Stdout Start--
disabled

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-active pacemaker_remote.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-active pacemaker_remote.service
Return value: 3
--Debug Stdout Start--
inactive

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-enabled pcsd.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-enabled pcsd.service
Return value: 0
--Debug Stdout Start--
enabled

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-active pcsd.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-active pcsd.service
Return value: 0
--Debug Stdout Start--
active

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Running: /usr/bin/systemctl is-enabled sbd.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-enabled sbd.service
Return value: 1
--Debug Stdout Start--

--Debug Stdout End--
--Debug Stderr Start--
Failed to get unit file state for sbd.service: No such file or directory

--Debug Stderr End--

Running: /usr/bin/systemctl is-active sbd.service
Environment:
  LC_ALL=C

Finished running: /usr/bin/systemctl is-active sbd.service
Return value: 3
--Debug Stdout Start--
inactive

--Debug Stdout End--
--Debug Stderr Start--

--Debug Stderr End--

Cluster name: mycluster
Status of pacemakerd: 'Pacemaker is running' (last updated 2023-03-01 20:00:32 +08:00)
Cluster Summary:
  * Stack: corosync
  * Current DC: node1 (version 2.1.5-a3f44794f94) - partition with quorum
  * Last updated: Wed Mar  1 20:00:32 2023
  * Last change:  Wed Mar  1 16:11:21 2023 by root via cibadmin on node1
  * 1 node configured
  * 1 resource instance configured

Node List:
  * Online: [ node1 ]

Full List of Resources:
  * http        (systemd:httpd):         Started node1

Daemon Status:
  corosync: inactive/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@node1 pcs]#

Thank you again.

It looks like you don't have the IPaddr2 resource agent installed. Please verify that you have the resource-agents package installed and that the /usr/lib/ocf/resource.d/heartbeat/IPaddr2 file is present.
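For example, a quick check could look like this (a sketch only; it assumes a dnf/rpm-based system such as CentOS Stream 9 and a working pcs installation):

```shell
# Is the resource-agents package installed at all?
rpm -q resource-agents

# Does the agent script exist on disk?
ls -l /usr/lib/ocf/resource.d/heartbeat/IPaddr2

# Can pcs see the agent? (lists all available ocf:heartbeat agents)
pcs resource agents ocf:heartbeat
```

If the package is missing, installing resource-agents should create the agent file.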

Pacemaker cannot run without corosync. So if you see status from pacemaker, which you do, then corosync is running. Can you share the output of systemctl status corosync? How did you set up and start the cluster?

9265zl commented

Yes, the file /usr/lib/ocf/resource.d/heartbeat/IPaddr2 does not exist. How do I install IPaddr2?

9265zl commented

There is also this failure message: Failed to get unit file state for sbd.service: No such file or directory. How do I install sbd.service?

[root@node1 pacemaker]# systemctl status corosync
○ corosync.service - corosync
     Loaded: loaded (/etc/systemd/system/corosync.service; enabled; preset: disabled)
     Active: inactive (dead) since Thu 2023-03-02 16:55:48 CST; 4h 12min ago
   Duration: 58ms
    Process: 545538 ExecStart=/etc/init.d/corosync start (code=exited, status=0/SUCCESS)
   Main PID: 545538 (code=exited, status=0/SUCCESS)
      Tasks: 9 (limit: 48644)
     Memory: 140.4M
        CPU: 2min 24.091s
     CGroup: /system.slice/corosync.service
             └─545466 corosync

Mar 02 16:55:48 node1 systemd[1]: Started corosync.
Mar 02 16:55:48 node1 corosync[545544]:   [MAIN  ] Corosync Cluster Engine 3.1.7 starting up
Mar 02 16:55:48 node1 corosync[545544]:   [MAIN  ] Corosync built-in features: pie relro bindnow
Mar 02 16:55:48 node1 corosync[545545]:   [MAIN  ] Another Corosync instance is already running.
Mar 02 16:55:48 node1 corosync[545545]:   [MAIN  ] Corosync Cluster Engine exiting with status 18 at main.c:1590.
Mar 02 16:55:48 node1 corosync[545538]: Starting Corosync Cluster Engine (corosync): [  OK  ]
Mar 02 16:55:48 node1 systemd[1]: corosync.service: Deactivated successfully.
Mar 02 16:55:48 node1 systemd[1]: corosync.service: Unit process 545466 (corosync) remains running after unit stopped.

Your system seems to be a bit broken. According to systemctl status corosync, corosync has been started outside of systemd's control - see Another Corosync instance is already running. That is why pcs shows corosync as inactive. You are also missing the resource-agents package, even though it is a dependency of pacemaker and should have been installed automatically when you installed pcs and/or pacemaker. I suggest stopping your cluster and reinstalling the cluster packages to get the missing pieces installed.
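A recovery sequence along these lines might work. This is only a sketch: it assumes you are root on node1, the cluster can be taken down, and the usual CentOS Stream 9 package names; adjust to your environment before running.

```shell
# 1. Stop the cluster cleanly so pacemaker and its resources shut down.
pcs cluster stop --all

# 2. Make sure no stray corosync process started outside systemd is left behind.
pgrep -x corosync && pkill -x corosync

# 3. Reinstall the cluster packages and pull in the missing dependency.
dnf reinstall -y pcs pacemaker corosync
dnf install -y resource-agents

# 4. Start the cluster again under systemd/pcs control.
pcs cluster start --all
```

Also check where corosync.service points: your status output shows ExecStart=/etc/init.d/corosync, i.e. a hand-made unit in /etc/systemd/system wrapping an init script, rather than the unit shipped by the corosync package. Removing that override and running systemctl daemon-reload would let the packaged unit take effect.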

How to install IPaddr2? How to install sbd.service?

Install the resource-agents and sbd packages, respectively.
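On CentOS Stream 9 that would presumably be (a sketch; assumes the repository providing the HA packages is already enabled, since pcs and pacemaker are installed):

```shell
# Install both missing packages
dnf install -y resource-agents sbd

# Verify the agent file and the sbd unit now exist
ls /usr/lib/ocf/resource.d/heartbeat/IPaddr2
systemctl status sbd.service
```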

9265zl commented

Thank you very much for your help, I managed to get pcs and corosync set up properly.
love u~