SAP Knowledge Base Article - Preview

3701673 - LNX: Failover due to link down

Symptom

  • A failover occurred and the SAP application server was rebooted.

  • Error messages observed in the OS log (/var/log/messages):

    Node 01

    Dec 31 09:13:58 abc01 corosync[1261]:  [KNET  ] link: host: 2 link: 0 is down
    Dec 31 09:13:58 abc01 corosync[1261]:  [KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)
    Dec 31 09:13:58 abc01 corosync[1261]:  [KNET  ] host: host: 2 has no active links
    Dec 31 09:14:01 abc01 corosync[1261]:  [KNET  ] rx: host: 2 link: 0 is up
    Dec 31 09:14:01 abc01 corosync[1261]:  [KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)

    Node 02

    Dec 31 09:15:56 abc02 corosync[1269]:  [TOTEM ] Token has not been received in 3750 ms
    Dec 31 09:15:58 abc02 corosync[1269]:  [TOTEM ] A processor failed, forming new configuration.


    Dec 31 09:16:04 abc02 corosync[1269]:  [TOTEM ] A new membership (2.409) was formed. Members left: 1
    Dec 31 09:16:04 abc02 corosync[1269]:  [TOTEM ] Failed to receive the leave message. failed: 1
    Dec 31 09:16:04 abc02 corosync[1269]:  [CPG   ] downlist left_list: 1 received
    Dec 31 09:16:04 abc02 pacemaker-attrd[2171]: notice: Lost attribute writer abc01
    Dec 31 09:16:04 abc02 sbd[1352]:   cluster:  warning: set_servant_health: Connected to corosync but requires both nodes present
    Dec 31 09:16:04 abc02 pacemaker-controld[2174]: notice: Our peer on the DC (abc01) is dead
    Dec 31 09:16:04 abc02 sbd[1318]: warning: inquisitor_child: cluster health check: UNHEALTHY
    Dec 31 09:16:04 abc02 sbd[1318]: warning: inquisitor_child: Servant cluster is outdated (age: 891792)
    Dec 31 09:16:04 abc02 corosync[1269]:  [QUORUM] Members[1]: 2
    Dec 31 09:16:04 abc02 pacemaker-attrd[2171]: notice: Node abc01 state is now lost
    Dec 31 09:16:04 abc02 pacemaker-attrd[2171]: notice: Removing all abc01 attributes for peer loss

    Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: notice: Watchdog will be used via SBD if fencing is required and stonith-watchdog-timeout is nonzero
    Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: warning: Cluster node abc01 will be fenced: peer is no longer part of the cluster
    Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: warning: Node abc01 is unclean
    Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: warning: Action sap_sbd_stop_0 on abc01 is unrunnable (offline)
    Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: warning: Scheduling Node abc01 for STONITH
    Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: notice:  * Fence (reboot) abc01 'peer is no longer part of the cluster'
    Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: notice:  * Move       sap_sbd     ( abc01 -> abc02 )
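
As a rough aid for finding the same pattern in a longer /var/log/messages file, the following Python sketch pulls the corosync link and membership events out of syslog-style lines. It is only an illustration built from the message texts shown above. The names `LINE_RE`, `PATTERNS`, and `scan` are hypothetical and are not part of corosync, Pacemaker, or any SAP tool, and the regular expressions assume the exact line layout of these excerpts.

```python
import re

# Syslog-style corosync line: timestamp, host, "[SUBSYS ]" tag, message text.
# Assumption: the layout matches the excerpts in this article.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s+(?P<host>\S+)\s+corosync\[\d+\]:\s+"
    r"\[(?P<subsys>\w+)\s*\]\s+(?P<msg>.*)$"
)

# Message fragments copied from the log excerpts above.
PATTERNS = {
    "link_down": re.compile(r"link: host: (\d+) link: (\d+) is down"),
    "link_up": re.compile(r"rx: host: (\d+) link: (\d+) is up"),
    "token_timeout": re.compile(r"Token has not been received in (\d+) ms"),
    "members_left": re.compile(r"Members left: ([\d ]+)"),
}


def scan(lines):
    """Return (timestamp, host, event_name, captured_values) for each match."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a corosync line in the expected layout
        for name, pattern in PATTERNS.items():
            hit = pattern.search(m["msg"])
            if hit:
                events.append((m["ts"], m["host"], name, hit.groups()))
    return events
```

A possible use is `scan(open("/var/log/messages"))`, then reading the returned tuples in order to see whether a "link down" event on one node precedes a token timeout and a membership change on the other.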



Environment

  • SAP S/4HANA or SAP NetWeaver system
  • Linux server with a Pacemaker cluster configured

Product

SAP NetWeaver 7.4 ; SAP NetWeaver 7.5 ; SAP S/4HANA 2023 ; SAP S/4HANA 2025

Keywords

failover, link down, token, fenced, reboot, KBA, BC-OP-LNX, Linux, Problem

About this page

This is a preview of an SAP Knowledge Base Article. The full version is available on SAP for Me (login required).
