Symptom
- A failover occurred and the SAP application server was rebooted.
- Error messages observed in the OS log (/var/log/messages):
Node 01
Dec 31 09:13:58 abc01 corosync[1261]: [KNET ] link: host: 2 link: 0 is down
Dec 31 09:13:58 abc01 corosync[1261]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Dec 31 09:13:58 abc01 corosync[1261]: [KNET ] host: host: 2 has no active links
Dec 31 09:14:01 abc01 corosync[1261]: [KNET ] rx: host: 2 link: 0 is up
Dec 31 09:14:01 abc01 corosync[1261]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Node 02
Dec 31 09:15:56 abc02 corosync[1269]: [TOTEM ] Token has not been received in 3750 ms
Dec 31 09:15:58 abc02 corosync[1269]: [TOTEM ] A processor failed, forming new configuration.
Dec 31 09:16:04 abc02 corosync[1269]: [TOTEM ] A new membership (2.409) was formed. Members left: 1
Dec 31 09:16:04 abc02 corosync[1269]: [TOTEM ] Failed to receive the leave message. failed: 1
Dec 31 09:16:04 abc02 corosync[1269]: [CPG ] downlist left_list: 1 received
Dec 31 09:16:04 abc02 pacemaker-attrd[2171]: notice: Lost attribute writer abc01
Dec 31 09:16:04 abc02 sbd[1352]: cluster: warning: set_servant_health: Connected to corosync but requires both nodes present
Dec 31 09:16:04 abc02 pacemaker-controld[2174]: notice: Our peer on the DC (abc01) is dead
Dec 31 09:16:04 abc02 sbd[1318]: warning: inquisitor_child: cluster health check: UNHEALTHY
Dec 31 09:16:04 abc02 sbd[1318]: warning: inquisitor_child: Servant cluster is outdated (age: 891792)
Dec 31 09:16:04 abc02 corosync[1269]: [QUORUM] Members[1]: 2
Dec 31 09:16:04 abc02 pacemaker-attrd[2171]: notice: Node abc01 state is now lost
Dec 31 09:16:04 abc02 pacemaker-attrd[2171]: notice: Removing all abc01 attributes for peer loss
Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: notice: Watchdog will be used via SBD if fencing is required and stonith-watchdog-timeout is nonzero
Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: warning: Cluster node abc01 will be fenced: peer is no longer part of the cluster
Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: warning: Node abc01 is unclean
Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: warning: Action sap_sbd_stop_0 on abc01 is unrunnable (offline)
Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: warning: Scheduling Node abc01 for STONITH
Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: notice: * Fence (reboot) abc01 'peer is no longer part of the cluster'
Dec 31 09:16:05 abc02 pacemaker-schedulerd[2173]: notice: * Move sap_sbd ( abc01 -> abc02 )
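To read the two excerpts as a single timeline, it can help to pull out only the key events (link down, link up, token timeout, fence decision) and sort them by time. The following is a minimal Python sketch, not part of the original article. The function name `classify`, the event labels, and the `year` parameter are assumptions made for illustration; syslog timestamps in this format carry no year, so one has to be supplied.

```python
import re
from datetime import datetime

# Classic syslog layout: "Mon DD HH:MM:SS host daemon[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<daemon>[\w-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)

# Hypothetical event labels mapped to message fragments seen in the excerpts.
PATTERNS = {
    "link_down": re.compile(r"link: \d+ is down"),
    "link_up": re.compile(r"link: \d+ is up"),
    "token_timeout": re.compile(r"Token has not been received in (\d+) ms"),
    "fence_scheduled": re.compile(r"will be fenced"),
}


def classify(lines, year=2000):
    """Return (timestamp, host, event) tuples sorted by time.

    `year` is an assumption: the log lines themselves do not contain one.
    Lines that do not match the syslog layout or any pattern are skipped.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        for name, pattern in PATTERNS.items():
            if pattern.search(m["msg"]):
                events.append((ts, m["host"], name))
                break
    return sorted(events)
```

Passing the lines from both nodes, for example read from copies of each node's /var/log/messages, yields one ordered list that makes the gap between the link flap on Node 01 and the fence decision on Node 02 easy to measure.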
Environment
- SAP S/4HANA or SAP NetWeaver system
- Linux server with a Pacemaker cluster configured
Product
Keywords
failover, link down, token, fenced, reboot, KBA, BC-OP-LNX, Linux, Problem
About this page
This is a preview of a SAP Knowledge Base Article. The full version is available on SAP for Me (login required).