SAP Knowledge Base Article - Preview

3453846 - Pacemaker cluster fails over due to file-system problem with host server

Symptom

A HANA system with system replication, managed by Pacemaker, failed over to a replication site.

The following files, collected from the primary site host, contain trace entries similar to those shown below:

/var/log/messages

Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HSU_00)[3427302]: WARNING: HANA_CALL timed out after 10 seconds running command 'HDB version'
Feb 24 19:02:59 <hostname> systemd[1]: systemd-journald.service: Killing process 10274 (systemd-journal) with signal SIGABRT.
Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HSU_00)[3427302]: INFO: RA ==== begin action monitor_clone (0.154.0) ====
...
Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HLU_10)[3428430]: WARNING: HANA_CALL timed out after 10 seconds running command 'HDB version'
...
Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HSU_00)[3427302]: WARNING: HANA_CALL timed out after 60 seconds running command 'hdbnsutil -sr_state'
Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HSU_00)[3427302]: ERROR: ACT: check_for_primary:  we didn't expect node_status to be: <>
Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HSU_00)[3427302]: ERROR: ACT: check_for_primary:  we didn't expect node_status to be: DUMP <00000000  0a                                                |.|#01200000001>
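When a log is long, it can help to pull out only the HANA_CALL timeout warnings. The following Python sketch does that for lines shaped like the excerpt above. The function name `find_hana_call_timeouts` and the regular expression are illustrative assumptions derived from the quoted lines only; the message wording may differ between resource agent versions.

```python
import re

# Pattern modeled on the WARNING lines quoted in the excerpt above.
# Assumption: other resource agent versions may word the message differently.
TIMEOUT_RE = re.compile(
    r"SAPHana\((?P<resource>[^)]+)\)\[(?P<pid>\d+)\]: WARNING: HANA_CALL timed out "
    r"after (?P<seconds>\d+) seconds running command '(?P<command>[^']+)'"
)


def find_hana_call_timeouts(lines):
    """Return (resource, seconds, command) for every HANA_CALL timeout line."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append(
                (match.group("resource"), int(match.group("seconds")), match.group("command"))
            )
    return hits


# Lines copied from the excerpt above.
sample = [
    "Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HSU_00)[3427302]: WARNING: HANA_CALL timed out after 10 seconds running command 'HDB version'",
    "Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HSU_00)[3427302]: INFO: RA ==== begin action monitor_clone (0.154.0) ====",
    "Feb 24 19:02:59 <hostname> SAPHana(SAPHana_HSU_00)[3427302]: WARNING: HANA_CALL timed out after 60 seconds running command 'hdbnsutil -sr_state'",
]

for resource, seconds, command in find_hana_call_timeouts(sample):
    print(f"{resource}: '{command}' timed out after {seconds} s")
```

To scan a real file, pass an open file object instead of `sample`, since iterating over it yields one line at a time.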

For the following trace line:

WARNING: HANA_CALL timed out after 60 seconds running command 'hdbnsutil -sr_state'

the nameserver trace for port 00000 is checked to determine whether the hdbnsutil call reached the database. No trace line like the following has been recorded:

[20554] [...] i Basis CommandUtil.cpp(xxxxx) : command: hdbnsutil -sr_state
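The check described above can be sketched in Python as a scan of the nameserver trace for the `command: hdbnsutil -sr_state` entry. The function name `sr_state_calls` and the pattern are assumptions based only on the trace line quoted above. The pattern deliberately excludes longer options such as `-sr_stateHostMapping`.

```python
import re

# Matches the "command: hdbnsutil -sr_state" form quoted above.
# The negative lookahead rejects longer options such as -sr_stateHostMapping.
SR_STATE_RE = re.compile(
    r"CommandUtil\.cpp\([^)]*\)\s*:\s*command: hdbnsutil -sr_state(?![A-Za-z])"
)


def sr_state_calls(trace_lines):
    """Return the trace lines that record an 'hdbnsutil -sr_state' call."""
    return [line.rstrip("\n") for line in trace_lines if SR_STATE_RE.search(line)]


# The placeholder line quoted above, used here only to exercise the pattern.
expected_form = "[20554] [...] i Basis CommandUtil.cpp(xxxxx) : command: hdbnsutil -sr_state"

if sr_state_calls([expected_form]):
    print("hdbnsutil -sr_state reached the nameserver")
else:
    print("no hdbnsutil -sr_state entry found")
```

An empty result for the time window of the timeout is consistent with the observation that the call was never recorded in the nameserver trace.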

nameserver_f12p7saphanau1.00000.101.trc

[3425732]{-1}[-1/-1] 2024-02-24 19:00:10.634986 i Basis            CommandUtil.cpp(00098) : called by user 'hsuadm' with UID: 1003 (parent process command line '/bin/sh /usr/sap/HSU/HDB00/HDBSettings.sh hdbnsutil -sr_stateHostMapping --sapcontrol=1 ' with PID: 3425709) (parent process executable /usr/bin/bash')
[3425732]{-1}[-1/-1] 2024-02-24 19:00:10.686944 i Basis            TopologyUtil.cpp(01220) : done.
[3429176]{-1}[-1/-1] 2024-02-24 19:02:59.915732 i Basis            TraceStream.cpp(00752) : ==== Starting hdbnsutil, version 2.00.059.06.1666783180 (fa/hana2sp05), build linuxx86_64 9ea86f9c88dc5d61d2980bd0360693bd9740c8ad 2022-10-26 13:30:36 ld4554 gcc (SAP release 20200227, based on SUSE gcc9-9.2.1+r275327-1.3.7) 9.2.1 20190903 [gcc-9-branch revision 275330]
[3429176]{-1}[-1/-1] 2024-02-24 19:02:59.915773 i Memory           mmEnv.cpp(00189) : Per core memory pools
[3429176]{-1}[-1/-1] 2024-02-24 19:02:59.915775 i Memory           mmEnv.cpp(00189) : Per core memory pools
[3429176]{-1}[-1/-1] 2024-02-24 19:02:59.915776 i Memory           mmEnv.cpp(00189) : Per core memory pools
[3429176]{-1}[-1/-1] 2024-02-24 19:02:59.915776 i Memory           mmEnv.cpp(00189) : Per core memory pools



Environment

SAP HANA, platform edition 2.0

Product

SAP HANA, platform edition 2.0

Keywords

Replication, HANA_CALL, time out, secondary, high availability, KBA, HAN-DB-HA, SAP HANA High Availability (System Replication, DR, etc.), BC-OP-LNX-SUSE, SUSE Linux, BC-OP-LNX-RH, Red Hat Linux, Problem
