Symptom
- Indexserver crashes with signal 7 (SIGBUS), siginfo code 4 (MCEERR_AR):
[CRASH_SHORTINFO] Exception short info: (2024-04-22 01:15:27 871 Local)
SIGNAL 7 (SIGBUS) caught, thread: 44636415192[thr=144261]: JobWrk6463019, addr: 0x000076add6544000, range: 4096, time: 2024-04-22 01:15:27 871 Local
[OK]
--
[CRASH_EXTINFO] Extended exception info: (2024-04-22 01:15:27 873 Local)
----> Dump of siginfo contents <----
signal: 7(SIGBUS)
code: 4(MCEERR_AR: accessing physical memory marked as broken by the hardware/OS)
addr: 0x000076add6544000
range: 4096
----> Dump of system info <----
SID: SID
instance: 02
OS: Linux
node: Hostname
OS release: 5.3.18-150200.24.129-default
OS version: #1 SMP Tue Sep 6 13:05:38 UTC 2022 (c4cee83)
OS machid: x86_64
Max core: 0 KB
----> Dump of current processor info <----
Current NUMA node index: 0
Current NUMA node id: 0
Current logical Processor index: 2
Current physical Processor index: 2
[OK]
--
[CRASH_CONTEXT] Context info: (2024-04-22 01:15:27 873 Local)
----> Crashing context information <----
ContextStack at (0x00007fb5609a91a0)
stack: 7fb5607ad000-7fb5609a9fff, size 2084864
guard: 7fb56078d000-7fb5607acfff, size 131072
alt : 7fb5609aa000-7fb5609c9fff, size 131072
ctx addr: 0x00007c60b5314000, ctx link: 0x00007c60b5314000, ctx owner: 0x00007c60b5314000
ctx name: JobWrk6463019, ctx type: thread, ctx id: 44636415192
job creation: 0: 0x00007fbfa5f07a31 in TRexAPI::CsTablePartMerge::mergeDeltaIndex2(TrexBase::IndexName const&, TrexBase::IndexName const&, TRexConfig::IndexHandle&, TRexCommonObjects::TRexApiError&, TrexStore::UdivMgrHandle<TrexStore::UdivList> const&, int&, unsigned long&, TRexUtils::BitVector const&, ltt::vector<int> const&, TRexUtils::BitVector&, TRexUtils::BitVector&, TRexUtils::BitVector&, long&, unsigned int, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >*, TRexAPI::TableMergeProgress&, unsigned long&, ltt::smart_ptr<UnifiedTable::MergeState, ltt::integral_constant<bool, false>, ltt::integral_constant<bool, false> >&, TRexAPI::AnnounceFinishHistoryAccessGuard&)+0x4b60 at ims_search_api/DeltaMerge/CsTableMerge.cpp:4322 (libhdbcsapi.so)
job execution: 0: 0x00007fbfa5f07a9f in TRexAPI::CsTablePartMerge::mergeDeltaIndex2(TrexBase::IndexName const&, TrexBase::IndexName const&, TRexConfig::IndexHandle&, TRexCommonObjects::TRexApiError&, TrexStore::UdivMgrHandle<TrexStore::UdivList> const&, int&, unsigned long&, TRexUtils::BitVector const&, ltt::vector<int> const&, TRexUtils::BitVector&, TRexUtils::BitVector&, TRexUtils::BitVector&, long&, unsigned int, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >*, TRexAPI::TableMergeProgress&, unsigned long&, ltt::smart_ptr<UnifiedTable::MergeState, ltt::integral_constant<bool, false>, ltt::integral_constant<bool, false> >&, TRexAPI::AnnounceFinishHistoryAccessGuard&)+0x4bcb at ims_search_api/DeltaMerge/CsTableMerge.cpp:4325 (libhdbcsapi.so)
ctx command text:
ctx update transactionID: 9563233304
ctx transactionID : 1738
ctx connectionID : -1
ctx logical connectionID: 0
ctx statementID : 0
ctx statement hash :
ctx statementExecutionID: 844434401550471
ctx sqlusername :
ctx appusername :
[OK]
--
[CRASH_STACK] Stacktrace of crash: (2024-04-22 01:15:27 913 Local)
----> Symbolic stack backtrace <----
0: memcpy_impl + 0x79b
SFrame: IP: 0x00005606bb6a1d7f (0x00005606bb6a15e4+0x79b) FP: 0x00007fb5609a6010 SP: 0x00007fb5609a6000 RP: 0x00007fbf95ec2489
Params: 0x7ca8a12930f0, 0x76add6544610, 0x18, 0xffffffffffffff60, 0x5606bb6a1d7a, 0x7ca8a1292f0b
Regs: rax=0xe, rdx=0x18, rcx=0xffffffffffffff60, rbx=0x3, rsi=0x76add6544610, rdi=0x7ca8a12930f0, rbp=0x7fb5609a6000, r8=0x5606bb6a1d7a, r9=0x7ca8a1292f0b, r10=0x1d6888f6, r11=0x5606bb6a15e0, r12=0x7fb5609a6370, r13=0x1fd, r14=0x3, r15=0x7b9d93094200
Module: /hana/shared/H3P/exe/linuxx86_64/HDB_2.00.065.00.1665753120_6c34d45b0567c95dfd8fa5f0310fa7b91be152f1/hdbindexserver
NOTE: Missing frame information, following frames may be invalid (fallback unwinder)
-----------------------------------------
....
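The crash dump excerpt above can be cross-checked with a little arithmetic. The following Python sketch is illustrative only: it decodes the siginfo code and tests whether the memcpy_impl arguments in frame 0 fall inside the page reported by the SIGBUS. The helper names are hypothetical. The si_code values are those defined for SIGBUS in the Linux siginfo header. The sketch assumes that memcpy_impl takes (destination, source, length) in rdi, rsi and rdx, as the standard memcpy does under the x86-64 System V ABI.

```python
# SIGBUS si_code values as defined by Linux (asm-generic/siginfo.h).
BUS_SI_CODES = {
    1: "BUS_ADRALN",     # invalid address alignment
    2: "BUS_ADRERR",     # non-existent physical address
    3: "BUS_OBJERR",     # object-specific hardware error
    4: "BUS_MCEERR_AR",  # hardware memory error consumed: action required
    5: "BUS_MCEERR_AO",  # hardware memory error detected: action optional
}

PAGE_SIZE = 4096  # "range: 4096" from the siginfo dump


def in_faulting_range(addr: int, base: int, size: int = PAGE_SIZE) -> bool:
    """Return True if addr lies inside [base, base + size). Hypothetical helper."""
    return base <= addr < base + size


sigbus_addr = 0x000076ADD6544000  # "addr" from [CRASH_EXTINFO]
memcpy_dst = 0x7CA8A12930F0       # rdi in frame 0 (assumed destination)
memcpy_src = 0x76ADD6544610       # rsi in frame 0 (assumed source)

print(BUS_SI_CODES[4])                             # BUS_MCEERR_AR
print(in_faulting_range(memcpy_src, sigbus_addr))  # True
print(in_faulting_range(memcpy_dst, sigbus_addr))  # False
print(hex(memcpy_src - sigbus_addr))               # 0x610
```

Under these assumptions the source buffer of the copy, not the destination, lies in the page that the kernel flagged, which is consistent with a read from memory marked as broken.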
- indexserver trace
[44530]{-1}[-1/-1] 2024-04-22 01:14:46.483656 i Savepoint SavepointImpl.cpp(03108) : Savepoint current savepoint version: 581091, restart redo log position: 0x2c168b1a849, next savepoint version: 581092, last snapshot SP version: 581088
[144261]{-1}[1738/9563233304] 2024-04-22 01:15:27.560505 e Basis FaultProtectionImpl.cpp(01615) : SIGNAL 7 (SIGBUS) caught, thread: 44636415192[thr=144261]: JobWrk6463019, addr: 0x000076add6544000, range: 4096, time: 2024-04-22 01:15:27 560 Local
Instance SID/02, OS Linux Hostname 5.3.18-150200.24.129-default #1 SMP Tue Sep 6 13:05:38 UTC 2022 (c4cee83) x86_64
...
[144261]{-1}[1738/9563233304] 2024-04-22 01:15:27.560663 i Basis Helper.cpp(00101) : Using 'x64_64 ABI unwind' for stack tracing
NOTE: full crash dump will be written to /usr/sap/SID/HDB02/Hostname/trace/DB_XXX/indexserver_Hostname.30240.crashdump.20240422-011527.0017834.trc
Call stack of crashing context:
0: 0x00005606bb6a1d7f in memcpy_impl+0x79b (hdbindexserver)
1: 0x00007fbf95ec2489 in AttributeEngine::RoDictDefaultPages<TRexUtils::JustSensitive>::getFirst(AttributeEngine::RoDictIterator&, unsigned int) const+0x7b5 at AttributeEngine/Main/Dictionary/RoDictDefaultPages.h:250 (libhdbcs.so)
2: 0x00007fbf95763449 in AttributeEngine::DeltaMerge::MergeValueIdsStep<TrexTypes::RawAttributeValue, AttributeEngine::ValueDict<TrexTypes::RawAttributeValue>, AttributeEngine::BTreeAttribute<TrexTypes::RawAttributeValue>, AttributeEngine::DeltaMerge::DictStatUpdater<AttributeEngine::ValueDict<TrexTypes::RawAttributeValue>, TrexTypes::RawAttributeValue, AttributeEngine::BTreeAttribute<TrexTypes::RawAttributeValue>, AttributeEngine::DictInfo, void> >::doStepImpl()+0xf45 at AttributeEngine/Main/Dictionary/RoDictUnified.h:622 (libhdbcs.so)
3: 0x00007fbf95771cc5 in AttributeEngine::DeltaMerge::SingleMergePipeline<TrexTypes::RawAttributeValue, AttributeEngine::ValueDict<TrexTypes::RawAttributeValue>, AttributeEngine::BTreeAttribute<TrexTypes::RawAttributeValue> >::merge(AttributeEngine::BTreeAttribute<TrexTypes::RawAttributeValue> const&, AttributeEngine::DeltaMerge::AttributeMergeData&, AttributeEngine::ValueDict<TrexTypes::RawAttributeValue>&, AttributeEngine::DeltaMerge::PrepareNewDictCallback<AttributeEngine::ValueDict<TrexTypes::RawAttributeValue> >&, AttributeEngine::DeltaMerge::SingleMergeIndexVector&, int&)+0x2a1 at AttributeEngine/DeltaMerge/SingleMergePipeline.cpp:182 (libhdbcs.so)
4: 0x00007fbf96ed4ce3 in AttributeEngine::SingleAttribute<TrexTypes::RawAttributeValue, AttributeEngine::ValueDict<TrexTypes::RawAttributeValue> >::mergeOldIntoNew(AttributeEngine::AttributeValueContainer*, AttributeEngine::DeltaMerge::AttributeMergeData&, AttributeEngine::CAN_MERGE)+0x270 at AttributeEngine/DeltaMerge/SingleMergePipeline.cpp:132 (libhdbcs.so)
5: 0x00007fbf9b13b144 in AttributeEngine::MemoryAvc2::prepareDeltaMerge(AttributeEngine::AttributeValueContainer*, AttributeEngine::DeltaMerge::AttributeMergeData&, bool)+0xe0 at AttributeEngine/AttributeValueContainer.cpp:3154 (libhdbcs.so)
6: 0x00007fbf9b0f47d6 in AttributeEngine::AttributeApi::prepareDeltaMerge(TrexBase::IndexName const&, AttributeEngine::DeltaMerge::AttributeMergeData&, bool)+0x642 at AttributeEngine/AttributeApi.cpp:1606 (libhdbcs.so)
7: 0x00007fbfa5eec047 in TRexAPI::MergeAttributeJob::doMerge(TRexAPI::MergeAttributeInfo&, TRexAPI::DeltaMergeState&, bool)+0x53 at ims_search_api/DeltaMerge/MergeAttributeJob.cpp:163 (libhdbcsapi.so)
8: 0x00007fbfa5ef222c in TRexAPI::MergeAttributeJob::run(Execution::Context&, Execution::JobObject&)+0x208 at ims_search_api/DeltaMerge/MergeAttributeJob.cpp:270 (libhdbcsapi.so)
9: 0x00007fbf860a1080 in Execution::JobObjectImpl::run(Execution::JobWorker*)+0x15b0 at Basis/Execution/impl/JobExecutionLog.hpp:155 (libhdbbasis.so)
10: 0x00007fbf860ae811 in Execution::JobWorker::runJob(ltt::smartptr_handle<Execution::JobObjectForHandle>&)+0x710 at Basis/Execution/impl/JobExecutorThreads.cpp:366 (libhdbbasis.so)
11: 0x00007fbf860b04ab in Execution::JobWorker::run(Execution::ThreadRC&)+0x877 at Basis/Execution/impl/JobExecutorThreads.cpp:1354 (libhdbbasis.so)
12: 0x00007fbf861009de in Execution::Thread::staticMainImp(Execution::Thread*)+0x53a at Basis/Execution/impl/Thread.cpp:574 (libhdbbasis.so)
13: 0x00007fbf86108c15 in Execution::pthreadFunctionWrapper(Execution::Thread*)+0x1c1 at Basis/Execution/impl/ThreadInterposition.cpp:703 (libhdbbasis.so)
14: 0x0000000000000000 in <no symbol>+0x0 (<unknown>)
[144261]{-1}[1738/9563233304] 2024-04-22 01:16:04.734247 e Basis FaultProtectionImpl.cpp(01061) : MCE: not for us to handle, let's hope the user installed a handler with DIAG_SEH_START. Terminating with ContinueSearch.
- /var/log/messages shows the underlying hardware memory error and references the same JobWrk6463019 thread that crashed in HANA. The OS messages are stamped exactly 3 hours later, to the second (2024-04-22T04:15:27). This is most likely a time zone offset rather than a delay: the kernel log is written in UTC (+00:00), whereas the HANA traces use local time.
2024-04-22T04:15:27.524646+00:00 Hostname kernel: [11607325.863014] mce: Uncorrected hardware memory error in user-access at 26819f09000
2024-04-22T04:15:27.524673+00:00 Hostname kernel: [11607325.863934] mce: [Hardware Error]: Machine check events logged
2024-04-22T04:15:27.527929+00:00 Hostname kernel: [11607325.867046] Memory failure: 0x26819f09: Sending SIGBUS to hdbindexserver:17834 due to hardware memory corruption
2024-04-22T04:15:27.527935+00:00 Hostname kernel: [11607325.867052] Memory failure: 0x26819f09: recovery action for dirty LRU page: Recovered
2024-04-22T04:16:04.737972+00:00 Hostname kernel: [11607363.074869] MCE: Killing JobWrk6463019:144261 due to hardware memory corruption fault at 76add6544610
2024-04-22T04:16:04.737985+00:00 Hostname kernel: [11607363.075368] MCE: Killing JobWrk6463019:144261 due to hardware memory corruption fault at 76add6544590
2024-04-22T04:16:04.737993+00:00 Hostname kernel: [11607363.075386] MCE: Killing JobWrk6463019:144261 due to hardware memory corruption fault at 76add6544591
2024-04-22T04:16:04.737994+00:00 Hostname kernel: [11607363.075402] MCE: Killing JobWrk6463019:144261 due to hardware memory corruption fault at 76add6544592
2024-04-22T04:16:04.857910+00:00 Hostname kernel: [11607363.194882] MCE: Killing JobWrk6463019:144261 due to hardware memory corruption fault at 76add6544ffc
2024-04-22T04:16:04.857913+00:00 Hostname kernel: [11607363.194899] MCE: Killing JobWrk6463019:144261 due to hardware memory corruption fault at 76add6544ffd
2024-04-22T04:16:04.857915+00:00 Hostname kernel: [11607363.194916] MCE: Killing JobWrk6463019:144261 due to hardware memory corruption fault at 76add6544ffe
2024-04-22T04:16:04.857915+00:00 Hostname kernel: [11607363.194934] MCE: Killing JobWrk6463019:144261 due to hardware memory corruption fault at 76add6544fff
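The correlation between the HANA traces and the kernel messages can be checked with a short script. The following Python sketch is a minimal illustration using only values copied from the excerpts above. The variable names are hypothetical, and the 4096-byte page size is taken from the "range: 4096" value in the crash dump. It computes the wall-clock offset between the two logs, converts the page frame number from the "Memory failure" message into a physical address, and confirms that every address in the "MCE: Killing" messages belongs to the page reported in the SIGBUS.

```python
from datetime import datetime

PAGE_SIZE = 4096  # assumed from "range: 4096" in the crash dump

# Timestamps copied from the crash dump (local time) and the kernel log (UTC).
hana_local = datetime.strptime("2024-04-22 01:15:27", "%Y-%m-%d %H:%M:%S")
kernel_utc = datetime.fromisoformat("2024-04-22T04:15:27.524646+00:00")

# Compare wall-clock values only: drop the UTC marker and the sub-second part.
offset = kernel_utc.replace(tzinfo=None, microsecond=0) - hana_local
print(offset)  # 3:00:00

# "Memory failure: 0x26819f09" is a page frame number; multiply by the page
# size to obtain the physical address reported by the mce message.
pfn = 0x26819F09
phys_addr = pfn * PAGE_SIZE
print(hex(phys_addr))  # 0x26819f09000

# Virtual addresses from the "MCE: Killing JobWrk6463019" messages.
killed_addrs = [
    0x76ADD6544610, 0x76ADD6544590, 0x76ADD6544591, 0x76ADD6544592,
    0x76ADD6544FFC, 0x76ADD6544FFD, 0x76ADD6544FFE, 0x76ADD6544FFF,
]
sigbus_page = 0x76ADD6544000  # "addr" from the SIGBUS siginfo dump

# Mask off the in-page offset and compare with the SIGBUS page base.
same_page = all(a & ~(PAGE_SIZE - 1) == sigbus_page for a in killed_addrs)
print(same_page)  # True
```

A whole-hour offset with matching seconds, together with the matching page addresses, supports reading both logs as descriptions of the same hardware event.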
Environment
SAP HANA Platform Edition 2.0
Product
SAP HANA, platform edition 2.0
Keywords
accessing physical memory marked as broken by the hardware/OS, CsTablePartMerge::mergeDeltaIndex2, SavepointImpl.cpp, FaultProtectionImpl.cpp, KBA, HAN-DB, SAP HANA Database, BC-OP-LNX, Linux, Problem