Symptom
1. File-based and backint-based HANA DB data backups fail.
2. Generic I/O subsystem errors raised while reading from the persistence, like those in the examples below, appear in the HANA DB backup logs, the respective service traces, and the OS logs.
2.1. backup.log:
"
<timestamp> <OID> <recovery_id> INFO BACKUP state of service: indexserver, <hostname>:<port>, volume: 3, BackupError
<timestamp> <OID> <recovery_id> INFO BACKUP state of service: xsengine, <hostname>:<port>, volume: 2, BackupCancelPending
<timestamp> <OID> <recovery_id> INFO BACKUP state of service: indexserver, <hostname>:<port>, volume: 3, BackupCancelPending
<timestamp> <OID> <recovery_id> INFO BACKUP state of service: indexserver, <hostname>:<port>, volume: 3, BackupAbortSavepointInProgress
<timestamp> <OID> <recovery_id> INFO BACKUP state of service: xsengine, <hostname>:<port>, volume: 2, BackupAbortSavepointInProgress
<timestamp> <OID> <recovery_id> INFO BACKUP state of service: indexserver, <hostname>:<port>, volume: 3, BackupAbortSavepointFinished
<timestamp> <OID> <recovery_id> INFO BACKUP state of service: xsengine, <hostname>:<port>, volume: 2, BackupAbortSavepointFinished
<timestamp> <OID> <recovery_id> ERROR BACKUP SAVE DATA finished with error: [447] backup could not be completed, [2000008] Error during asynchronous file transfer (io_getevents), rc=5: Input/output error; $fileCallback$=[R] , buffer= 0x00007f097f9a9000, offset= 69746544640, size= 0/4096, file= "<root>/datavolume_0000.dat" ((open, mode= RW, access= rw-rw-r--, flags= ASYNC|DIRECT|MUST_EXIST|LOCK), factory= (root= "/hana/data/<SID>/mnt00001/hdb00003/" (access= rw-rw-r--, flags= AUTOCREATE_PATH, usage= DATA, fs= xfs, config= (async_write_submit_active=on,async_write_submit_blocks=all,async_read_submit=on,num_submit_queues=1,num_completion_queues=1,size_kernel_io_queue=512,max_parallel_io_requests=64,min_submit_batch_size=16,max_submit_batch_size=64))) {shortRetries= 0, fullRetries= 10 (10/10)}
"
2.2. indexserver_<hostname>.<port>.<counter>.trc:
"
[145463]{-1}[-1/-1] <timestamp> e FileIO LocalFileCallback.cpp(00319) : Error in async. file transfer: 5 (Input/output error) - still failing after 10 retries, ABORT - [R] , buffer= 0x00007f147e6ad000, offset= 69726109696, size= 0/67108864, file= "<root>/datavolume_0000.dat" ((open, mode= RW, access= rw-rw-r--, flags= ASYNC|DIRECT|MUST_EXIST|LOCK), factory= (root= "/hana/data/<SID>/mnt00001/hdb00003/" (access= rw-rw-r--, flags= AUTOCREATE_PATH, usage= DATA, fs= xfs, config= (async_write_submit_active=on,async_write_submit_blocks=all,async_read_submit=on,num_submit_queues=1,num_completion_queues=1,size_kernel_io_queue=512,max_parallel_io_requests=64,min_submit_batch_size=16,max_submit_batch_size=64))) {shortRetries= 0, fullRetries= 10 (10/10)}
[145463]{-1}[-1/-1] <timestamp> e FileIO LocalFileCallback.cpp(00358) : ERROR in async. file transfer: exception 1: no.2000008 (Basis/IO/FileAccess/impl/LocalFileCompletionThread.cpp:315) TID: 145463
Error during asynchronous file transfer (io_getevents), rc=5: Input/output error; $fileCallback$=[R] , buffer= 0x00007f147e6ad000, offset= 69726109696, size= 0/67108864, file= "<root>/datavolume_0000.dat" ((open, mode= RW, access= rw-rw-r--, flags= ASYNC|DIRECT|MUST_EXIST|LOCK), factory= (root= "/hana/data/<SID>/mnt00001/hdb00003/" (access= rw-rw-r--, flags= AUTOCREATE_PATH, usage= DATA, fs= xfs, config= (async_write_submit_active=on,async_write_submit_blocks=all,async_read_submit=on,num_submit_queues=1,num_completion_queues=1,size_kernel_io_queue=512,max_parallel_io_requests=64,min_submit_batch_size=16,max_submit_batch_size=64))) {shortRetries= 0, fullRetries= 10 (10/10)}
exception throw location:
1: 0x00007f7d0f38f5d8 in FileAccess::LocalFileIOControlBlock::reportError(char const*, unsigned long, char const*, int)+0x34 at LocalFileCallback.cpp:351 (libhdbbasis.so)
2: 0x00007f7d0f395fd9 in FileAccess::LocalFileCompletionThread::run(void*&)+0xe05 at LocalFileCompletionThread.cpp:315 (libhdbbasis.so)
3: 0x00007f7d0f3e3565 in Execution::Thread::staticMainImp(Execution::Thread*)+0x531 at Thread.cpp:523 (libhdbbasis.so)
4: 0x00007f7d0f3d38e8 in Execution::pthreadFunctionWrapper(Execution::Thread*)+0x1e4 at ThreadInterposition.cpp:650 (libhdbbasis.so)
5: 0x00007f7d0e9e04f9 in start_thread+0xd5 (libpthread.so.0)
6: 0x00007f7d0d088ecf in __GI___clone+0x3b (libc.so.6)
exception type information:
- 0: public ltt::exception@0x7f7d0e946ba8
"
2.3. /var/log/messages:
"
<timestamp> <hostname> kernel: [113152523.375175] megaraid_sas 0000:01:00.0: 173946 (781251031s/0x0002/FATAL) - Unrecoverable medium error during recovery on PD 0e(e0x05/s10) at <address>
<timestamp> <hostname> kernel: [113152523.409272] megaraid_sas 0000:01:00.0: 173948 (781251031s/0x0002/FATAL) - Unrecoverable medium error during recovery on PD 0e(e0x05/s10) at <address>
<timestamp> <hostname> kernel: [113152523.421913] megaraid_sas 0000:01:00.0: 173950 (781251031s/0x0002/FATAL) - Unrecoverable medium error during recovery on PD 0e(e0x05/s10) at <address>
<timestamp> <hostname> kernel: [113152523.440877] megaraid_sas 0000:01:00.0: 173952 (781251031s/0x0002/FATAL) - Unrecoverable medium error during recovery on PD 0e(e0x05/s10) at <address>
<timestamp> <hostname> kernel: [113152523.457253] megaraid_sas 0000:01:00.0: 173954 (781251031s/0x0002/FATAL) - Unrecoverable medium error during recovery on PD 0e(e0x05/s10) at <address>
<timestamp> <hostname> kernel: [113152523.474016] megaraid_sas 0000:01:00.0: 173956 (781251031s/0x0002/FATAL) - Unrecoverable medium error during recovery on PD 0e(e0x05/s10) at <address>
<timestamp> <hostname> kernel: [113152523.482607] scsi_io_completion: 12 callbacks suppressed
<timestamp> <hostname> kernel: [113152523.482627] sd 0:2:1:0: [<device>] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
<timestamp> <hostname> kernel: [113152523.482633] sd 0:2:1:0: [<device>] tag#0 CDB: Read(16) 88 00 00 00 00 00 1c f9 b7 00 00 00 02 00 00 00
<timestamp> <hostname> kernel: [113152523.482639] print_req_error: 12 callbacks suppressed
<timestamp> <hostname> kernel: [113152523.482642] print_req_error: I/O error, dev <device>, sector <sector>
"
Environment
SAP HANA, platform edition 1.0
SAP HANA, platform edition 2.0
Keywords
backup, rc=5, FileAccess::LocalFileIOControlBlock::reportError, KBA, HAN-DB-BAC, SAP HANA Backup & Recovery, HAN-DB-PER, SAP HANA Database Persistence, BC-OP-LNX, Linux, Problem