SAP Knowledge Base Article - Preview

2712064 - SAP HANA System Replication Error port 4#### already in use

Symptom

  • You are setting up SAP HANA System Replication.
  • The secondary site registration command fails with the following error:
    nameserver <secondary_site_hostname>:30001 not responding.
    collecting information ...
    unable to contact primary site host xx.xxx.xx.xx (<primary_site_hostname>):40006. internal error,location=xx.xxx.xx.xx:40002. Trying old-style port (port offset +100)...xx.xxx.xx.xx (<primary_site_hostname>):40006
    error: unable to contact primary site; to xx.xxx.xx.xx (<primary_site_hostname>):30106; original error: internal error,location=xx.xxx.xx.xx:30102;
    failed. trace file nameserver_<secondary_site_hostname>.00000.000.trc may contain more error details.
  • The trace file nameserver_<hostname>.00000.000.trc contains entries similar to the following:
    [119501]{-1}[-1/-1] 2020-06-21 08:09:56.980152 e commlib commlibImpl.cpp(00969) : ERROR: comm::connect to Host: 127.0.0.1, port: 30001, Error: exception 1: no.2110017 (Basis/IO/Stream/impl/NetworkChannel.cpp:3038)
    System error: SO_ERROR has pending error for socket. rc=111: Connection refused. channel={<NetworkChannelSSLFilter>={<NetworkChannelBase>={this=140527165233176, fd=4, refCnt=1, local=127.0.0.1/52257_tcp, remote=127.0.0.1/30001_tcp, state=ConnectWait, pending=[----]}}}; $Context$=[e6249f376cbd0016,127.0.0.1:52257,127.0.0.1:30001,TRN,0]; $Context$=[e6249f376cbd0016,127.0.0.1:52257,127.0.0.1:30001,TRN,0]; $channel$={<NetworkChannelSSLFilter>={<NetworkChannelBase>={this=140527165233176, fd=INVALID, refCnt=1, local=127.0.0.1/52257_tcp, remote=127.0.0.1/30001_tcp, state=Closed, pending=[----]}}}
    exception throw location:
    1: 0x00007fcf3126b279 in .LTHUNK27.lto_priv.2295+0x4c5 at NetworkChannel.cpp:3038 (libhdbbasis.so)
    2: 0x00007fcf31265e5f in Stream::NetworkChannelSSLFilter::initiateConnection()+0x5b at NetworkChannelSSLFilter.cpp:204 (libhdbbasis.so)
    3: 0x00007fcf31265eb9 in Stream::NetworkChannelSSLFilter::initClientChannel()+0x5 at NetworkChannelSSLFilter.cpp:75 (libhdbbasis.so)
    4: 0x00007fcf3129680e in Stream::NetworkChannelManager::connect(NetworkAccess::NetworkAddress const*, NetworkAccess::NetworkAddress const&, Stream::NetworkChannelParameters const&, bool, ltt::smartptr_handle<Stream::ChannelCallback>*, int, Stream::NetworkChannelComponent)+0x18a at NetworkChannelManager.cpp:180 (libhdbbasis.so)
    5: 0x00007fcf31296f86 in Stream::NetworkChannelManager::connect(NetworkAccess::NetworkAddress const*, NetworkAccess::NetworkAddress const&, Stream::NetworkChannelParameters const&, ltt::smartptr_handle<Stream::ChannelCallback>*, int, Stream::NetworkChannelComponent)+0x12 at NetworkChannelManager.cpp:127 (libhdbbasis.so)
    6: 0x00007fcf33632cf6 in comm::connect(void*, char const*, unsigned short, int, Crypto::Configuration*, Stream::NetworkChannelComponent)+0x252 at commlibImpl.cpp:957 (libhdbbasement.so)
    7: 0x00007fcf338513a1 in TrexNet::Channel::open(char const*, ltt::smartptr_handle<Crypto::Configuration>&)+0x2e0 at Channel.cpp:298 (libhdbbasement.so)
    8: 0x00007fcf33852370 in TrexNet::ServerRep::openNewChannel(char const*, ltt::smartptr_handle<Crypto::Configuration>&)+0x60 at EndPoint.cpp:292 (libhdbbasement.so)
    9: 0x00007fcf336db9bc in TrexNet::Requestor::getChannel(char const*, unsigned short, char, ltt::smartptr_handle<Crypto::Configuration>&)+0x128 at Requestor.cpp:174 (libhdbbasement.so)
    10: 0x00007fcf3384de4c in TrexNet::Request::Request(char const*, TRexUtils::HostAndPort const&, char, ltt::smartptr_handle<Crypto::Configuration>)+0x2c8 at Request.cpp:550 (libhdbbasement.so)
    11: 0x00007fcf3384e728 in TrexNet::Request::Request(char const*, char const*, unsigned short, char)+0xd4 at Request.cpp:502 (libhdbbasement.so)
    12: 0x00007fcf33992adc in NameServer::TNSInfo::sendRequestTo(NameServer::Request const&, NameServer::Response&, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> > const&, unsigned short, NameServer::TrexNetRequestHolder*, bool, char)+0xe8 at TNSInfo.cpp:472 (libhdbbasement.so)
    13: 0x00007fcf33993bc4 in NameServer::TNSInfo::processRequest(NameServer::Request const&, NameServer::Response&)+0x290 at TNSInfo.cpp:381 (libhdbbasement.so)
    14: 0x00007fcf33998230 in NameServer::TNSClient::processRequest(NameServer::Request const&, NameServer::Response&)+0x20 at TNSClient.cpp:519 (libhdbbasement.so)
    15: 0x00007fcf339b6a7e in NameServer::TNSClient::storeTrees(ltt_adp::vector<NameServer::TNode, ltt::integral_constant<bool, true> > const&)+0x10a at TNSClient.cpp:625 (libhdbbasement.so)
    16: 0x000055bb05004177 in NameServerCmd::TopologyCmdAction::isNsActive(ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >&, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> > const&, bool) [clone .constprop.185]+0x93 at TopologyCmdAction.cpp:86 (hdbnsutil)
    17: 0x000055bb04f71cc5 in registerNewDatacenter(ltt_adp::map<ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >, ltt_adp::vector<ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >, ltt::integral_constant<bool, true> >, NameServerCmd::CommandUtil::CaseInsensitiveArgmapCompare, ltt::integral_constant<bool, true> >&)+0x15f1 at TopologyUtil.cpp:1918 (hdbnsutil)
    18: 0x000055bb04f80c24 in main+0x11b0 at TopologyUtil.cpp:3172 (hdbnsutil)
    19: 0x00007fcf3137b516 in System::mainWrapper(int, char**, char**)+0x72 at IsInMain.cpp:333 (libhdbbasis.so)
    20: 0x00007fcf2f702725 in __libc_start_main+0xf1 (libc.so.6)
    21: 0x000055bb04f85fdd in global constructors keyed to 65535_0_TREXNameserverAllocator.cpp.o.212682+0x109 at start.S:103 (hdbnsutil)
    [119501]{-1}[-1/-1] 2020-06-21 08:09:56.998982 e TrexNet EndPoint.cpp(00299) : ERROR: failed to open channel 127.0.0.1:30001! reason: (internal error)
  • The nameserver trace file on the primary site contains messages similar to the following:
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.174749 i sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01180) : checkAndStartListener(): no listener found...
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.174801 i sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01184) : checkAndStartListener(): try start listener...
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.174876 i sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01224) : Start listen to global interface port:40001
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175451 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01259) : Listener cannot be started, because port 40001 is already in use!
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175462 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01260) : A system replication primary uses replication ports in the range of instance number(s) from 00 to 00
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175466 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01261) : Please check, that there is no other system on this machine using instancenr 00! This is just a hint and possibly not the root cause ..
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175469 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01262) : In general the port range 40000-40099 must not be used by any other process when system replication is turned on!
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175472 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01263) : You may need to set ip_local_port_range as Multitenant Database, please check "System Replication with Tenant Databases" section in admin guide and SAP note 2382421, 401162
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175484 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01192) : checkAndStartListener(): listener start failed: exception 1: no.2110008 (Basis/IO/Stream/impl/NetworkChannel.cpp:1261)
    Error address in use: $msg$, rc=98: Address already in use; $[1]$=NetworkChannelBase::bindLocal. bind failed; $Context$=[2514cd6e9cfc7bdc,0.0.0.0:40001,-,UNK,0]; $channel$={<NetworkChannelBase>={this=140174047475464, fd=134, refCnt=1, local=0.0.0.0/40001_tcp, remote=(invalid), state=New, pending=[----]}}
    exception throw location:
    1: 0x00007f8132f9774e in Stream::NetworkChannelBase::bindLocal()+0x13a at NetworkChannel.cpp:1261 (libhdbbasis.so)
    2: 0x00007f8132fa3119 in Stream::NetworkChannelBase::NetworkChannelBase(Stream::NetworkChannelCompletionHandler&, Stream::NetworkChannelParameters const&, NetworkAccess::NetworkAddress const&, Stream::CompletionThreadType)+0x235 at NetworkChannel.cpp:730 (libhdbbasis.so)
    3: 0x00007f8132fa3e6e in Stream::NetworkListener::NetworkListener(Stream::NetworkChannelCompletionHandler&, Stream::NetworkChannelParameters const&, int, NetworkAccess::NetworkAddress const&, ltt::smartptr_handle<Stream::ConnectionCallback>&, Stream::CompletionThreadType)+0x3a at NetworkChannel.cpp:3390 (libhdbbasis.so)
    4: 0x00007f8132fbdd23 in Stream::NetworkChannelManager::listen(NetworkAccess::NetworkAddress const&, Stream::NetworkChannelParameters const&, unsigned int, ltt::smartptr_handle<Stream::ConnectionCallback>&, Stream::CompletionThreadType, Stream::NetworkChannelComponent)+0xf0 at NetworkChannelSSLFilter.hpp:300 (libhdbbasis.so)
    5: 0x00007f8134181e6d in DataAccess::DisasterRecoveryPrimaryHandlerImpl::startListener()+0x549 at DisasterRecoveryPrimaryImpl.cpp:1239 (libhdbdataaccess.so)
    6: 0x00007f81341e92fc in DataAccess::DisasterRecoveryPrimaryHandlerImpl::checkAndStartListener()+0x258 at DisasterRecoveryPrimaryImpl.cpp:1187 (libhdbdataaccess.so)
    7: 0x00007f81341e95a1 in DataAccess::PrimaryTimerCallback::timeoutReached()+0x20 at DisasterRecoveryTimers.cpp:60 (libhdbdataaccess.so)
    8: 0x00007f8132f7f45a in Execution::TimerThread::TimerCallback::execProcessTime()+0x16 at LockedScope.hpp:54 (libhdbbasis.so)
    9: 0x00007f8132f4c4b7 in Execution::JobObjectImpl::run(Execution::JobWorker*)+0x1463 at JobExecutorImpl.cpp:1136 (libhdbbasis.so)
    10: 0x00007f8132f4ea76 in Execution::JobWorker::run(void*&)+0x6e2 at JobExecutorThreads.cpp:327 (libhdbbasis.so)
    11: 0x00007f8132f0e416 in Execution::Thread::staticMainImp(void**)+0x3f2 at Thread.cpp:540 (libhdbbasis.so)
    12: 0x00007f8132f623d6 in Execution::Thread::staticMain(void*)+0x22 at ThreadMain.cpp:31 (libhdbbasis.so)
    13: 0x00007f8132f0c9e9 in Execution::pthreadFunctionWrapper(Execution::PthreadWrapperInfo*)+0x375 at Thread.cpp:1083 (libhdbbasis.so)
    14: 0x00007f8132a14724 in start_thread+0xc0 (libpthread.so.0)
    15: 0x00007f81316a9e8d in __clone+0x69 (libc.so.6)
    , try later
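
The primary-site trace above reports that the listener cannot bind port 40001 (rc=98, address already in use), names 40000-40099 as the replication port range for instance 00, and hints at ip_local_port_range. As a rough diagnostic aid, the following Python sketch checks whether a port can currently be bound and whether an ephemeral port range overlaps the replication range. All function names are hypothetical, and the formula 40000 + 100 × instance number is an assumption generalized from the single range shown in the trace. The SAP Notes cited in the trace remain the authoritative source for the correct setting.

```python
from __future__ import annotations

import errno
import socket


def replication_port_range(instance_number: int) -> tuple[int, int]:
    """Return the inclusive port range assumed for system replication.

    Assumption: the range is 4NN00-4NN99 for instance number NN, which
    matches the 40000-40099 range the trace reports for instance 00.
    """
    if not 0 <= instance_number <= 99:
        raise ValueError("instance number must be between 00 and 99")
    first = 40000 + instance_number * 100
    return first, first + 99


def parse_local_port_range(text: str) -> tuple[int, int]:
    """Parse the two whitespace-separated integers of ip_local_port_range."""
    low, high = (int(part) for part in text.split())
    return low, high


def ranges_overlap(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """Return True if two inclusive port ranges share at least one port."""
    return a[0] <= b[1] and b[0] <= a[1]


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Try to bind a TCP socket to host:port and release it immediately.

    Returns False when the bind fails with EADDRINUSE (rc=98 on Linux,
    the same error as in the trace); any other OSError is re-raised.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise
        return True
```

On a Linux host, the current ephemeral range can be read from /proc/sys/net/ipv4/ip_local_port_range and passed to parse_local_port_range(); if ranges_overlap() then returns True for replication_port_range(0), outgoing connections may be assigned local ports inside the replication range. The bind probe reflects only the moment it runs, so a port that is free during the check can still be taken later.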

"Image/data in this KBA is from SAP internal systems, sample data, or demo systems. Any resemblance to real data is purely coincidental."

Environment

  • SAP HANA, Platform Edition
  • SAP HANA Database

Product

SAP HANA, platform edition (all versions)

Keywords

secondary, primary, hsr, hana_system_replication, replication, site, registration, fails, failing, err, can't register, registry, syncmem, sync, async, log_replay, logreplay, KBA, HAN-DB-HA, SAP HANA High Availability (System Replication, DR, etc.), Problem
