SAP Knowledge Base Article - Preview

1982488 - XI 3.1 Errors in System Logs: "There is a significant problem with socket communication"

Symptom

At peak load, the following errors can be found:

  • In the /var/log/messages file (equivalent to the system logs):
Dec 31 08:04:32 boesvr       boe_cmsd[7533]: There is a significant problem with socket communication
Dec 31 08:04:43 boesvr       last message repeated 68964 times
Dec 31 08:04:48 boesvr       boe_cmsd[7533]: Internal error while trying to send a job to jobserver PRD_BOESVR.JobServer3
Dec 31 08:04:48 boesvr       boe_cmsd[7533]: Jobserver 117955885 returned an error while trying to create a job for the object identified by PRD_BOESVR.JobServer3.
Dec 31 08:04:50 boesvr       boe_cmsd[7533]: There is a significant problem with socket communication
Dec 31 08:04:51 boesvr       last message repeated 14120 times
Dec 31 08:10:09 boesvr       boe_cmsd[7533]: Cluster connection with boesvr:6400 has been broken on notifier side for reason: (Notification resulted in exception IDL:omg.org/CORBA/COMM_FAILURE:1.0, minor code 1330577418). Check network connections to CMS machines, and test responsiveness of system database.
  • In the CMS trace logs:
2014/01/02 11:49:30.097|==| | |31497|1515207568| |||||||||||||||SResourceSource::LoadString There is a significant problem with socket communication
  • In the 'strace' logs:
410   <... write resumed> )             = 2011 <0.001910>
1839  recv(920,  <unfinished ...>
1797  select(1011, [41 42 43 44 46 51 52 53 54 55 56 57 58 59 63 64 66 67 69 70 71 72 74 75 79 80 82 83 84 85 88 90 91 92 93 94 95 97 98 101 105 106 107 114 117 118 121 127 128 138 140 142 143 145 149 150 151 153 156 157 158 161 165 166 167 170 172 173 175 177 179 180 181 183 186 195 199 200 204 205 208 212 218 220 222 224 225 227 232 235 238 240 246 248 251 252 256 259 260 261 265 267 268 269 270 271 273 277 278 281 282 287 288 289 290 293 294 295 303 304 306 308 311 313 316 317 319 324 325 328 329 330 332 335 339 344 345 347 348 349 350 354 355 356 358 361 364 366 372 374 378 379 381 385 388 390 391 397 398 402 413 415 416 417 420 421 423 425 426 427 431 432 433 434 436 439 440 441 444 446 448 451 452 455 456 458 459 462 465 467 468 474 475 476 477 478 479 480 481 483 484 486 487 489 491 493 495 497 498 500 502 505 507 508 511 513 514 515 517 519 523 530 531 532 533 534 536 541 542 548 550 551 553 557 558 562 565 570 572 576 577 579 581 582 584 588 590 594 596 600 601 602 606 608 610 611 615 616 617 619 625 627 631 632 633 637 641 642 646 647 649 653 655 656 657 659 660 661 664 666 668 669 670 672 674 675 677 680 681 683 687 690 693 694 695 698 703 704 705 706 707 713 715 717 718 719 730 733 734 743 744 745 746 748 749 751 753 755 757 760 761 764 770 771 774 775 777 778 782 783 788 789 790 792 796 798 800 802 803 804 805 806 810 812 815 816 819 821 822 823 825 826 830 831 832 833 834 836 838 839 840 841 842 843 844 851 852 856 861 862 863 864 871 873 875 876 878 882 883 885 886 887 891 892 893 894 895 896 898 910 915 924 925 926 929 930 934 936 937 938 941 943 946 947 948 949 952 956 957 958 959 960 964 965 968 970 972 974 975 976 977 979 980 981 982 986 990 992 995 997 999 1000 1001 1002 1003 1004 1010], NULL, NULL, {0, 20000} <unfinished ...>
1796  <... accept resumed> 0x5b733220, [588254148382687360]) = -1 EMFILE (Too many open files) <0.002141>
1793  <... read resumed> "\3\313\0\0\6\0\0\0\0\0\10\3y\0\1\1\f\10\0\0\1\0\0\0\1\0\0%EL\343\0"..., 2064) = 971 <0.002150>
1792  read(36,  <unfinished ...>

The lines that identify the source of the problem are the repeated "There is a significant problem with socket communication" messages and, in the 'strace' output, the 'accept()' call that fails with EMFILE. The errors that follow in the 1st example occur because the CMS cannot communicate with certain resources: it is unable to read the file descriptors that hold the information for the respective sockets. These errors will vary depending on which resource the CMS is attempting to communicate with at the time of the issue.

In the last example, 'strace' shows 'select()' being called on descriptors up to #1010 (an nfds argument of 1011), and the 'accept()' call that follows is unable to obtain an additional file descriptor. The OS returns the error EMFILE ("Too many open files"). In the upcoming fix, the CMS will no longer use the 'select()' function to monitor these file handles for reading and writing and will use 'poll()' instead.
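
The difference between the two calls can be illustrated with a minimal sketch in Python, whose 'select' module wraps the same system calls. This is not the CMS implementation; the function names open_file_limits and wait_readable_with_poll are hypothetical and exist only for this illustration. The sketch reads the per-process open-file limit (the value reported by 'ulimit -n') and waits on a set of sockets with 'poll()', using the same 20 ms timeout seen in the 'strace' output.

```python
import resource
import select
import socket


def open_file_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def wait_readable_with_poll(sockets, timeout_ms=20):
    """Return the sockets that are readable, using poll() instead of select().

    poll() takes an explicit list of descriptors, so it is not bound by the
    fixed fd_set size (FD_SETSIZE, typically 1024 on Linux) that limits
    select().
    """
    poller = select.poll()
    by_fd = {}
    for sock in sockets:
        poller.register(sock.fileno(), select.POLLIN)
        by_fd[sock.fileno()] = sock
    ready = poller.poll(timeout_ms)  # timeout is given in milliseconds
    return [by_fd[fd] for fd, event in ready if event & select.POLLIN]


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open-file limit: soft={soft} hard={hard}")

    # Hypothetical stand-in for a CMS connection: a local socket pair.
    a, b = socket.socketpair()
    try:
        b.sendall(b"ping")
        ready = wait_readable_with_poll([a, b])
        print("readable descriptors:", [s.fileno() for s in ready])
    finally:
        a.close()
        b.close()
```

Note that 'poll()' only removes the descriptor-number ceiling imposed by 'select()'. A process that reaches its open-file limit will still receive EMFILE from 'accept()', so the limit reported by open_file_limits remains relevant.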



Environment

SAP BusinessObjects XI 3.1 SP 5

Linux only

 

Product

SAP BusinessObjects Enterprise XI 3.1

Keywords

socket communication ADAPT01710391 ulimit linux 1024, KBA, BI-BIP-ADM, BI Servers, security, Crystal Reports in Launchpad, Problem
