Hello,

I have 3 indexers. After one of them was restarted, the master node started crashing, and it now writes a crash log every minute (each time the indexer tries to connect to the cluster).

The crash log is below:

[build cd0848707637] 2022-03-29 17:48:34
Received fatal signal 6 (Aborted) on PID 3183981.
 Cause:
   Signal sent by PID 3183981 running under UID 1004.
 Crashing thread: CMAddPeerWorker-5
 Registers:
    RIP:  [0x00007FDB3792137F] gsignal + 271 (libc.so.6 + 0x3737F)
    RDI:  [0x0000000000000002]
    RSI:  [0x00007FDB121F9860]
    RBP:  [0x00007FDB37A74698]
    RSP:  [0x00007FDB121F9860]
    RAX:  [0x0000000000000000]
    RBX:  [0x0000000000000006]
    RCX:  [0x00007FDB3792137F]
    RDX:  [0x0000000000000000]
    R8:  [0x0000000000000000]
    R9:  [0x00007FDB121F9860]
    R10:  [0x0000000000000008]
    R11:  [0x0000000000000246]
    R12:  [0x0000555F4AA9B818]
    R13:  [0x0000555F4A93BC02]
    R14:  [0x00000000000003C2]
    R15:  [0x00007FDB16506238]
    EFL:  [0x0000000000000246]
    TRAPNO:  [0x0000000000000000]
    ERR:  [0x0000000000000000]
    CSGSFS:  [0x002B000000000033]
    OLDMASK:  [0x0000000000000000]
 OS: Linux
 Arch: x86-64
 Backtrace (PIC build):
  [0x00007FDB3792137F] gsignal + 271 (libc.so.6 + 0x3737F)
  [0x00007FDB3790BDB5] abort + 295 (libc.so.6 + 0x21DB5)
  [0x00007FDB3790BC89] ? (libc.so.6 + 0x21C89)
  [0x00007FDB37919A76] ? (libc.so.6 + 0x2FA76)
  [0x0000555F497B294F] _ZN8CMBucket14setRASummariesERK4GuidRKSt3mapI3Str15CMBucketSummarySt4lessIS4_ESaISt4pairIKS4_S5_EEE + 623 (splunkd + 0x28C694F)
  [0x0000555F496C13C8] _ZN15CMAddPeerWorker15finishAddBucketERP8CMBucketR15BucketCSVStruct + 136 (splunkd + 0x27D53C8)
  [0x0000555F496C2320] _ZN15CMAddPeerWorker19addStandaloneBucketERK13IndexDataTypeR15BucketCSVStruct + 128 (splunkd + 0x27D6320)
  [0x0000555F496C24B3] _ZN15CMAddPeerWorker20processBucketBatchesEv + 291 (splunkd + 0x27D64B3)
  [0x0000555F48757588] _ZN15CMAddPeerWorker4mainEv + 552 (splunkd + 0x186B588)
  [0x0000555F4959B917] _ZN6Thread8callMainEPv + 135 (splunkd + 0x26AF917)
  [0x00007FDB37CB717A] ? (libpthread.so.0 + 0x817A)
  [0x00007FDB379E6DC3] clone + 67 (libc.so.6 + 0xFCDC3)
 Linux / splunk-master-prod-01.local.ad / 4.18.0-240.1.1.el8_3.x86_64 / #1 SMP Fri Oct 16 13:36:46 EDT 2020 / x86_64
 Libc abort message: splunkd: /opt/splunk/src/clustering/CMBucket.cpp:962: void CMBucket::setRASummaries(const Guid&, const CMBucketSummaries&): Assertion `hasPeer(peer)' failed.
 /etc/redhat-release: Red Hat Enterprise Linux release 8.5 (Ootpa)
 glibc version: 2.28
 glibc release: stable
Last errno: 0
Threads running: 103
Runtime: 56.398836s
argv: [splunkd --under-systemd --systemd-delegate=yes -p 8089 _internal_launch_under_systemd]
Regex JIT enabled
RE2 regex engine enabled
using CLOCK_MONOTONIC
Thread: "CMAddPeerWorker-5", did_join=0, ready_to_run=Y, main_thread=N, token=140578878629632
MutexByte: MutexByte-waiting={none}
x86 CPUID registers:
         0: 0000000D 756E6547 6C65746E 49656E69
         1: 000306F0 07040800 FFFA3203 1F8BFBFF
         2: 76036301 00F0B5FF 00000000 00C30000
         3: 00000000 00000000 00000000 00000000
         4: 00000000 00000000 00000000 00000000
         5: 00000000 00000000 00000000 00000000
         6: 00000004 00000000 00000000 00000000
         7: 00000000 00000000 00000000 00000000
         8: 00000000 00000000 00000000 00000000
         9: 00000000 00000000 00000000 00000000
         A: 07300401 000000FF 00000000 00000000
         B: 00000000 00000000 00000047 00000007
         C: 00000000 00000000 00000000 00000000
          00000000 00000000 00000000 00000000
  80000000: 80000008 00000000 00000000 00000000
  80000001: 00000000 00000000 00000021 2C100800
  80000002: 65746E49 2952286C 6F655820 2952286E
  80000003: 55504320 2D354520 30383632 20347620
  80000004: 2E322040 48473034 0000007A 00000000
  80000005: 00000000 00000000 00000000 00000000
  80000006: 00000000 00000000 01006040 00000000
  80000007: 00000000 00000000 00000000 00000100
  80000008: 0000302B 00000000 00000000 00000000
terminating...

Indexer-1 (the one that was rebooted) cannot join the cluster.

Has anyone had this problem, and how did you deal with it?

If more info is needed, I can send it.

Did you ever figure out this issue? I'm experiencing a very similar one: I have 3 indexers, restarted 1 of them, and the master node crashed. I even got the same error message as you:

splunkd: /opt/splunk/src/clustering/CMBucket.cpp:962: void CMBucket::setRASummaries(const Guid&, const CMBucketSummaries&): Assertion 'hasPeer(peer)' failed.
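For anyone who wants to check whether their crashes really match, here is a minimal sketch that pulls the crashing thread and the libc abort message out of a crash log. It assumes the log has the same "Crashing thread:" and "Libc abort message:" lines as the one posted above; the function name `summarize_crash_log` is made up for illustration and is not part of any Splunk tooling.

```python
import re

def summarize_crash_log(text):
    """Return the crashing thread and libc abort message from a crash log.

    Assumes the field labels seen in the log posted in this thread;
    a field that is not found is returned as None.
    """
    thread = re.search(r"^\s*Crashing thread:\s*(.+)$", text, re.MULTILINE)
    abort = re.search(r"^\s*Libc abort message:\s*(.+)$", text, re.MULTILINE)
    return {
        "crashing_thread": thread.group(1).strip() if thread else None,
        "abort_message": abort.group(1).strip() if abort else None,
    }

# Two lines copied from the crash log above, used as sample input.
sample = """\
 Crashing thread: CMAddPeerWorker-5
 Libc abort message: splunkd: /opt/splunk/src/clustering/CMBucket.cpp:962: void CMBucket::setRASummaries(const Guid&, const CMBucketSummaries&): Assertion `hasPeer(peer)' failed.
"""

summary = summarize_crash_log(sample)
print(summary["crashing_thread"])
print(summary["abort_message"])
```

Running it over each crash log file and comparing the two fields should show quickly whether every crash hits the same assertion in the same worker thread.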
