erthink/ReOpenLDAP

very rare SEGFAULT in syncprov_operational()

Closed this issue · 4 comments

The bug is likely in the bdb/hdb backends (the contextCSN attribute(s) were freed too early).

test050-syncrepl-multimaster for hdb/KFA:

running defines.sh
Initializing server configurations...
Starting server 1 on TCP/IP port 6437...
Using ldapsearch to check that 1 slapd is running (port 6437)...
Waiting 0.2 seconds for 1 slapd to start...
Inserting syncprov overlay on server 1...
Starting server 2 on TCP/IP port 6438...
Using ldapsearch to check that 2 slapd is running (port 6438)...
Waiting 0.2 seconds for 2 slapd to start...
Configuring syncrepl on server 2...
Starting server 3 on TCP/IP port 6439...
Using ldapsearch to check that 3 slapd is running (port 6439)...
Waiting 0.2 seconds for 3 slapd to start...
Configuring syncrepl on server 3...
Starting server 4 on TCP/IP port 6440...
Using ldapsearch to check that 4 slapd is running (port 6440)...
Waiting 0.2 seconds for 4 slapd to start...
Configuring syncrepl on server 4...
Adding schema and databases on server 1...
Using ldapadd to populate server 1...
Waiting while syncrepl replicates a changes (cn=config between 6437 and 6438)...ldapsearch failed at consumer (255, csn=)!
 done
Found some CORE(s): '/ramfs/reopenldap-ci-test/srv2/core', collect it...

gdb:

Core was generated by `/sandbox/tests/../servers/slapd/slapd -D -s0 -d sync,stats,args,trace,conns -F'.
Program terminated with signal 6, Aborted.
#0  0x00007fbdb84575e5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64      return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64 db4-4.7.25-20.el6_7.x86_64 elfutils-libelf-0.164-2.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-57.el6.x86_64 libcom_err-1.41.12-22.el6.x86_64 libgcc-4.4.7-17.el6.x86_64 libiodbc-3.52.7-1.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libtool-ltdl-2.2.6-15.5.el6.x86_64 libuuid-2.17.2-12.24.el6.x86_64 openssl-1.0.1e-48.el6_8.1.x86_64 perl-libs-5.10.1-141.el6_7.1.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) #0  0x00007fbdb84575e5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007fbdb8458dc5 in abort () at abort.c:92
#2  0x00007fbdb845070e in __assert_fail_base (fmt=<value optimized out>, assertion=0x697668 "source and destination MUST NOT overlap", file=0x6976fd "hipagut.c", line=<value optimized out>, function=<value optimized out>) at assert.c:96
#3  0x00007fbdb84507d0 in __assert_fail (assertion=0x697668 "source and destination MUST NOT overlap", file=0x6976fd "hipagut.c", line=654, function=0x6977e0 "ber_memcpy_safe") at assert.c:105
#4  0x00000000005f582a in __ldap_assert_fail (assertion=0x697668 "source and destination MUST NOT overlap", file=0x6976fd "hipagut.c", line=654, function=0x6977e0 "ber_memcpy_safe") at hipagut.c:815
#5  0x00000000005f6500 in ber_memcpy_safe (dest=<value optimized out>, src=<value optimized out>, n=<value optimized out>) at hipagut.c:653
#6  0x00000000005f3724 in ber_dupbv_x (dst=0x7fbd88102f90, src=0x7fbda41b4830, ctx=0x0) at memory.c:434
#7  0x00000000004504a7 in attr_dup2 (tmp=0x7fbda421f780, a=0x25c20e0) at attr.c:223
#8  0x0000000000450962 in attrs_dup (a=0x25c20e0) at attr.c:282
#9  0x00000000004512ff in entry_dup2 (dest=0x7fbda419e398, source=0x25ada28) at entry.c:935
#10 0x000000000045a24f in rs_entry2modifiable (op=0x7fbd88000920, rs=0x7fbd9b7fd520, on=<value optimized out>) at result.c:277
#11 0x00000000005a4317 in syncprov_operational (op=0x7fbd88000920, rs=0x7fbd9b7fd520) at syncprov.c:3338
#12 0x00000000004aeea2 in overlay_op_walk (op=0x7fbd88000920, rs=0x7fbd9b7fd520, which=op_aux_operational, oi=0x7fbda4190610, on=0x7fbda4102ea0) at backover.c:681
#13 0x00000000004af9d7 in over_op_func (op=0x7fbd88000920, rs=<value optimized out>, which=<value optimized out>) at backover.c:751
#14 0x0000000000453d76 in fe_aux_operational (op=0x7fbd88000920, rs=0x7fbd9b7fd520) at backend.c:1951
#15 0x0000000000453946 in backend_operational (op=0x7fbd88000920, rs=<value optimized out>) at backend.c:1972
#16 0x0000000000458d47 in slap_send_search_entry (op=0x7fbd88000920, rs=0x7fbd9b7fd520) at result.c:1027
#17 0x00000000004371ee in config_send (op=0x7fbd88000920, rs=0x7fbd9b7fd520, ce=0x25c13c0, depth=0) at bconfig.c:4536
#18 0x00000000004372c1 in config_back_search (op=0x7fbd88000920, rs=0x7fbd9b7fd520) at bconfig.c:6570
#19 0x00000000004aeede in overlay_op_walk (op=0x7fbd88000920, rs=0x7fbd9b7fd520, which=<value optimized out>, oi=0x7fbda4190610, on=0x0) at backover.c:697
#20 0x00000000004af9d7 in over_op_func (op=0x7fbd88000920, rs=<value optimized out>, which=<value optimized out>) at backover.c:751
#21 0x0000000000447581 in fe_op_search (op=0x7fbd88000920, rs=0x7fbd9b7fd520) at search.c:404
#22 0x0000000000447d71 in do_search (op=0x7fbd88000920, rs=0x7fbd9b7fd520) at search.c:247
#23 0x0000000000445065 in connection_operation (ctx=0x7fbd9b7fd670, arg_v=0x7fbd88000920) at connection.c:1213
#24 0x000000000044587d in connection_read_thread (ctx=0x7fbd9b7fd670, argv=<value optimized out>) at connection.c:1359
#25 0x00000000005c5800 in ldap_int_thread_pool_wrapper (xpool=0x24e45a0) at tpool.c:703
#26 0x00007fbdb98a8aa1 in start_thread (arg=0x7fbd9b7fe700) at pthread_create.c:301
#27 0x00007fbdb850daad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb)   11 Thread 0x7fbdbb298c00 (LWP 19927)  0x00007fbdb98a92fd in pthread_join (threadid=140452737984256, thread_return=0x0) at pthread_join.c:89
  10 Thread 0x7fbd9bfff700 (LWP 20028)  0x00007fbdb8504283 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
  9 Thread 0x7fbdb08dc700 (LWP 20025)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
  8 Thread 0x7fbdb10dd700 (LWP 20021)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
  7 Thread 0x7fbdb20df700 (LWP 19945)  pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:83
  6 Thread 0x7fbdb28e0700 (LWP 19932)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
  5 Thread 0x7fbdb30e1700 (LWP 19931)  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
  4 Thread 0x7fbdb38e2700 (LWP 19928)  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
  3 Thread 0x7fbd9affd700 (LWP 20049)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
  2 Thread 0x7fbdb18de700 (LWP 19946)  0x00000000005d3bb4 in strval2str (val=0x7fbda010be80, str=0x7fbda019d4de "exam\314\314\314\314\314\314\314\314\314\314\314\314\032\202\375\202\237/ boo\261\031", flags=<value optimized out>, len=0x7fbdb18dca18) at getdn.c:2141
##teamcity[publishArtifacts './@ci-test-1.KFA/test050-syncrepl-multimaster-hdb-KFA*.core* => test050-syncrepl-multimaster-hdb-KFA-cores.tar.gz']
Collect result(s) from test050-syncrepl-multimaster-hdb-KFA...
##teamcity[publishArtifacts './@ci-test-1.KFA/test050-syncrepl-multimaster-hdb-KFA.dump => test050-syncrepl-multimaster-hdb-KFA-dump.tar.gz']
<<<<< test050-syncrepl-multimaster failed for hdb/KFA (code 134, coredump)
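
For readers unfamiliar with the check that fired: frame #5 aborts in `ber_memcpy_safe` on the assertion "source and destination MUST NOT overlap". The following is only a minimal sketch of what a guard of that kind typically verifies, not the actual `hipagut.c` code; `ranges_overlap` and `memcpy_checked` are hypothetical names introduced for illustration. One way such a guard could trip, consistent with the suspicion above, is when a value is freed while still referenced and its memory is handed out again as the copy destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from ReOpenLDAP): returns 1 when the byte
 * ranges [dest, dest+n) and [src, src+n) share at least one byte. */
static int ranges_overlap(const void *dest, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dest;
    uintptr_t s = (uintptr_t)src;

    if (n == 0)
        return 0; /* empty ranges cannot overlap */

    /* The ranges overlap when the distance between the two start
     * addresses is smaller than the length being copied. */
    return (d < s) ? (s - d < n) : (d - s < n);
}

/* Hypothetical wrapper: memcpy() has undefined behavior for overlapping
 * buffers, so fail loudly instead of copying silently. */
static void *memcpy_checked(void *dest, const void *src, size_t n)
{
    assert(!ranges_overlap(dest, src, n)
           && "source and destination MUST NOT overlap");
    return memcpy(dest, src, n);
}
```

With a guard like this, copying between two distinct live buffers succeeds, while a destination that aliases the source (for example, a freshly allocated block that reuses memory of a value freed too early) aborts at the copy rather than corrupting data later.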

This seems to have been fixed by 75b3cc1.
However, it still requires thorough testing.

Apparently the cause here is the same as in #89.

Closing as a duplicate of #89.