flipkart-incubator/dkv

Shutdown Panic Leads to Invalid Entry in Discovery Service

kingster opened this issue · 0 comments

  • A panic happens during shutdown if Get requests are still being processed.
  • The panic leaves a stale entry in the discovery service.
root@dkv-dev-kinshuk1-8379923:/home/kinshuk.bairagi# date ; service dkv start
Tue Aug 17 22:06:09 IST 2021
root@dkv-dev-kinshuk1-8379923:/home/kinshuk.bairagi# /var/lib/fk-3p-dkv/dkvctl -dkvAddr localhost:8082 -getClusterInfo ''
Connecting to DKV service at localhost:8082...DONE
Current DKV cluster nodes:
dcID:"in-hyderabad-1"  nodeAddress:"a.a.a.a:8080"  database:"default"  vBucket:"shard0"  status:PRIMARY_FOLLOWER
dcID:"in-hyderabad-1"  nodeAddress:"b.b.b.b:8080"  database:"default"  vBucket:"shard0"  status:PRIMARY_FOLLOWER
dcID:"in-hyderabad-1"  nodeAddress:"c.c.c.c:8080"  database:"default"  vBucket:"shard0"  status:LEADER
root@dkv-dev-kinshuk1-8379923:/home/kinshuk.bairagi# date ; service dkv stop
Tue Aug 17 22:06:23 IST 2021
root@dkv-dev-kinshuk1-8379923:/home/kinshuk.bairagi# /var/lib/fk-3p-dkv/dkvctl -dkvAddr localhost:8082 -getClusterInfo ''
Connecting to DKV service at localhost:8082...DONE
Current DKV cluster nodes:
dcID:"in-hyderabad-1"  nodeAddress:"a.a.a.a:8080"  database:"default"  vBucket:"shard0"  status:PRIMARY_FOLLOWER
dcID:"in-hyderabad-1"  nodeAddress:"b.b.b.b:8080"  database:"default"  vBucket:"shard0"  status:PRIMARY_FOLLOWER
dcID:"in-hyderabad-1"  nodeAddress:"c.c.c.c:8080"  database:"default"  vBucket:"shard0"  status:LEADER
root@dkv-dev-kinshuk1-8379923:/home/kinshuk.bairagi# /var/lib/fk-3p-dkv/dkvctl -dkvAddr localhost:8082 -getClusterInfo ''
Connecting to DKV service at localhost:8082...DONE
Current DKV cluster nodes:
dcID:"in-hyderabad-1"  nodeAddress:"a.a.a.a:8080"  database:"default"  vBucket:"shard0"  status:PRIMARY_FOLLOWER
dcID:"in-hyderabad-1"  nodeAddress:"b.b.b.b:8080"  database:"default"  vBucket:"shard0"  status:PRIMARY_FOLLOWER
dcID:"in-hyderabad-1"  nodeAddress:"c.c.c.c:8080"  database:"default"  vBucket:"shard0"  status:LEADER 

Panic Logs

Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: [WARN] Caught signal: terminated. Shutting down...
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopping peer 52aef036aa18baac...
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: closed the TCP streaming connection with peer 52aef036aa18baac (stream MsgApp v2 writer)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped streaming with peer 52aef036aa18baac (writer)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: closed the TCP streaming connection with peer 52aef036aa18baac (stream Message writer)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped streaming with peer 52aef036aa18baac (writer)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped HTTP pipelining with peer 52aef036aa18baac
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: lost the TCP streaming connection with peer 52aef036aa18baac (stream MsgApp v2 reader)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: failed to read 52aef036aa18baac on stream MsgApp v2 (context canceled)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: peer 52aef036aa18baac became inactive (message send to peer failed)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped streaming with peer 52aef036aa18baac (stream MsgApp v2 reader)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: lost the TCP streaming connection with peer 52aef036aa18baac (stream Message reader)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped streaming with peer 52aef036aa18baac (stream Message reader)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped peer 52aef036aa18baac
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopping peer e102d26e19408c80...
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: fatal error: unexpected signal during runtime execution
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: closed the TCP streaming connection with peer e102d26e19408c80 (stream MsgApp v2 writer)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped streaming with peer e102d26e19408c80 (writer)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: closed the TCP streaming connection with peer e102d26e19408c80 (stream Message writer)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped streaming with peer e102d26e19408c80 (writer)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkvsrv[19805]: stopped HTTP pipelining with peer e102d26e19408c80
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x88 pc=0xcba499]
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: runtime stack:
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: runtime.throw(0x12ea560, 0x2a)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]:         /usr/local/go/src/runtime/panic.go:1117 +0x72
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: runtime.sigpanic()
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]:         /usr/local/go/src/runtime/signal_unix.go:718 +0x2e5
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: goroutine 31744 [syscall]:
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: runtime.cgocall(0xc04480, 0xc0081cd098, 0x0)
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]:         /usr/local/go/src/runtime/cgocall.go:154 +0x5b fp=0xc0081cd068 sp=0xc0081cd030 pc=0x42d2bb
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: github.com/flipkart-incubator/gorocksdb._Cfunc_rocksdb_multi_get_cf(0x3a49980, 0x3a41080, 0xc00eb68f00, 0x2, 0xc00b1e6f9
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]:         _cgo_gotypes.go:2232 +0x45 fp=0xc0081cd098 sp=0xc0081cd068 pc=0xb89745
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: github.com/flipkart-incubator/gorocksdb.(*DB).MultiGetCFMultiCF.func1(0xc00006ad80, 0xc000036c48, 0xc0081cd448, 0x2, 0x2
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]:         /root/go/pkg/mod/github.com/flipkart-incubator/gorocksdb@v0.0.0-20210507064827-a2162cb9a3f7/db.go:371 +0x325 fp=
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: github.com/flipkart-incubator/gorocksdb.(*DB).MultiGetCFMultiCF(0xc00006ad80, 0xc000036c48, 0xc0081cd448, 0x2, 0x2, 0xc0
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]:         /root/go/pkg/mod/github.com/flipkart-incubator/gorocksdb@v0.0.0-20210507064827-a2162cb9a3f7/db.go:371 +0x2ae fp=
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]: github.com/flipkart-incubator/dkv/internal/storage/rocksdb.(*rocksDB).getSingleKey(0xc000203bc0, 0xc000036c48, 0xc00aff1
Aug 17 22:06:23 dkv-dev-kinshuk1-8379923 dkv[19805]:         /mnt/disks/build_infra/repos/dkv-build-301062/git-repo/fk-3p-dkv/dkv/internal/storage/rocksdb/store.go:743 +0x1c