bitcask:init_keydir(): Avoid unnecessary wait by bitcask:poll_deferred_delete_queue_empty/0
It is possible for Bitcask instance A's open(), which calls init_keydir(), to block for long periods of time while Bitcask instance B performs a long-running merge. In turn, such a long-blocking open() can make the riak_core_vnode_manager process very unhappy, since it is the process that starts new Riak vnode instances, which in turn may open Bitcask instances.
Below is a stack trace from a blocked Bitcask instance (instance A in the description above). The same cluster-info report shows that the bitcask_merge_delete server has a queue of hundreds of files from instance B to delete. The report also shows that a third instance is performing a long-running fold (repair handoff) that prevents the bitcask_merge_delete server from deleting B's old merge files.
https://github.com/basho/bitcask/blob/1.6.3-release/src/bitcask_merge_delete.erl#L109
One possible solution is to split the monolithic bitcask_merge_delete server into per-Bitcask-instance servers.
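For context, the wait in init_keydir() happens in a polling loop of roughly this shape (a paraphrase for illustration, not the exact 1.6.3 source; the sleep interval here is made up):

```erlang
%% Sketch of the blocking wait reached from bitcask:init_keydir/3.
%% open() cannot proceed until the single, global bitcask_merge_delete
%% server reports an empty queue -- even when every queued file belongs
%% to some *other* Bitcask instance.
poll_deferred_delete_queue_empty() ->
    case bitcask_merge_delete:queue_length() of
        0 -> ok;
        _ ->
            timer:sleep(100),                     % illustrative interval
            poll_deferred_delete_queue_empty()
    end.
```

Because the queue is global, instance A's open() is held hostage by instance B's merge output and by any fold that keeps B's old files pinned.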
=proc:<0.9668.5878>
State: Waiting
Spawned as: proc_lib:init_p/5
Spawned by: <0.247.0>
Started: Thu Mar 20 11:25:11 2014
Message queue length: 1
Message queue: [{'$gen_sync_event',{<0.9667.5878>,#Ref<0.0.4474.115775>},wait_for_init}]
Number of heap fragments: 0
Heap fragment data: 0
Link list: []
Dictionary: [{'$initial_call',{riak_core_vnode,init,1}},{'$ancestors',[riak_core_vnode_sup,riak_core_sup,<0.244.0>]},{random_seed,{26833,7398,23484}}]
Reductions: 128073
Stack+heap: 987
OldHeap: 4181
Heap unused: 683
OldHeap unused: 3540
Stack dump:
Program counter: 0x00007f8539fccc28 (bitcask:poll_deferred_delete_queue_empty/0 + 88)
CP: 0x0000000000000000 (invalid)
arity = 0
0x00007f83b256aeb8 Return addr 0x00007f8539fc9570 (bitcask:init_keydir/3 + 632)
0x00007f83b256aec0 Return addr 0x00007f8539fc3c78 (bitcask:open/2 + 568)
y(0) <<0 bytes>>
y(1) Catch 0x00007f8539fc9660 (bitcask:init_keydir/3 + 872)
y(2) true
y(3) <<0 bytes>>
y(4) "/var/lib/riak/bitcask/702205864811332261480676696969151607100311339008"
0x00007f83b256aef0 Return addr 0x00007f853baef280 (riak_kv_bitcask_backend:start/2 + 1336)
y(0) []
y(1) []
y(2) 2147483648
y(3) fresh
y(4) [{data_root,"/var/lib/riak/bitcask"},{read_write,true}]
y(5) "/var/lib/riak/bitcask/702205864811332261480676696969151607100311339008"
0x00007f83b256af28 Return addr 0x00007f853b1d4160 (riak_cs_kv_multi_backend:start_backend/4 + 152)
y(0) []
y(1) []
y(2) [{data_root,"/var/lib/riak/bitcask"},{read_write,true}]
y(3) "/var/lib/riak/bitcask"
y(4) "702205864811332261480676696969151607100311339008"
y(5) 702205864811332261480676696969151607100311339008
0x00007f83b256af60 Return addr 0x00007f853b1d9208 (riak_cs_kv_multi_backend:'-start_backend_fun/1-fun-0-'/3 + 256)
y(0) be_blocks
y(1) riak_kv_bitcask_backend
y(2) Catch 0x00007f853b1d4258 (riak_cs_kv_multi_backend:start_backend/4 + 400)
0x00007f83b256af80 Return addr 0x00007f855c674960 (lists:foldl/3 + 120)
y(0) riak_kv_bitcask_backend
y(1) []
y(2) [{be_default,riak_kv_eleveldb_backend,{state,<<0 bytes>>,"/var/lib/riak/leveldb/702205864811332261480676696969151607100311339008",[{create_if_missing,true},{max_open_files,50},{use_bloomfilter,true},{write_buffer_size,59927635}],[{create_if_missing,true},{data_root,"/var/lib/riak/leveldb"},{included_applications,[]},{max_open_files,50},{use_bloomfilter,true},{write_buffer_size,59927635}],[],[],[{fill_cache,false}],true,false}}]
y(3) Catch 0x00007f853b1d9320 (riak_cs_kv_multi_backend:'-start_backend_fun/1-fun-0-'/3 + 536)
0x00007f83b256afa8 Return addr 0x00007f853b1d3e90 (riak_cs_kv_multi_backend:start/2 + 808)
y(0) #Fun<riak_cs_kv_multi_backend.1.32308858>
y(1) []
0x00007f83b256afc0 Return addr 0x00007f853ae86778 (riak_kv_vnode:init/1 + 872)
y(0) be_default
y(1) [{<<3 bytes>>,be_blocks}]
y(2) []
y(3) []
0x00007f83b256afe8 Return addr 0x00007f853ae6d220 (riak_core_vnode:do_init/1 + 248)
y(0) Catch 0x00007f853ae86778 (riak_kv_vnode:init/1 + 872)
y(1) 3000
y(2) <<8 bytes>>
y(3) 10
y(4) 100
y(5) 100
y(6) 1000
y(7) true
y(8) riak_cs_kv_multi_backend
y(9) 702205864811332261480676696969151607100311339008
0x00007f83b256b040 Return addr 0x00007f853ae6ce28 (riak_core_vnode:started/2 + 200)
y(0) []
y(1) []
y(2) []
y(3) []
y(4) []
y(5) []
y(6) []
y(7) []
y(8) []
y(9) []
y(10) undefined
y(11) 702205864811332261480676696969151607100311339008
y(12) riak_kv_vnode
y(13) {state,702205864811332261480676696969151607100311339008,riak_kv_vnode,undefined,undefined,none,undefined,undefined,undefined,undefined,undefined,0}
0x00007f83b256b0b8 Return addr 0x00007f853ae64e98 (gen_fsm:handle_msg/7 + 224)
y(0) 0
0x00007f83b256b0c8 Return addr 0x00007f855c6675b8 (proc_lib:init_p_do_apply/3 + 56)
y(0) undefined
y(1) Catch 0x00007f853ae64e98 (gen_fsm:handle_msg/7 + 224)
y(2) riak_core_vnode
y(3) {state,702205864811332261480676696969151607100311339008,riak_kv_vnode,undefined,undefined,none,undefined,undefined,undefined,undefined,undefined,0}
y(4) started
y(5) <0.9668.5878>
y(6) <0.247.0>
y(7) {'$gen_event',timeout}
0x00007f83b256b110 Return addr 0x0000000000847c38 (<terminate process normally>)
y(0) Catch 0x00007f855c6675d8 (proc_lib:init_p_do_apply/3 + 88)
See also: #95
Why do we even wait for the deferred delete queue to be empty? It seems that it should be entirely asynchronous.
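One asynchronous shape the per-instance split could take is sketched below. All names here (bitcask_instance_delete, defer_delete/2) are hypothetical, not part of the bitcask API; the point is only that each instance owns its own delete server, so A's open() never synchronizes with B's queue at all:

```erlang
%% Hypothetical sketch: a per-instance deferred-delete server.
%% Module and function names are illustrative, not existing bitcask code.
-module(bitcask_instance_delete).
-behaviour(gen_server).
-export([start_link/1, defer_delete/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link(Dirname) ->
    gen_server:start_link(?MODULE, Dirname, []).

%% A cast, not a call: the merging process hands off the file list
%% and continues immediately; nobody ever polls this queue.
defer_delete(Pid, Files) ->
    gen_server:cast(Pid, {defer_delete, Files}).

init(Dirname) ->
    {ok, Dirname}.

handle_call(_Req, _From, State) ->
    {reply, ok, State}.

handle_cast({defer_delete, Files}, Dirname) ->
    %% The "no fold still holds these files open" safety check from
    %% bitcask_merge_delete is elided here for brevity.
    [file:delete(F) || F <- Files],
    {noreply, Dirname}.
```

With deletion scoped per instance, the wait in init_keydir() could shrink to (at most) a wait on the opening instance's own pending deletes, or disappear entirely.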