nginx/njs

njs_js_ext_shared_dict_items is not releasing read lock on error

Closed · 1 comment

I have a rarely occurring issue where all nginx worker processes start using 100% of a CPU core, constantly calling sched_yield(). pstack shows that they are trying to acquire a lock on the shared_dict (ngx_rwlock_wlock, called from ngx_js_dict_set):

Thread 1 (Thread 0x7f699e45c800 (LWP 6919)):
#0  0x000000000048aa20 in ngx_rwlock_wlock ()
#1  0x00007f699b85c1b3 in ngx_js_dict_set (flags=0, value=0x537a828, key=0x7ffe6762a4e0, dict=0x610f428, vm=0x3130090) at ./njs-0.8.1/nginx/ngx_js_shared_dict.c:1104
#2  njs_js_ext_shared_dict_set (vm=0x3130090, args=0x537a808, nargs=3, flags=0, retval=0x537a6c8) at ./njs-0.8.1/nginx/ngx_js_shared_dict.c:944
#3  0x00007f699b8ab3e4 in njs_function_native_call (retval=<optimized out>, vm=0x3130090) at src/njs_function.c:648
#4  njs_function_frame_invoke (vm=vm@entry=0x3130090, retval=<optimized out>) at src/njs_function.c:684
#5  0x00007f699b873fde in njs_vmcode_interpreter (vm=vm@entry=0x3130090, pc=0x206d0028 "\r\002l ", rval=rval@entry=0x113ebf00, promise_cap=promise_cap@entry=0x0, async_ctx=async_ctx@entry=0x0) at src/njs_vmcode.c:1451
#6  0x00007f699b8ab376 in njs_function_lambda_call (vm=vm@entry=0x3130090, retval=0x113ebf00, promise_cap=promise_cap@entry=0x0) at src/njs_function.c:611
#7  0x00007f699b8ab413 in njs_function_frame_invoke (vm=vm@entry=0x3130090, retval=retval@entry=0x113ebf00) at src/njs_function.c:687
#8  0x00007f699b86aca6 in njs_vm_invoke (vm=vm@entry=0x3130090, function=<optimized out>, args=args@entry=0x113ebf10, nargs=nargs@entry=1, retval=retval@entry=0x113ebf00) at src/njs_vm.c:622
#9  0x00007f699b85481d in ngx_js_invoke (vm=0x3130090, fname=fname@entry=0x610f818, log=0x4bfacc0, args=args@entry=0x113ebf10, nargs=nargs@entry=1, retval=retval@entry=0x113ebf00) at ./njs-0.8.1/nginx/ngx_js.c:242
#10 0x00007f699b84fb10 in ngx_http_js_variable_set (r=0x113eb060, v=0x6b6a1b0, data=101775384) at ./njs-0.8.1/nginx/ngx_http_js_module.c:1274
#11 0x00000000004bf672 in ngx_http_get_indexed_variable ()
#12 0x00000000004bf6f5 in ngx_http_get_flushed_variable ()
#13 0x00000000004c04ea in ngx_http_script_copy_var_len_code ()
#14 0x00000000004c2189 in ngx_http_script_complex_value_code ()
#15 0x00000000004f01c2 in ngx_http_rewrite_handler ()
#16 0x00000000004afc42 in ngx_http_core_rewrite_phase ()
#17 0x00000000004abb73 in ngx_http_core_run_phases ()
#18 0x00000000004abc7e in ngx_http_handler ()
#19 0x00000000004b5f8c in ngx_http_process_request ()
#20 0x00000000004b62e2 in ngx_http_process_request_headers ()
#21 0x00000000004b65d6 in ngx_http_process_request_line ()
#22 0x00000000004b696b in ngx_http_wait_request_handler ()
#23 0x000000000049f2de in ngx_epoll_process_events ()
#24 0x000000000049653a in ngx_process_events_and_timers ()
#25 0x000000000049d772 in ngx_worker_process_cycle ()
#26 0x000000000049bfb2 in ngx_spawn_process ()
#27 0x000000000049cb5d in ngx_start_worker_processes ()
#28 0x000000000049e3cb in ngx_master_process_cycle ()
#29 0x0000000000478351 in main ()

I haven't been able to reproduce this issue manually yet, but it seems to be triggered by a request to a URL that uses the items() method of shared_dict. In ngx_js_shared_dict.c there is one place, in the items() handler, where a read lock is acquired on the dict and not released before returning. The lock is taken here:

ngx_rwlock_rlock(&dict->sh->rwlock);

And here is the return without an unlock:

rc = njs_vm_array_alloc(vm, kv, 2);
if (rc != NJS_OK) {
    return NJS_ERROR;
}

xeioex commented

Hi @eugen-ukraine,

Thanks for reporting the problem; it will be fixed.