ledgetech/lua-resty-qless

When using job:complete, ERR Error running script

saucisson opened this issue · 5 comments

When I try to call job:complete () (or job:cancel ()) from within a job, I get the following error:

api_1       | 2016/09/24 14:03:45 [error] 46#0: [lua] qless.lua:305: call(): ERR Error running script (call to f_ad2dfd82189c92739c79fc30a2af6fd46c8a2793): @user_script:410: user_script:410: Complete(): Job 136793b7b1ab2ee200309398b62986cf is not currently running: complete , context: ngx.timer
api_1       | 2016/09/24 14:03:45 [error] 46#0: [lua] job.lua:213: complete(): ERR Error running script (call to f_ad2dfd82189c92739c79fc30a2af6fd46c8a2793): @user_script:410: user_script:410: Complete(): Job 136793b7b1ab2ee200309398b62986cf is not currently running: complete , context: ngx.timer
api_1       | 2016/09/24 14:03:45 [error] 46#0: [lua] qless.lua:305: call(): ERR Error running script (call to f_ad2dfd82189c92739c79fc30a2af6fd46c8a2793): @user_script:410: user_script:410: Complete(): Job 136793b7b1ab2ee200309398b62986cf is not currently running: complete , context: ngx.timer
api_1       | 2016/09/24 14:03:45 [error] 46#0: [lua] job.lua:213: complete(): ERR Error running script (call to f_ad2dfd82189c92739c79fc30a2af6fd46c8a2793): @user_script:410: user_script:410: Complete(): Job 136793b7b1ab2ee200309398b62986cf is not currently running: complete , context: ngx.timer

The job does not seem to be completed, as it is then executed 4 times.
The jobs are run in an OpenResty/Lapis environment, provided by the Docker image erikcw/lapis.
Jobs are run from the following code in nginx.conf:

  init_worker_by_lua_block {
    local Qless  = require "resty.qless"
    local Worker = require "resty.qless.worker"
    local worker = Worker.new (Config.redis)
    worker:start {
      interval    = 1,
      concurrency = 5,
      reserver    = "ordered",
      queues      = { "myqueue" },
    }
  }

It seems related to #6.

Can you post a more complete example to reproduce? If you are using the built-in worker, your job should return a boolean to indicate success / failure, and the worker will manage the rest.

Looks to me like the job's status has not been updated to "running", which happens when the reserver calls queue:pop().

You are right, the problem was that the worker was trying to call complete () instead of returning true|false. I cannot find where the documentation specifies this behavior.
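For anyone hitting the same error, here is a minimal sketch of the boolean convention described above. It assumes the built-in worker calls a `perform(job)` function on the job module and exposes the payload as `job.data`. The table name `ResizeJob` and the `width` field are hypothetical, chosen only for illustration.

```lua
-- Hypothetical job table; in a real module file this would be the
-- table returned at the end of the file (e.g. my/jobs/resize.lua).
local ResizeJob = {}

-- Called by the worker with the reserved job (assumption: the payload
-- is available as job.data). The job does not call job:complete()
-- itself; it reports the outcome through its return value.
function ResizeJob.perform(job)
    local data = job.data or {}

    if type(data.width) ~= "number" then
        -- false tells the worker that the job failed
        return false
    end

    -- ... do the actual work here ...

    -- true tells the worker to mark the job as complete
    return true
end
```

Calling job:complete () inside `perform` as well would presumably change the job's state before the worker tries to complete it, which matches the "is not currently running" error in the log above.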

Yeah, sorry, it looks like the docs are lacking. Also, I think the Ruby worker implementation might be a bit more robust in terms of dealing with the status changing before it tries to complete.

There is definitely some ambiguity here that could use clearing up. I think I decided upon jobs returning a boolean status because the Ruby version instead uses exceptions to handle much of the logic, which wasn't very Lua-ish. In that process, I failed to document the differences.

I'll try to look at this properly soon, as a clear interface for completing / failing jobs is pretty important.

I've made some changes to the develop branch which deal with this. We no longer need to return "true" from job.perform to indicate success, but returning "nil, err-type, err-msg" will fail the job and log the error.

Are you able to try this branch and confirm it works for you?

https://github.com/pintsized/lua-resty-qless/tree/develop
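A rough sketch of what a job might look like under the develop-branch convention described above: nothing needs to be returned on success, and returning "nil, err-type, err-msg" fails the job. The table name `EmailJob`, the `to` field, and the error strings are hypothetical, and `job.data` is assumed to hold the job's payload.

```lua
-- Hypothetical job table; in a real module file this would be the
-- table returned at the end of the file.
local EmailJob = {}

function EmailJob.perform(job)
    local data = job.data

    if type(data) ~= "table" or not data.to then
        -- nil, err-type, err-msg: the worker fails the job and logs the error
        return nil, "invalid-data", "missing recipient"
    end

    -- ... do the actual work here ...

    -- No return value is needed to indicate success.
end
```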

It works on my project. Thanks!