luvit/luvit

problems with uv.try_write and signals

joerg-krause opened this issue ยท 17 comments

Since commit 7cab2d9 the test signal usr1,usr2,hup fails on my Arch Linux system:

# Starting Test: signal usr1,usr2,hup
  /home/joerg/Development/git/luvit/tests/libs/tap.lua:83: Missing 3 expected calls
  stack traceback:
  	[C]: in function 'error'
  	/home/joerg/Development/git/luvit/tests/libs/tap.lua:83: in function </home/joerg/Development/git/luvit/tests/libs/tap.lua:64>
  	[C]: in function 'xpcall'
  	/home/joerg/Development/git/luvit/tests/libs/tap.lua:64: in function 'run'
  	/home/joerg/Development/git/luvit/tests/libs/tap.lua:165: in function </home/joerg/Development/git/luvit/tests/libs/tap.lua:142>
  	/home/joerg/Development/git/luvit/tests/test-process.lua:6: in function 'fn'
  	[string "bundle:deps/require.lua"]:310: in function 'require'
  	/home/joerg/Development/git/luvit/main.lua:118: in function 'main'
  	/home/joerg/Development/git/luvit/init.lua:49: in function </home/joerg/Development/git/luvit/init.lua:47>
  	[C]: in function 'xpcall'
  	/home/joerg/Development/git/luvit/init.lua:47: in function 'fn'
  	[string "bundle:deps/require.lua"]:310: in function <[string "bundle:deps/require.lua"]:266>
not ok 3 signal usr1,usr2,hup

Edit: Tested with lates luvi:

$ luvi --version
luvi v2.11.0
rex: 8.37 2015-04-28
libuv: 1.34.2
ssl: OpenSSL 1.1.1d  10 Sep 2019, lua-openssl 0.7.7

$ cd luvit
$ luvi . -- tests/test-process.lua

Just to be sure I'm understanding, reverting the changes in 7cab2d9 fixes the test for you?

Yes, I did a git bisect to identify the commit. Commit 4437c31 passes the test:

# Starting Test: signal usr1,usr2,hup
ok 3 signal usr1,usr2,hup

Strange, it's unclear to me how print/prettyPrint would be able to affect this test.

The signal test for its own passes the test:

require('tap')(function(test)
  test('signal usr1,usr2,hup', function(expect)
    local onHUP, onUSR1, onUSR2
    function onHUP() process:removeListener('sighup', onHUP) end
    function onUSR1() process:removeListener('sigusr1', onUSR1) end
    function onUSR2() process:removeListener('sigusr2', onUSR2) end
    process:on('sighup', expect(onHUP))
    process:on('sigusr1', expect(onUSR1))
    process:on('sigusr2', expect(onUSR2))
    process.kill(process.pid, 'sighup')
    process.kill(process.pid, 'sigusr1')
    process.kill(process.pid, 'sigusr2')
  end)
end)

Output:

$ luvi . -- tests/test-signal.lua
1..1

# Starting Test: signal usr1,usr2,hup
ok 1 signal usr1,usr2,hup
# All tests passed

Adding the process argv test case makes the test fail:

require('tap')(function(test)
  test('process argv', function()
    p('process argv', process.argv)
    assert(process.argv)
  end)
  test('signal usr1,usr2,hup', function(expect)
    local onHUP, onUSR1, onUSR2
    function onHUP() process:removeListener('sighup', onHUP) end
    function onUSR1() process:removeListener('sigusr1', onUSR1) end
    function onUSR2() process:removeListener('sigusr2', onUSR2) end
    process:on('sighup', expect(onHUP))
    process:on('sigusr1', expect(onUSR1))
    process:on('sigusr2', expect(onUSR2))
    process.kill(process.pid, 'sighup')
    process.kill(process.pid, 'sigusr1')
    process.kill(process.pid, 'sigusr2')
  end)
end)

Output:

luvi . -- tests/test-signal.lua
1..2

# Starting Test: process argv
'process argv'	{ [0] = 'luvi', 'tests/test-signal.lua' }
ok 1 process argv

# Starting Test: signal usr1,usr2,hup
  /home/joerg/Development/git/luvit/tests/libs/tap.lua:83: Missing 3 expected calls
  stack traceback:
  	[C]: in function 'error'
  	/home/joerg/Development/git/luvit/tests/libs/tap.lua:83: in function </home/joerg/Development/git/luvit/tests/libs/tap.lua:64>
  	[C]: in function 'xpcall'
  	/home/joerg/Development/git/luvit/tests/libs/tap.lua:64: in function 'run'
  	/home/joerg/Development/git/luvit/tests/libs/tap.lua:165: in function </home/joerg/Development/git/luvit/tests/libs/tap.lua:142>
  	/home/joerg/Development/git/luvit/tests/test-signal.lua:1: in function 'fn'
  	[string "bundle:deps/require.lua"]:310: in function 'require'
  	/home/joerg/Development/git/luvit/main.lua:118: in function 'main'
  	/home/joerg/Development/git/luvit/init.lua:49: in function </home/joerg/Development/git/luvit/init.lua:47>
  	[C]: in function 'xpcall'
  	/home/joerg/Development/git/luvit/init.lua:47: in function 'fn'
  	[string "bundle:deps/require.lua"]:310: in function <[string "bundle:deps/require.lua"]:266>
not ok 2 signal usr1,usr2,hup
#1 failed test

Swapping the two test cases passes the test:

require('tap')(function(test)
  test('signal usr1,usr2,hup', function(expect)
    local onHUP, onUSR1, onUSR2
    function onHUP() process:removeListener('sighup', onHUP) end
    function onUSR1() process:removeListener('sigusr1', onUSR1) end
    function onUSR2() process:removeListener('sigusr2', onUSR2) end
    process:on('sighup', expect(onHUP))
    process:on('sigusr1', expect(onUSR1))
    process:on('sigusr2', expect(onUSR2))
    process.kill(process.pid, 'sighup')
    process.kill(process.pid, 'sigusr1')
    process.kill(process.pid, 'sigusr2')
  end)
  test('process argv', function()
    p('process argv', process.argv)
    assert(process.argv)
  end)
end)

Output:

luvi . -- tests/test-signal.lua
1..2

# Starting Test: signal usr1,usr2,hup
ok 1 signal usr1,usr2,hup

# Starting Test: process argv
'process argv'	{ [0] = 'luvi', 'tests/test-signal.lua' }
ok 2 process argv
# All tests passed

Strange to me too

The issue is, when running the test as below, that in https://github.com/luvit/luvit/blob/master/tests/libs/tap.lua#L105-L108 the signal handles are closed before the signal callbacks are called.

require('tap')(function(test)
  test('process argv', function()
    p('process argv', process.argv)
    assert(process.argv)
  end)
  test('signal usr1,usr2,hup', function(expect)
    local onHUP, onUSR1, onUSR2
    function onHUP() process:removeListener('sighup', onHUP) end
    function onUSR1() process:removeListener('sigusr1', onUSR1) end
    function onUSR2() process:removeListener('sigusr2', onUSR2) end
    process:on('sighup', expect(onHUP))
    process:on('sigusr1', expect(onUSR1))
    process:on('sigusr2', expect(onUSR2))
    process.kill(process.pid, 'sighup')
    process.kill(process.pid, 'sigusr1')
    process.kill(process.pid, 'sigusr2')
  end)
end)

This is the output of a modified tap.lua file for debugging purpose (which you can find below) for running only test process argv:

uv walk process argv
close handle uv_tty_t: 0x55d024663510
close handle uv_tty_t: 0x55d024663650
close handle uv_tty_t: 0x55d0246637c0
close handle uv_check_t: 0x55d024663960
close handle uv_idle_t: 0x55d0246639e0
close handle uv_signal_t: 0x55d024663440

This is the output of the modified tap.lua file for running only test signal usr1,usr2,hup:

uv walk signal usr1,usr2,hup
close handle uv_tty_t: 0x556104957510
close handle uv_tty_t: 0x556104957650
close handle uv_tty_t: 0x5561049577c0
close handle uv_check_t: 0x556104957960
close handle uv_idle_t: 0x5561049579e0
close handle uv_signal_t: 0x556104957440

This is the output of the modified tap.lua file for running both tests process argv first, and signal usr1,usr2,hup second:

uv walk process argv
uv walk signal usr1,usr2,hup
flush handle uv_signal_t: 0x55c237a65b40
flush handle uv_signal_t: 0x55c237a65be0
flush handle uv_signal_t: 0x55c237a65c80
close handle uv_tty_t: 0x55c237a4b510
close handle uv_tty_t: 0x55c237a4b650
close handle uv_tty_t: 0x55c237a4b7c0
close handle uv_check_t: 0x55c237a4b960
close handle uv_idle_t: 0x55c237a4b9e0
close handle uv_signal_t: 0x55c237a4b440
This is the modifed `tap.lua` file

--[[

Copyright 2014 The Luvit Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS-IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

--]]

exports.name = "luvit/tap"
exports.version = "0.1.1"
exports.dependencies = {
  "luvit/pretty-print@1.0.2"
}
exports.license = "Apache 2"
exports.homepage = "https://github.com/luvit/luvit/blob/master/tests/libs/tap.lua"
exports.description = "Libuv loop based test runner with tap output."
exports.tags = {"test", "tap"}

local uv = require('uv')
local colorize = require('pretty-print').colorize

-- Capture output from global print and prefix with two spaces
local print = _G.print
_G.print = function (...)
  local n = select('#', ...)
  local arguments = {...}
  for i = 1, n do
    arguments[i] = tostring(arguments[i])
  end

  local text = table.concat(arguments, "\t")
  text = "  " .. string.gsub(text, "\n", "\n  ")
  print(text)
end

local tests = {};

local function run()
  local passed = 0

  if #tests < 1 then
    error("No tests specified!")
  end

  print("1.." .. #tests)
  for i = 1, #tests do
    local test = tests[i]
    local cwd = uv.cwd()
    local preexisting = {}
    uv.walk(function (handle)
      preexisting[handle] = true
    end)
    print("\n# Starting Test: " .. colorize("highlight", test.name))
    local pass, err = xpcall(function ()
      local expected = 0
      local err
      local function expect(fn, count)
        expected = expected + (count or 1)
        return function (...)
          expected = expected - 1
          local success, ret = pcall(fn, ...)
          if not success then err = ret end
          collectgarbage()
          return ret
        end
      end
      test.fn(expect)
      collectgarbage()
      uv.run()
      collectgarbage()
      if err then error(err) end
      if expected > 0 then
        error("Missing " .. expected .. " expected call" .. (expected == 1 and "" or "s"))
      elseif expected < 0 then
        error("Found " .. -expected .. " unexpected call" .. (expected == -1 and "" or "s"))
      end
      collectgarbage()
      local unclosed = 0
      uv.walk(function (handle)
        if preexisting[handle] then return end
        unclosed = unclosed + 1
        print("UNCLOSED", handle)
      end)
      if unclosed > 0 then
        error(unclosed .. " unclosed handle" .. (unclosed == 1 and "" or "s"))
      end
      if uv.cwd() ~= cwd then
        error("Test moved cwd from " .. cwd .. " to " .. uv.cwd())
      end
      collectgarbage()
    end, debug.traceback)

    -- Flush out any more opened handles
    uv.stop()
    local file = io.open ("/tmp/luvit.test", "a+")
    file:write("uv walk " .. test.name .. "\n")
    file:close()
    uv.walk(function (handle)
      if preexisting[handle] or uv.is_closing(handle) then return end
      local file = io.open ("/tmp/luvit.test", "a+")
      file:write("flush handle " .. tostring(handle) .. "\n")
      file:close()
      uv.close(handle)
    end)
    -- Wait for the close calls to finish
    uv.run()
    -- Reset the cwd if the script changed it.
    uv.chdir(cwd)

    if pass then
      print("ok " .. i .. " " .. colorize("success", test.name))
      passed = passed + 1
    else
      _G.print(colorize("err", err))
      print("not ok " .. i .. " " .. colorize("failure", test.name))
    end
  end

  local failed = #tests - passed
  if failed == 0 then
    print("# All tests passed")
  else
    print("#" .. failed .. " failed test" .. (failed == 1 and "" or "s"))
  end

  -- Close all then handles, including stdout
  uv.walk(function(handle)
    if not uv.is_closing(handle) then
      local file = io.open ("/tmp/luvit.test", "a+")
      file:write("close handle " .. tostring(handle) .. "\n")
      file:close()  
      uv.close(handle)
    end
  end)
  uv.run()

  os.exit(-failed)
end

local single = true
local prefix

local function tap(suite)

  if type(suite) == "function" then
    -- Pass in suite directly for single mode
    suite(function (name, fn)
      if prefix then
        name = prefix .. ' - ' .. name
      end
      tests[#tests + 1] = {
        name = name,
        fn = fn
      }
    end)
    prefix = nil
  elseif type(suite) == "string" then
    prefix = suite
    single = false
  else
    -- Or pass in false to collect several runs of tests
    -- And then pass in true in a later call to flush tests queue.
    single = suite
  end

  if single then run() end

end

return tap

close by revert 7cab2d9

We should still try to understand what was going on here, since it seems like it could still happen (just not via Luvit's print).

sure, eyes here. loop try_write will make signal lost, need another way to handle truncate output

try_write will make signal lost

But why does this happen? It seems like a bug.

Now I have no ideas, Let's do more

Think about this

  test('signal usr1,usr2,hup', function(expect)
    local onHUP, onUSR1, onUSR2, iCount
    if los.type() == 'win32' then return end
    iCount = 0
    function onHUP() iCount=iCount+1 process:removeListener('sighup', onHUP) end
    function onUSR1() iCount=iCount+1 process:removeListener('sigusr1', onUSR1) end
    function onUSR2() iCount=iCount+1 process:removeListener('sigusr2', onUSR2) end
    process:on('sighup', expect(onHUP))
    process:on('sigusr1', expect(onUSR1))
    process:on('sigusr2', expect(onUSR2))
    process.kill(process.pid, 'sighup')
    process.kill(process.pid, 'sigusr1')
    process.kill(process.pid, 'sigusr2')
    local function setTimeout(timeout, callback)
      local timer = uv.new_timer()
      timer:start(timeout, 0, function ()
        timer:stop()
        timer:close()
        callback()
      end)
      return timer
    end
    setTimeout(10, function()
      assert(iCount==3)
    end)
  end)

This will make test pass.

Reason: uv loop stop before system fire signal, so a timer will make all signal handle as expect.

This is just bug in test. not caused by 7cab2d9

ok, please look #1109

Ok, installed an Arch Linux VM and was able to reproduce and play around with this. I think I finally understand what's going on.

Normally, signal handlers will keep the libuv event loop going until they are closed, but Luvit calls uv.unref on its signals which means that the loop will not wait for them before stopping. This means that its possible for uv.run() to return before the signals are handled. This does not seem related to 7cab2d9, it just so happened to be one way to (somehow) get it to be consistent. So @zhaozg's solution is on the right track, we just need to make sure to keep the loop alive long enough for the signals to go through during this test.

Ok, installed an Arch Linux VM and was able to reproduce and play around with this. I think I finally understand what's going on.

So meticulous and responsible.

Closed by #1114