crystal-lang/crystal

CI failure: Recursion while initializing class variables and/or constants

straight-shoota opened this issue · 2 comments

This random CI failure happened in x86_64-windows-test: https://github.com/crystal-lang/crystal/actions/runs/10741662759/job/29793143971

  1) IO::FileDescriptor opens STDERR in binary mode
     Failure/Error: io.to_s.should eq("foo\n")

       Expected: "foo\n"
            got: "Unhandled exception: Recursion while initializing class variables and/or constants (Exception)\n" +
       "  from src\\raise.cr:219 in 'raise'\n" +
       "  from src\\raise.cr:244 in 'raise'\n" +
       "  from src\\crystal\\once.cr:21 in 'once'\n" +
       "  from src\\crystal\\once.cr:50 in '__crystal_once'\n" +
       "  from src\\crystal\\system\\win32\\fiber.cr:11 in '__crystal_main'\n" +
       "  from src\\crystal\\main.cr:118 in 'main_user_code'\n" +
       "  from src\\crystal\\main.cr:104 in 'main'\n" +
       "  from src\\crystal\\main.cr:130 in 'main'\n" +
       "  from src\\crystal\\system\\win32\\wmain.cr:37 in 'wmain'\n" +
       "  from D:\\a\\_work\\1\\s\\src\\vctools\\crt\\vcstartup\\src\\startup\\exe_common.inl:288 in '__scrt_common_main_seh'\n" +
       "  from C:\\Windows\\System32\\KERNEL32.DLL +85168 in 'BaseThreadInitThunk'\n" +
       "  from C:\\Windows\\SYSTEM32\\ntdll.dll +519403 in 'RtlUserThreadStart'\n"

     # spec\std\io\file_descriptor_spec.cr:106

I haven't found any repeated occurrences yet, so this might just be a fluke. But it seems like a very odd thing to fail and could indicate an issue in the initialization logic.

Sometimes there is a different exception:

Unhandled exception: Index out of bounds (IndexError)
  from src\raise.cr:238:1 in 'raise'
  from src\array.cr:1315:11 in 'pop'
  from src\crystal\once.cr:28:7 in 'once'
  from src\crystal\once.cr:50:3 in '__crystal_once'
  from src\kernel.cr:601:1 in '__crystal_main'
  from src\crystal\main.cr:118:5 in 'main_user_code'
  from src\crystal\main.cr:104:7 in 'main'
  from src\crystal\main.cr:130:3 in 'main'
  from src\crystal\system\win32\wmain.cr:42:3 in 'wmain'
  from C:/M/B/src/mingw-w64/mingw-w64-crt/crt\crtexe.c:260:8 in '__tmainCRTStartup'
  from C:/M/B/src/mingw-w64/mingw-w64-crt/crt\crtexe.c:182:3 in 'mainCRTStartup'
  from C:\WINDOWS\System32\KERNEL32.DLL in 'BaseThreadInitThunk'
  from C:\WINDOWS\SYSTEM32\ntdll.dll in 'RtlUserThreadStart'
  from ???

What I found is that Crystal::System::FileDescriptor#@@reader_thread is initialized before Crystal::System::Fiber::RESERVED_STACK_SIZE, so there is probably some kind of race condition going on. If we partially revert #14947 then the error seems to go away:

module Crystal::System::FileDescriptor
  private def self.read_console(handle : LibC::HANDLE, slice : Slice(UInt16)) : Int32
    read_console_blocking(handle, slice)
  end

  # remove `@@reader_thread` entirely
end
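
To illustrate how such a race could produce both messages, here is a deterministic toy model. It is only a sketch: `ModelOnceState`, `enter` and `leave` are hypothetical names, symbols stand in for the real flag pointers, and the interleaving is replayed by hand on a single thread rather than with real threads. It assumes the once-state keeps one unsynchronized list of in-progress initializers, as the traces through `once.cr` suggest.

```crystal
# Hypothetical model of an unsynchronized once-state. This is NOT the code
# in src/crystal/once.cr; names and structure are illustrative only.
class ModelOnceState
  def initialize
    @rec = [] of Symbol # initializers currently in progress
  end

  # First half of `once`: record that `flag` is being initialized.
  def enter(flag : Symbol) : Nil
    if @rec.includes?(flag)
      raise "Recursion while initializing class variables and/or constants"
    end
    @rec << flag
  end

  # Second half of `once`: the initializer has finished.
  def leave : Nil
    @rec.pop
  end
end

# Scenario 1: thread A starts initializing a constant, and thread B asks for
# the same constant before A has finished. B sees A's entry and reports
# recursion although nothing is recursive.
state = ModelOnceState.new
state.enter(:reserved_stack_size) # thread A
recursion_message = begin
  state.enter(:reserved_stack_size) # thread B
  "no error"
rescue ex
  ex.message
end
puts recursion_message

# Scenario 2 (assumption): two concurrent pushes race and one of them is
# lost, so only one entry is recorded but both threads pop afterwards.
state = ModelOnceState.new
state.enter(:a) # thread A's push survives, thread B's push is assumed lost
state.leave     # thread A pops
index_error = begin
  state.leave # thread B pops from the now empty array
  false
rescue IndexError
  true
end
puts index_error
```

In this model neither error comes from the initializers themselves. Both come from two threads sharing one list without synchronization, which would fit the observation that the failure disappears once `@@reader_thread` is no longer started during initialization.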

The following doesn't work because in general there is no way to manipulate the evaluation order of top-level code:

module Crystal::System::FileDescriptor
  @@reader_thread : ::Thread?

  def self.init_reader_loop
    @@reader_thread = ::Thread.new { reader_loop }
  end
end

Crystal::System::FileDescriptor.init_reader_loop

I suspect the same can also happen with threads spawned from C libraries invoking Crystal callbacks, without -Dpreview_mt, so maybe Crystal::OnceState should be MT-safe by default?
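
As a rough sketch of what "MT-safe by default" could mean, the following toy class serializes every `once` call behind a reentrant lock regardless of `-Dpreview_mt`. `LockedOnceState` is a hypothetical name and symbols stand in for the real flag pointers. Whether `Mutex` is usable this early in startup, or on threads created by C libraries, is an assumption that would need checking.

```crystal
# Hypothetical sketch of an always-locked once-state. This is NOT the real
# Crystal::OnceState; it only illustrates the locking idea.
class LockedOnceState
  def initialize
    @rec = [] of Symbol     # initializers currently in progress
    @done = Set(Symbol).new # initializers that have completed
    # Reentrant, so an initializer may trigger another initializer from the
    # same fiber without deadlocking.
    @mutex = Mutex.new(:reentrant)
  end

  def once(flag : Symbol, &initializer : ->) : Nil
    @mutex.synchronize do
      next if @done.includes?(flag)

      # Only the lock holder can get here, so a hit in @rec now means the
      # same fiber re-entered: genuine recursion.
      if @rec.includes?(flag)
        raise "Recursion while initializing class variables and/or constants"
      end

      @rec << flag
      begin
        initializer.call
        @done << flag
      ensure
        @rec.pop
      end
    end
  end
end

state = LockedOnceState.new
calls = 0
state.once(:a) { calls += 1 }
state.once(:a) { calls += 1 } # already initialized, block is not called
puts calls
```

The lock makes a second thread wait until the first initializer has finished instead of observing a half-updated list, at the cost of taking a lock on every uninitialized access.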

This seems to be a regression from #14947