tweag/asterius

Unfound symbol error message has regressed

TerrorJack opened this issue · 2 comments

Describe the bug
Previously, with --verbose-err enabled for ahc-link, calling a C function whose symbol was not found would fail at runtime with an error message containing the function name. This is implemented with the Barf construct in the IR and related handling in the linker.

Recently, this feature has regressed. The error message no longer contains the unfound function symbol name.

To Reproduce

main :: IO ()
main = yolo

foreign import ccall unsafe "yolo" yolo :: IO ()

Running ahc-link --verbose-err --run --input-hs test.hs on the example above produces an error message like:

test: JSException "RuntimeError: unreachable\n    at Main_main1_entry (<anonymous>:wasm-function[29]:0x279a)\n    at scheduleTSO (<anonymous>:wasm-function[2426]:0x8045e)\n    at scheduleTSO_wrapper (<anonymous>:wasm-function[2427]:0x8048b)\n    at Scheduler.tick (file:///workspaces/asterius/test/rts.scheduler.mjs:346:22)\n    at Immediate.<anonymous> (file:///workspaces/asterius/test/rts.scheduler.mjs:381:29)\n    at processImmediate (internal/timers.js:458:21)"

Expected behavior
The error message should contain the yolo string.

Environment

  • OS name + version: Docker dev image
  • Version of the code: latest master revision

Additional context

Come to think of it, the logic to handle Barf error messages in the IR is a bit complex & fragile:

  • Each Barf error message is converted to a data segment at link time; the segment name contains barf
  • When --verbose-err flag is off (default), the data segments are dropped
  • We pass that data segment symbol as the barf parameter, hoping it is present in the final linker output when we need it

Generating special data segments at link time and dealing with them is error-prone (as shown in this PR). I think we can get rid of all linker logic related to barf data segments, and implement an alternative approach:

  • Keep the old barf runtime function since it's used by runtime cmm files; add a new barf_push function
  • barf_push keeps an internal buffer starting empty
  • When --verbose-err is enabled, each Barf gets translated to a block containing multiple barf_push calls and ending with an unreachable. Each barf_push call pushes a single byte (or Char) of the error message in Barf. Pushing \0 signals the end of the current error message.
  • barf_push reconstructs the complete error message and crashes appropriately.
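
To make the proposed protocol concrete, here is a minimal sketch in plain Haskell that models both sides: the sequence of arguments the code generator would pass to barf_push, and a buffer that reassembles the message when \0 arrives. All names here (barfPushArgs, BarfBuffer, barfPush) are hypothetical and not part of asterius. The sketch assumes ASCII messages, and it returns the reconstructed message instead of crashing, so that the behaviour can be observed.

```haskell
import Data.Char (chr, ord)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- | Hypothetical codegen side: the arguments of the successive barf_push
-- calls emitted for one Barf message. Assumes an ASCII message; the
-- trailing 0 marks the end of the message.
barfPushArgs :: String -> [Word8]
barfPushArgs msg = map (fromIntegral . ord) msg ++ [0]

-- | Hypothetical runtime side: the internal buffer, starting empty.
-- Bytes are stored in reverse order so that each push is O(1).
newtype BarfBuffer = BarfBuffer (IORef [Word8])

newBarfBuffer :: IO BarfBuffer
newBarfBuffer = BarfBuffer <$> newIORef []

-- | Model of barf_push. A non-zero byte is appended to the buffer.
-- A zero byte ends the message: the buffer is reset and the complete
-- message is returned (a real runtime would crash with it instead).
barfPush :: BarfBuffer -> Word8 -> IO (Maybe String)
barfPush (BarfBuffer ref) 0 = do
  bytes <- readIORef ref
  writeIORef ref []
  pure (Just (map (chr . fromIntegral) (reverse bytes)))
barfPush (BarfBuffer ref) b = do
  modifyIORef' ref (b :)
  pure Nothing

main :: IO ()
main = do
  buf <- newBarfBuffer
  results <- mapM (barfPush buf) (barfPushArgs "yolo")
  print (last results) -- prints: Just "yolo"
```

Only the final push yields a message; every earlier push returns Nothing, which mirrors the idea that nothing is reported until the terminator is seen.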

Compared to putting the error message in a data segment, the new approach will bloat the .wasm file, but that should be OK since it only applies with --verbose-err. Furthermore, it reduces the complexity of the linker and is more robust than the previous approach.
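
For a rough sense of the size cost, the sketch below estimates the encoded size of one Barf block. It assumes each push compiles to a constant instruction followed by a call, that barf_push returns nothing (so no drop is needed), and that any block framing is ignored. In the WebAssembly binary format a constant is one opcode byte plus a signed LEB128 immediate, and a call is one opcode byte plus an unsigned LEB128 function index. The function index 1000 and the names below are hypothetical.

```haskell
import Data.Char (ord)

-- | Encoded length in bytes of a non-negative value as unsigned LEB128.
uleb128Len :: Int -> Int
uleb128Len n
  | n < 128 = 1
  | otherwise = 1 + uleb128Len (n `div` 128)

-- | Encoded length in bytes of a non-negative value as signed LEB128.
-- One bit of the last group is taken by the sign, so values 64..127
-- already need two bytes.
sleb128Len :: Int -> Int
sleb128Len n
  | n < 64 = 1
  | otherwise = 1 + sleb128Len (n `div` 128)

-- | Size of one push: const opcode + immediate + call opcode + index.
pushCallSize :: Int -> Int -> Int
pushCallSize funcIndex byte =
  1 + sleb128Len byte + 1 + uleb128Len funcIndex

-- | Size of a whole block: one push per character, one push for the
-- \0 terminator, and one byte for the final unreachable.
barfBlockSize :: Int -> String -> Int
barfBlockSize funcIndex msg =
  sum (map (pushCallSize funcIndex . ord) (msg ++ "\0")) + 1

main :: IO ()
main = print (barfBlockSize 1000 "yolo") -- prints: 30
```

Under these assumptions the message "yolo" costs 30 bytes of code, against 5 bytes as a NUL-terminated data segment, so roughly 5 to 6 bytes per character.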

cc @gkaracha

I wonder which commit broke this 🤔

I agree that the current approach complicates the linker; I'd be happy to replace it with the approach you suggest. FWIW, the error message above is over 400 characters long, which means quite some bloating indeed. Then again, (a) there are not supposed to be barf messages around most of the time, and (b) the error message size is always in the vicinity of 400 characters I think.

Unless we end up exceeding a limit (which I doubt), your suggestion sounds better to me than what we have.