Snaipe/Criterion

How to improve debugging UX? Resuscitate `--single`?

martijnthe opened this issue · 4 comments

Ideally, I would like to be able to run the main executable in gdb/lldb directly to debug a test, instead of having to open gdb/lldb in another terminal, run `target remote :1234`, etc. The way it's set up now adds a bit of friction. It's unfortunate because the rest of the framework is super nice and awesome!

The way debugging is set up also does not jibe well with IDEs that expect to attach to the main executable (I'm using CLion, for example).

Reading through the GitHub issues, I learned that there used to be a --single option that got removed after a refactoring. I'm assuming that --single ran the test inside the runner's process, correct?

What's the rationale behind removing --single? I realize you'd lose the sandboxing of tests, but when debugging a test, perhaps that's an OK price to pay for removing the friction (especially because you're only running a single test).

Is there anything fundamentally impeding --single from being implemented again?

The reason --single was removed is that it was hard to keep the option alongside the new I/O layer (we switched to a nanomsg + protobuf stack) and the new sandboxing code (BoxFort was introduced around that time). While convenient, the option was an undocumented feature that existed only for a short time, and it was a dirty hack that called the test function directly.

The problem now is that, while we could re-add such an option, it would still be a hack: it would mean wrapping a lot of code in if (!single) throughout the code base, which is less than ideal from a maintainability standpoint. Worse, with the inheritable heaps that BoxFort uses to pass context, and the introduction of cr_malloc/free/realloc to transfer context from the runner to the test, bypassing all of this means we would somehow have to keep a separate heap implementation for this express purpose, which again is hardly maintainable.

I admit that I initially disliked making remote debugging mandatory, but the advantages of an isolated address space at all times greatly outweigh an extra step during debugging.
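To make the trade-off concrete, here is a minimal sketch of the two execution models, using plain POSIX fork/waitpid. This is not Criterion's or BoxFort's actual code: the names run_in_process, run_sandboxed, passing_test and crashing_test are hypothetical, and the real sandbox does considerably more (context transfer, I/O, timeouts).

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*test_fn)(void);

/* In-process mode (what a --single-style option implies): the test shares
 * the runner's address space, so a crash also kills the runner. */
void run_in_process(test_fn fn)
{
    fn();
}

/* Sandboxed mode: run the test in a forked child.
 * Returns 0 if the child exited with status 0, the signal number if it was
 * killed by a signal, and -1 on any other failure. */
int run_sandboxed(test_fn fn)
{
    fflush(NULL);               /* keep the child from re-emitting buffered output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {             /* child: run the test, then leave immediately */
        fn();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}

/* Two hypothetical tests: one that passes, one that crashes. */
void passing_test(void) { }
void crashing_test(void) { abort(); }
```

With these definitions, run_sandboxed(crashing_test) reports SIGABRT and the caller keeps running, whereas run_in_process(crashing_test) would take the caller down with it. The flip side is the friction discussed here: a debugger started on the runner does not stop inside the test unless it follows the fork or attaches to the child.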

Also, gdb allows running background processes as children from its prompt (something like `shell ./test &`, iirc), so it may be possible to write a user command that spawns the test with a specific filter and immediately attaches to it. That could be added to the documentation.

On a side note, CLion should be able to remote debug though -- perhaps making a debugging target that runs the tests and remote debug into them would be possible?

One more thing: I haven't tested it since --single was there, but I think that multi-inferior debugging might still be working. Maybe it'll integrate better with CLion?

To set up multi-inferior capabilities in gdb, you'd have to run the following commands:

$ gdb -q ./samples/simple.c.bin
Reading symbols from ./samples/simple.c.bin...done.
(gdb) set follow-fork-mode child
(gdb) set detach-on-fork off
(gdb) set non-stop on
(gdb) set target-async on
(gdb) set schedule-multiple on
(gdb) handle SIGSTOP SIGCONT nostop noprint pass

I just tried the multi-inferior setup, and it worked surprisingly well! I've cooked up a small gdb script to help make the initial setup less painful:

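# Follow the forked test processes, but keep the runner under gdb's control too.
set follow-fork-mode child
set detach-on-fork off
# Non-stop, asynchronous mode: a stop in one thread or inferior does not halt the others.
set non-stop on
set target-async on
set schedule-multiple on
# Silence per-inferior and per-thread notifications.
set print inferior-events off
set print thread-events off
# Pass SIGSTOP/SIGCONT through to the program without stopping or printing.
handle SIGSTOP SIGCONT nostop noprint pass

# After `start`, set Criterion's crash option.
define hookpost-start
    set criterion_options.crash = 1
end

# Before `run`, arm a one-shot breakpoint on main that sets the same option and resumes.
define hook-run
    set breakpoint pending on
    tbreak main
    commands
        silent
        set criterion_options.crash = 1
        continue
    end
end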

Save this as criterion.gdb and start gdb with `gdb ./yourtest -x criterion.gdb`.

Then, from the gdb prompt, just call `run`, list children with `info inferiors`, switch inferiors with `inferior <n>`, and debug as usual.

The above script isn't perfect (notably I'd like to get rid of the repeated "Reading symbols from..." messages, and print a summary of which tests are ready for debugging after a run), but I'll put it in dev/ once I get back home. Anyone wanting to improve it can just submit a PR :)

Cool, thanks, will try!