mdb_v8: postmortem debugging for Node.js

This repository contains the canonical source for mdb_v8, an mdb debugger module ("dmod") for debugging both live processes and core dumps of programs using Google's V8 JavaScript engine, and particularly Node.js. This module fully supports Node versions 4.0, 0.12, and 0.10. Basic functionality (stack traces, printing out objects, and using findjsobjects) should also work on Node versions 0.8, 0.6, and 0.4, but those versions are not regularly tested.

Downstream versions of mdb_v8 exist in both Node.js and SmartOS. See CHANGES.md for details.

Using mdb_v8

For information about using these tools, see the usage guide.

Building from source

You can build mdb_v8 by cloning this repository and running make. It will only build and run on illumos-based systems. See the usage guide for details on system support.

Binary downloads

Binaries for mdb_v8 can be found at https://us-east.manta.joyent.com/Joyent_Dev/public/mdb_v8. If you have the Manta command-line tools installed, you can list the latest binaries with:

$ mfind -t o $(mget -q /Joyent_Dev/public/mdb_v8/latest)
/Joyent_Dev/public/mdb_v8/v1.1.1/mdb_v8_amd64.so
/Joyent_Dev/public/mdb_v8/v1.1.1/mdb_v8_ia32.so

You can fetch a specific binary like this (in this case, the 32-bit version 1.1.1 binary):

$ mget -O /Joyent_Dev/public/mdb_v8/v1.1.1/mdb_v8_ia32.so

or using curl:

$ curl -O https://us-east.manta.joyent.com/Joyent_Dev/public/mdb_v8/v1.1.1/mdb_v8_ia32.so

This one-liner will get you the latest 32-bit binary:

$ mget -O $(mget -q /Joyent_Dev/public/mdb_v8/latest)/mdb_v8_ia32.so
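If you script the curl download for both architectures, a small shell helper can assemble the URL. This is only a sketch: the function name mdb_v8_url is hypothetical, and it assumes other releases follow the same directory layout as the v1.1.1 paths shown above.

```shell
# Hypothetical helper: print the download URL for a given release and architecture.
# Assumes every release uses the same layout as the v1.1.1 paths shown above.
mdb_v8_url() {
    version=$1
    arch=$2
    case "$arch" in
    ia32|amd64) ;;
    *)
        echo "unsupported arch: $arch (expected ia32 or amd64)" >&2
        return 1
        ;;
    esac
    echo "https://us-east.manta.joyent.com/Joyent_Dev/public/mdb_v8/${version}/mdb_v8_${arch}.so"
}
```

For example, curl -O "$(mdb_v8_url v1.1.1 amd64)" would fetch the 64-bit build of version 1.1.1.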

Design and implementation notes

Design constraints

An important design constraint on this tool is that it should not rely on assistance from the JavaScript runtime environment (i.e., V8) to debug Node.js programs. This is for many reasons:

In production, it's extremely valuable to be able to save a core file and then restart the program (or let it keep running, but undisturbed by a debugger). This allows you to restore service quickly, but still debug the problem later.
There are many important failure modes where support from the runtime is not available, including crashes in the VM itself, crashes in native libraries and add-ons, and cases where the threads that could provide that support are stuck, as in a tight loop (or blocked on other threads that are looping).
By not requiring runtime support, it's possible to stop the program at very specific points of execution (using other tools), save a core file, and then set the program running again with minimal disruption. With tools like DTrace, you can stop the program at points that the VM can't know about, like when a thread is taken off-CPU.
Many issues span both native code and JavaScript code (e.g., native memory leaks induced by JavaScript calls), where it's useful to have both native and JavaScript state available.

In short, there are many kinds of problems that cannot be debugged with a debugger that relies on the running process to help debug itself. The ACM Queue article Postmortem Debugging in Dynamic Environments outlines the history and motivation for postmortem debugging and the challenges underlying postmortem debugging in higher-level languages.

Implementation notes

We built this tool on mdb for two reasons:

mdb provides a rich plugin interface through which dmods (debugger modules) can define their own walkers and commands. These commands can function in a pipeline, sending and receiving output to and from other commands. These commands aren't just macros -- they're documented, take options similar to Unix command-line tools, can build up their own data structures, and so on. Plugins run in the address space of the debugger, not the program being debugged.
mdb abstracts the notion of a target, so the same dmod can be used to debug both live processes and core files. mdb_v8 uses mdb's built-in facilities for safely listing symbols, reading memory from the core file, emitting output, and so on, without knowing how to do any of that itself.

In order to provide postmortem support, mdb_v8 has to grok a number of internal implementation details of the V8 VM. Some algorithms, like property iteration, are (regrettably) duplicated inside mdb_v8. But many pieces, particularly related to the structure of V8 internal fields, are dynamically configured based on the process being debugged. It works like this:

As part of the V8 build process, a C++ file is generated that defines a number of global ints that describe the class hierarchy that V8 uses to represent heap objects. The class names, their inheritance hierarchy, and their field names are described by the names of these constants, and the values describe offsets of fields inside class instances. This C++ file is linked into the final V8 binary.

You can think of the debug metadata as debug information similar to DWARF or CTF, but it's considerably lighter-weight than DWARF and much easier to interpret. Besides that, because of the way V8 defines heap classes, traditional DWARF or CTF would not have been sufficient to encode the information we needed because many of the relevant classes and nearly all of the class members are totally synthetic at compile-time and not present at all in the final V8 binary.

Because of this approach (rather than, say, attempting to parse the C++ header files that describe the V8 heap), the values of these constants are always correct for the program being debugged, whether it's 32-bit, 64-bit, or has any compile-time configuration options altered that would affect these structures.

When mdb_v8 starts up, it reads the values of these symbols from the program being debugged and uses that information to traverse V8 heap structures.
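To make the encoding concrete, here is a rough shell sketch of how such constant names could be decoded. It assumes the names follow the patterns v8dbg_class_CLASS__FIELD__TYPE (value is the field offset) and v8dbg_parent_CHILD__PARENT; check the symbols in your own binary before relying on that. The function name decode_v8dbg, the two-column "NAME VALUE" input format, and the sample values are hypothetical, and this is not how mdb_v8 itself (a C dmod) is implemented.

```shell
# Decode metadata constant names read from stdin, one "NAME VALUE" pair per line.
# The naming patterns and the input format are assumptions made for illustration.
decode_v8dbg() {
    while read -r name value; do
        case "$name" in
        v8dbg_class_*)
            rest=${name#v8dbg_class_}   # CLASS__FIELD__TYPE
            class=${rest%%__*}          # text before the first "__"
            rest=${rest#*__}            # FIELD__TYPE
            field=${rest%%__*}
            ftype=${rest#*__}
            echo "field $class.$field ($ftype) at offset $value"
            ;;
        v8dbg_parent_*)
            rest=${name#v8dbg_parent_}  # CHILD__PARENT
            echo "class ${rest%%__*} extends ${rest#*__}"
            ;;
        esac                            # any other symbol is ignored
    done
}
```

Feeding it the two lines "v8dbg_class_JSObject__elements__Object 8" and "v8dbg_parent_JSObject__JSReceiver 0" (illustrative values) yields one line describing the elements field of JSObject at offset 8 and one line recording that JSObject extends JSReceiver. A debugger with that table can read a field by adding the offset to an object's address, without parsing any V8 headers.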

An ideal solution would avoid duplicating any VM knowledge in the debugger module. There are two obvious approaches for doing that:

In addition to encoding heap structure in the binary at build-time, encode algorithmic pieces as well. This could use a mechanism similar to the DTrace ustack helper, which allows VMs to encode deep internal details in a way that even the kernel can safely use, even in delicate kernel contexts. To get to this point would require figuring out all the kinds of information a debugger might need and figuring out how the VM could encode it in production binaries (i.e., efficiently) for execution by an arbitrary debugger.
Alternatively, VMs could provide their own standalone postmortem debugging tools that could reconstitute a program's state from a core file and then provide a normal debugging interface. Those debuggers wouldn't necessarily help with issues that span both native and JavaScript code.

Both of these approaches require considerable first-class support from the VM (and from its team, who would have to maintain these implementations), which does not seem to exist for V8 (or any other VM we know of). The existing approach requires minimal maintenance because much of the metadata is created through the same mechanisms in the build process that define the classes and fields themselves.

Contributing

Contributions are welcome, but please help us review your changes (and keep code quality high) by following these guidelines. If you have any questions, feel free to ask. Don't let these guidelines be a barrier to contributing!

If you're not sure exactly what change you want to make, create an issue to discuss it. Once you've got a change ready to integrate, submit a pull request.

Formatting nits: Pull requests should include text explaining the suggested change. (Do not put this text in the commit message. The commit message should consist of one line per logical change, each consisting of the issue number and synopsis. See previous commit messages for examples.)

Completeness: Pull requests should include relevant updates to the documentation, including CHANGES.md. New commands, walkers, and non-private options must have associated documentation, both in the dmod (so that "::help DCMD" works) and in the usage guide.

Testing: Code changes should be "make prepush" clean. Major changes should be manually tested on:

core files from each of the three latest major releases of Node (e.g., Node v4, Node v0.12, and Node v0.10)
core files from both 32-bit and 64-bit programs
core files generated by abort(3c) and core files generated by gcore(1M)
an existing collection of representative core files

There's a tool to help test on multiple Node versions in tools/runtests_node.

License

With the exception of the "cstyle.pl" tool, all components in this repo are licensed under the MPL 2.0. "cstyle.pl" is licensed under the CDDL. (Various pieces in this repo were historically released under other open source licenses as well.)
