/linux-2.6-scribe

Scribe: the record-replay mechanism

Primary LanguageCOtherNOASSERTION

Scribe: The record-replay mechanism

Abstract

Scribe is a low-overhead multi-threaded application record-replay mechanism.

Scribe introduces new lightweight operating system mechanisms, rendezvous and sync points, to efficiently record nondeterministic interactions such as related system calls, signals, and shared memory accesses. Rendezvous points make a partial ordering of execution based on system call dependencies sufficient for replay, avoiding the recording overhead of maintaining an exact execution ordering. Sync points convert asynchronous interactions that can occur at arbitrary times into synchronous events that are much easier to record and replay.

For more details about the theory behind it, you can read the Scribe paper.

Scribe is general purpose record replay framework.

You can find a quick video showing the basic scribe capabilities here.

We used it to build Racepro, a process race detection mechanism.

Project Organisation

The Scribe project is divided in four different ones:

Installing Scribe

Prerequisites:

  • GCC and its friends
  • CMake
  • Python (version 2.6)
  • Cython (version 0.13 or 0.14, not 0.12 neither 0.15)

Instructions:

  1. Install the kernel

     git clone git://github.com/nviennot/linux-2.6-scribe.git
     cd linux-2.6-scribe
     make menuconfig
     make
     make install
    
  2. Install the C library

     git clone git://github.com/nviennot/libscribe.git
     cd libscribe
     cd build
     cmake ..
     make install
    
  3. Install the python library and userspace tools

     git clone git://github.com/nviennot/py-scribe.git
     cd py-scribe
     ./setup install
    
  4. (Optional) Install the test suite

     git clone git://github.com/nviennot/tests-scribe.git
    

Using Scribe

py-scribe provides three scripts: record, replay, profiler.

1. Record an application

The record command line tool allows the user to record an execution.

The verbosity level of the recorded log file can be provided. It allows the user to record only the bare minimum to guarantee a deterministic replay (highest performance), or to record the execution with debugging information so the log file can be easily interpreted by getting a execution trace similar to strace.

By sending a SIGUSR1 signal to the recording tool, Scribe detaches itself from the application while it continues running. A SIGUSR2 signal bookmarks an execution point in time. The user can then replay the application up to that point and the application state will be guaranteed to be exactly the same as during the recording.

Example:

    # record date
    Mon Aug  8 04:18:33 EDT 2011
    # ls -lh date.log
    -rw-r--r-- 1 root root 4.2K Aug  8 04:18 date.log

2. Replay an execution from a log file

The replay tool allows the user to replay a previously recorded execution.

The user can provide the backtrace size in case the replay fails and diverge (for instance, the system got out of memory and the replay cannot continue).

A SIGUSR1 signal can be sent to detach Scribe at any point in time and let the application continue a normal execution.

A bookmark id can be given as well to let the application go live at a specific point in time.

Example:

    # replay date.log
    Mon Aug  8 04:18:33 EDT 2011

3. Look at the recorded log file in a human readable format

The profiler tool allows the user to display the recorded log file in a human readable format.

Example:

    # profiler date.log | grep Mon -B3
    [02] write() = 29
    [02]   resource lock, type = files_struct, serial = 31
    [02]     resource lock, type = file, serial = 1, desc = /dev/pts/0
    [02]       data: size = 29, "Mon Aug  8 04:18:33 EDT 2011\n"

The provided command line tools use the Scribe Python library internally. The user can use the libraries to achieve a lot more by building its own logic around the Scribe API.

Detailed documentation