A code doodle implementing an actor-like system for GObject

H Web Framework

This is a partly finished project that I was playing with and have decided not to continue for the time being. I'm putting it online in case someone else is interested.

Some of the code here may be useful in other contexts. In particular, the actor-like HrtTask system found in src/lib/hrt/hrt-task.h is fairly well developed and could be useful to anyone working with GObject-based code, whether web-related or not.

The idea of the project overall is not unlike node.js (in fact, most likely this would be better built on top of node.js and maybe someone's even done so already). I want to support request handlers something like this:

var someBackendService = require('someBackendService');
var someCacheThing = require('someCacheThing');

var id = request.queryParams['id'];

var promiseOfFoo = someCacheThing.lookupFoo(id);

// yield saves continuation and suspends
var foo = yield promiseOfFoo;

if (!foo) {
    promiseOfFoo = someBackendService.fetchFoo(id);

    // again suspend while waiting on IO
    foo = yield promiseOfFoo;

    var promiseOfSaveFoo = someCacheThing.storeFoo(id, foo);

    // wait for cache to complete, in case there's an exception
    yield promiseOfSaveFoo;
}

// this write would be async via event loop also of course
request.response.write(foo);

In other words, this is sequential code that does IO, but there's no manual juggling of main loop callbacks. Instead, the file is the body of a generator function that generates "promises" (see this bug). The web container would drive this generator by requesting a new promise each time the previous promise completed.
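
The driving loop described above could look roughly like the following sketch. It is not code from this repository: drive and handler are hypothetical names, it uses standard ES2015 function* generators and Promise rather than the older SpiderMonkey generators the handler above assumes, and the handler is an explicit function rather than the body of a file. The request fields (id, cache, backend) are stand-ins for illustration only.

```javascript
// Hypothetical driver: none of these names come from the hwf source.
// Steps a generator that yields promises, resuming it with each result.
function drive(generatorFunction, request) {
    return new Promise(function (resolve, reject) {
        var generator = generatorFunction(request);

        function step(method, value) {
            var result;
            try {
                // Resume the handler at its last yield, either with a value
                // ('next') or by raising an exception inside it ('throw').
                result = generator[method](value);
            } catch (e) {
                reject(e);
                return;
            }
            if (result.done) {
                resolve(result.value);
                return;
            }
            // Wait for the yielded promise, then request the next one.
            Promise.resolve(result.value).then(
                function (v) { step('next', v); },
                function (e) { step('throw', e); }
            );
        }

        step('next', undefined);
    });
}

// Stand-in handler mirroring the cache-then-backend flow shown earlier.
function* handler(request) {
    var foo = yield request.cache.lookupFoo(request.id);
    if (!foo) {
        foo = yield request.backend.fetchFoo(request.id);
        yield request.cache.storeFoo(request.id, foo);
    }
    return foo;
}
```

No thread is held while a promise is pending: step simply returns, and the promise's completion callback re-enters the generator later.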

This generator-based design for async code was invented by C. Scott Ananian for the gjs runtime project (gjs is used for GObject-based client-side development).

Each request handler as shown above would be single-threaded from the JavaScript perspective (no threads visible in JavaScript) but would not monopolize a thread while waiting on a promise.

  • No callback spaghetti
  • Threads aren't visible in JavaScript; thread bugs are only possible in native modules with global variables
  • All cores are automatically maxed out, whether you're doing IO or computation
  • No need to have a special 'worker threads' facility, just use the normal request scheduler and allow communication (via messages) among the request handlers

The setup here is similar to "Actor" frameworks. Kilim for Java also uses a suspend-with-continuation style to get sequential code that waits on a main loop. Other actor frameworks (Scala, Akka, Jetlang, etc.) share the theme of making each actor object single-threaded while still using all CPU cores. An actor runs on at most one thread at a time, and it doesn't use up a thread while idle.

Remaining Work

To have a working prototype of the above request handler, the following major pieces are missing:

  • Implement HTTP. The code already has http-parser (as written for node.js), but it's just a parser. What's missing is to actually do what the parsed headers, etc. say to do. Supporting the basics should be straightforward.
  • Implement some simple JavaScript-to-C binding system. The request object (for starters) needs to be made available to JavaScript. It's pretty easy to hand-code bindings for some basic objects, but this rapidly becomes unmanageable, and something more autogenerated (see gjs) is needed. gjs itself won't work because it isn't thread safe at all, but perhaps some modification of gjs would. I also suspect that static bindings (or dynamically-generated "JIT" bindings using LLVM perhaps, as Johan Dahlin has experimented with for Python) would be important for server-side performance. Another issue with gjs is that server-side developers are likely to insist on something more CommonJS-looking (e.g. the module system).
  • Implement the basic "container" functionality: map paths to JS files in some directory, reload them when they change, that kind of stuff.

Existing Code

Directories:

  • deps - imported third party stuff, mostly unmodified
  • src/lib/hrt - "H Runtime"
  • src/lib/hio - "H IO"
  • src/lib/hjs - "H JavaScript"
  • src/container - container

Use "make coverage-report" to get a sense of test coverage.

HRT Features:

  • HrtTask is the most useful and complete feature of this codebase. A task is a collection of event sources, where event handlers in the same task do not run concurrently but handlers in different tasks may run concurrently. The task ends when it has no outstanding event sources. A task is more or less the same thing as an Actor, though I haven't implemented the "mailbox" feature that you'd expect Actors to have (it'd be nice to do so). Event sources are called "watchers" as in libev.
  • HrtTaskRunner is the thing that manages tasks and runs their handlers in a thread pool.
  • HrtBuffer is an immutable buffer that can be passed around for request and reply data; it is supposed to minimize copying from JavaScript strings. This is half-baked.
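
The scheduling rule for tasks can be modeled in a few lines of JavaScript. This is only an illustration of the rule, not the C API in hrt-task.h: Task, invoke, sleep and demo are hypothetical names, and since JavaScript has no threads, "concurrently" here means that asynchronous handlers overlap in time, whereas HrtTaskRunner would run them on pool threads.

```javascript
// Hypothetical model of the HrtTask rule; not the C API in hrt-task.h.
class Task {
    constructor(name) {
        this.name = name;
        this.tail = Promise.resolve(); // end of this task's handler chain
    }

    // Queue a handler; it starts only after every handler queued earlier
    // on the same task has finished.
    invoke(handler) {
        var run = this.tail.then(handler);
        // Keep the chain usable even if a handler fails.
        this.tail = run.catch(function () {});
        return run;
    }
}

function sleep(ms) {
    return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Records start/end events so the interleaving is visible in the log.
function demo() {
    var log = [];
    function handler(label, ms) {
        return function () {
            log.push('start ' + label);
            return sleep(ms).then(function () { log.push('end ' + label); });
        };
    }
    var a = new Task('a');
    var b = new Task('b');
    return Promise.all([
        a.invoke(handler('a1', 20)),
        a.invoke(handler('a2', 1)),
        b.invoke(handler('b1', 1))
    ]).then(function () { return log; });
}
```

In the resulting log, a2 never starts before a1 has ended, while b1 starts and finishes during a1's wait because it belongs to a different task.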

The HrtTaskRunner implementation supports both a GSource and a libev backend. GSource does not have the performance to be reasonable on the server side, due to some O(n) algorithms; see this bug for the most important issue.

It might be interesting to add a mailbox to each HrtTask and allow passing messages among the tasks, similar to actor frameworks. This would replace the current, more limited ability to spawn a subtask with arguments and wait for it to return a value.
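
One possible shape for such a mailbox, as a rough sketch. Nothing like this exists in the codebase: MailboxTask, send, drain and demoMailbox are made-up names, and a microtask stands in for the scheduling that a real task runner would do on its thread pool.

```javascript
// Hypothetical mailbox; hwf does not implement this.
class MailboxTask {
    constructor(onMessage) {
        this.onMessage = onMessage; // called with one message at a time
        this.queue = [];
        this.scheduled = false;
    }

    // Enqueue and return immediately; the sender never runs the handler.
    send(message) {
        this.queue.push(message);
        if (!this.scheduled) {
            this.scheduled = true;
            Promise.resolve().then(() => this.drain());
        }
    }

    // Deliver queued messages in order, one after another.
    drain() {
        try {
            while (this.queue.length > 0) {
                this.onMessage(this.queue.shift());
            }
        } finally {
            this.scheduled = false;
        }
    }
}

// Usage: a parent task collects results sent back by two worker tasks.
function demoMailbox() {
    return new Promise(function (resolve) {
        var results = [];
        var parent = new MailboxTask(function (msg) {
            results.push(msg);
            if (results.length === 2) resolve(results);
        });
        function makeWorker() {
            return new MailboxTask(function (n) { parent.send(n * n); });
        }
        makeWorker().send(3);
        makeWorker().send(4);
    });
}
```

Compared with spawning a subtask and waiting for its return value, messages can flow in either direction and at any time during a task's life.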

HIO:

This is an ad-hoc library that would just do whatever the HTTP container needs. You might ask "why not use GIO?" I believe it would have too much overhead on the server side; its design is much more oriented toward robustness and flexibility in a desktop application such as Nautilus.

HJS:

This is meant to encapsulate SpiderMonkey and would end up being similar to gjs, but different where needed to support the server-side use case.

It needs a binding system, probably using gobject-introspection and borrowing liberally from gjs, but forked to be sane in the threaded environment.

Third-party Code

Under deps/ you can find:

  • spidermonkey and narcissus (JavaScript implementation from Mozilla)
  • libev (fast, scalable event loop)
  • http-parser (HTTP parser from the node.js project)
  • trucov (nice code coverage tool)

The versions of these components are kept in deps/VERSIONS