jmickle66666666/wad-js

UDMF support

Opened this issue · 4 comments

Unlike Doom and Hexen format maps, UDMF is a text-based format, meaning a different kind of parser will need to be created for it. However, aside from the parser there shouldn't be too much work.

You may have already started this, but I'm tossing some ideas out there because it sounds like something I want to try myself. Looking at the SLADE source code there are three primary components:

  • Tree implementation
  • Tokenizer
  • Parser

The tokenizer handles things like strings, comments, keywords, identifiers, etc. The parser handles things like block expressions, assignment statements and so on. The parser's goal is to output a tree of nodes so that reading the properties is trivial. I don't know if UDMF allows nested blocks or not, but if it doesn't, that will simplify things a bit.
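To make the tokenizer/parser split concrete, here is a minimal sketch with no dependencies. The names `tokenize`, `parse` and `toValue` and the node shapes are made up for illustration; they are not SLADE's or wad-js's API. It assumes semicolon-terminated assignments as in the UDMF spec and does not handle nested blocks or string escapes.

```javascript
// Hypothetical sketch, not SLADE's implementation.

// Tokenizer: splits text into quoted strings, words/numbers and punctuation.
// Whitespace, // line comments and /* block comments */ are dropped.
function tokenize(text) {
    var pattern = /\s+|\/\/[^\n]*|\/\*[\s\S]*?\*\/|"(?:[^"\\]|\\.)*"|[+-]?[\w.]+|[;={}]/y;
    var tokens = [];
    var pos = 0;
    while (pos < text.length) {
        pattern.lastIndex = pos;
        var match = pattern.exec(text);
        if (match === null) {
            throw new Error('Unexpected character at offset ' + pos);
        }
        pos = pattern.lastIndex;
        // Keep everything except whitespace and comments.
        if (!/^(\s|\/\/|\/\*)/.test(match[0])) {
            tokens.push(match[0]);
        }
    }
    return tokens;
}

// Convert a value token into a JavaScript value.
function toValue(token) {
    if (token[0] === '"') return token.slice(1, -1); // escapes are left as-is
    if (token === 'true') return true;
    if (token === 'false') return false;
    var n = Number(token);
    return isNaN(n) ? token : n;
}

// Parser: builds a root node whose children are either assignments
// ({ key, value }) or named blocks ({ name, children }).
function parse(tokens) {
    var root = { name: 'root', children: [] };
    var current = root; // nested blocks are not handled in this sketch
    var i = 0;
    while (i < tokens.length) {
        var word = tokens[i];
        if (tokens[i + 1] === '{') {
            current = { name: word, children: [] };
            root.children.push(current);
            i += 2;
        } else if (word === '}') {
            current = root;
            i += 1;
        } else if (tokens[i + 1] === '=' && tokens[i + 3] === ';') {
            current.children.push({ key: word, value: toValue(tokens[i + 2]) });
            i += 4;
        } else {
            throw new Error('Unexpected token: ' + word);
        }
    }
    return root;
}

var source = 'namespace = "zdoom"; // comment\nvertex { x = 64.0; y = -32; }';
var tree = parse(tokenize(source));
console.log(JSON.stringify(tree, null, 2));
```

Keeping the parser ignorant of what `vertex` or `x` mean is the point: anything UDMF-specific can live in a later step that reads the tree.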

To avoid reinventing the wheel, here are two very small libraries I'm looking at:

  • tree-model
  • strscan

Both can be installed via npm, which is convenient since we are using it anyway.

I've avoided starting work on both UDMF and other text formats like MAPINFO and DECORATE because I'm not experienced in writing parsers/tokenizers.

The UDMF Specification https://github.com/coelckers/gzdoom/blob/master/specs/udmf.txt doesn't mention nested blocks iirc.

I guess a new set of objects can be created for the UDMF data in situations where the standard Doom (or Hexen) ones aren't capable. As long as the naming scheme is the same, the map renderer will automatically support them.
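As a rough sketch of that idea: parsed UDMF blocks could be grouped into collections and filled in with the spec's defaults for omitted fields. Everything here is an assumption for illustration: `buildMap`, the `{ name, props }` block shape and the collection names (borrowed from the Doom lump names) are hypothetical, and the real property names would have to match whatever the existing wad-js objects use.

```javascript
// Hypothetical sketch: none of these names come from wad-js.

// Defaults the UDMF spec gives for fields that may be omitted.
var UDMF_DEFAULTS = {
    vertex:  {},
    linedef: { id: -1, sideback: -1, special: 0, blocking: false },
    sidedef: { offsetx: 0, offsety: 0 },
    sector:  { heightfloor: 0, heightceiling: 0, lightlevel: 160 },
    thing:   { id: 0, height: 0, angle: 0, special: 0 }
};

// UDMF block name -> collection name (assumed to mirror the lump names).
var COLLECTION_NAMES = {
    vertex: 'vertexes',
    linedef: 'linedefs',
    sidedef: 'sidedefs',
    sector: 'sectors',
    thing: 'things'
};

// blocks: array of { name: 'vertex', props: { x: 0, y: 0 } } style objects.
function buildMap(blocks) {
    var map = { vertexes: [], linedefs: [], sidedefs: [], sectors: [], things: [] };
    blocks.forEach(function (block) {
        if (!Object.prototype.hasOwnProperty.call(COLLECTION_NAMES, block.name)) {
            return; // unknown block types are ignored in this sketch
        }
        var target = COLLECTION_NAMES[block.name];
        // Explicit properties override the defaults.
        map[target].push(Object.assign({}, UDMF_DEFAULTS[block.name], block.props));
    });
    return map;
}

var map = buildMap([
    { name: 'vertex', props: { x: 0, y: 0 } },
    { name: 'vertex', props: { x: 64, y: 0 } },
    { name: 'linedef', props: { v1: 0, v2: 1, sidefront: 0 } }
]);
console.log(map.linedefs[0]);
```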

Here's what I got so far, even though you can barely call it progress.

var TreeModel = require('tree-model');
var StrScan = require('strscan').StringScanner;

// Tokens in SLADE are either a word or
// one of these special characters
var regex = /\w+|[;,:|={}\/]/;

var example = "Test {\n"
        + "test = 1\n"
        + "}";

var s = new StrScan(example);

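while (!s.hasTerminated()) {
        // Skip whitespace before each token
        s.scan(/\s+/);
        var token = s.scan(regex);
        // Stop if nothing matches, otherwise the loop never advances
        if (token == null) break;
        console.log(token);
}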

This outputs:

Test
{
test
=
1
}

So this shows that the tokenizer library is working as intended at least.

Alright, got some progress. See this gist here.

The thing I like about SLADE's approach is that the parser implementation is independent of UDMF specific stuff, which would make it really easy to integrate. As you can see, getting all the properties is simply a matter of walking the root node and grabbing the values.
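For illustration, walking such a tree might look like the sketch below. The node shape (blocks as `{ name, children }`, assignments as `{ key, value }`) and the helper names are assumptions made for this example; the gist's actual structure may differ.

```javascript
// Hypothetical sketch: node shapes and helper names are illustrative only.

// Collect the direct key/value assignments of one node into a plain object.
function collectProperties(node) {
    var props = {};
    (node.children || []).forEach(function (child) {
        if ('key' in child) {
            props[child.key] = child.value;
        }
    });
    return props;
}

// Return the properties of every block with the given name under the root.
function blocksNamed(root, name) {
    return (root.children || [])
        .filter(function (child) { return child.name === name; })
        .map(collectProperties);
}

var root = {
    name: 'root',
    children: [
        { key: 'namespace', value: 'zdoom' },
        { name: 'vertex', children: [{ key: 'x', value: 0 }, { key: 'y', value: 0 }] },
        { name: 'vertex', children: [{ key: 'x', value: 64 }, { key: 'y', value: 0 }] }
    ]
};

console.log(collectProperties(root));
console.log(blocksNamed(root, 'vertex'));
```

Nothing in the walk knows about vertices or linedefs, so the same helpers would work for other block-based text formats.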

I really need to find a way to avoid consuming whitespace by hand, because the tokenizer doesn't skip it automatically. The library's check is basically peek, but it takes a regex instead of a length.
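One possible way around the whitespace problem is a small scanner that skips whitespace itself before every `check` or `scan`. The sketch below is a standalone stand-in written for illustration, not the strscan API: `SkippingScanner` and its methods are hypothetical names that only mimic the shape of the calls used above.

```javascript
// Hypothetical sketch: a tiny scanner that skips whitespace automatically.
function SkippingScanner(text) {
    this.text = text;
    this.pos = 0;
}

// Advance past any whitespace at the current position.
SkippingScanner.prototype.skipWhitespace = function () {
    var ws = /\s*/y;
    ws.lastIndex = this.pos;
    ws.exec(this.text);
    this.pos = ws.lastIndex;
};

// Like peek, but with a regex: return the match at the current position
// without consuming it, or null if the regex does not match there.
SkippingScanner.prototype.check = function (regex) {
    this.skipWhitespace();
    // Re-create the regex as sticky so it only matches at this.pos.
    var flags = regex.flags.replace(/[gy]/g, '') + 'y';
    var sticky = new RegExp(regex.source, flags);
    sticky.lastIndex = this.pos;
    var match = sticky.exec(this.text);
    return match ? match[0] : null;
};

// Same as check, but consumes the token when it matches.
SkippingScanner.prototype.scan = function (regex) {
    var token = this.check(regex);
    if (token !== null) {
        this.pos += token.length;
    }
    return token;
};

// True once only whitespace (or nothing) is left.
SkippingScanner.prototype.hasTerminated = function () {
    this.skipWhitespace();
    return this.pos >= this.text.length;
};

var scanner = new SkippingScanner('Test {\n  test = 1\n}\n');
var seen = [];
while (!scanner.hasTerminated()) {
    var token = scanner.scan(/\w+|[;,:|={}\/]/);
    if (token === null) break; // unrecognized character: stop instead of looping forever
    seen.push(token);
}
console.log(seen.join(' '));
```

The same effect could probably be had by wrapping the library's scanner and calling its whitespace scan inside the wrapper, which keeps the parser code free of manual skipping.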

Creating nodes needs to be done with tree.parse. The root node needs to have a children property or you'll get an error.

Accessing properties of a node is done on node.model.