Hejsil/mecha

How to parse a bareword literal?

Closed this issue · 2 comments

I want to parse a bareword literal:

  • it must start with an uppercase letter
  • then any number of uppercase or lowercase letters can follow

I tried to use the following code:

const UpperCase = mecha.utf8.range('A', 'Z');
const LowerCase = mecha.utf8.range('a', 'z');

/// A widget literal starts with an uppercase letter.
/// Any number of uppercase or lowercase letters can follow.
const WidgetLiteral = mecha.combine(.{
  // Starts with an uppercase letter.
  // It's a u21 in the result.
  UpperCase,
  // Then the other characters follow.
  // It's a []u8 in the result.
  mecha.many(
    mecha.oneOf(.{
      UpperCase,
      LowerCase,
    }),
    .{ .collect = true },
  ),
});

The problem is that it's not easy to get a single []u8 or []u21 out of this, because the result of .combine() is a struct.
The first, single UpperCase is a u21, but the characters that follow are a []u8.
Is there a clean way to do this?

I would recommend having a look at mecha.asStr. After the child parser returns a result, asStr will return the input range that was actually parsed. This slice will point into the input and will not be allocated.
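
For illustration, here is a rough sketch of how the parser from the question might be wrapped. It assumes the same mecha version as the question, where asStr is a free function that takes the child parser as its only argument. Switching .collect to false is also an assumption: in that version it should make many skip allocating a slice, which is discarded here anyway.

```zig
const mecha = @import("mecha");

const UpperCase = mecha.utf8.range('A', 'Z');
const LowerCase = mecha.utf8.range('a', 'z');

/// A widget literal starts with an uppercase letter.
/// Any number of uppercase or lowercase letters can follow.
/// The result is a single []const u8 that points into the input.
const WidgetLiteral = mecha.asStr(mecha.combine(.{
    // The struct produced by combine is discarded by asStr;
    // only the range of input that was consumed is returned.
    UpperCase,
    mecha.many(
        mecha.oneOf(.{ UpperCase, LowerCase }),
        // Assumption: with .collect = false, many does not allocate.
        .{ .collect = false },
    ),
}));
```

Because the returned slice borrows from the input, it is only valid as long as the input buffer is; copy it if it needs to outlive the input.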

Thank you!