compose-regexp/compose-regexp.js

[TypeScript help wanted] infer the capture names in `result.groups`

pygy opened this issue · 0 comments

pygy commented

I'm not even sure it is possible, but it would be nice to get the captured group names properly typed, like @anuraghazra did here:

https://github.com/anuraghazra/type-trident/blob/main/src/typed-regex-named-groups
https://tsplay.dev/N5LV2w

such that

const fooResult = namedCapture('foo', /fo+/).exec('fooooo')

would return a result whose groups object has foo defined, ready for error detection and autocompletion.

Doing it for a single capture name is straightforward, but then...

const fooMatcher = namedCapture('foo', /fo+/)
const barMatcher = namedCapture('bar', /ba+r/)
const fooBarMatcher = either(fooMatcher, barMatcher)

would have to merge the capture groups.

To make things harder, the compose-regexp combinators are variadic.
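To make the merging problem concrete, here is a minimal sketch of one way the names could be carried through variadic combinators. Everything in it is an assumption for illustration: `Typed`, `GroupsOf`, `TypedMatch` and `execTyped` are hypothetical names, and the `namedCapture` and `either` functions below are tiny local stand-ins, not the compose-regexp implementations. The names travel in a phantom `__groups` property that never exists at runtime.

```typescript
// Phantom brand: `__groups` never exists at runtime, it only carries the
// group names in the type. (Hypothetical, not part of compose-regexp.)
type Typed<G> = RegExp & { readonly __groups: G };

// Names carried by one operand; a plain RegExp contributes `never`.
type GroupsOf<T> = T extends { readonly __groups: infer G } ? G : never;

// Result shape assumed for this sketch: every known name is a key of `groups`.
type TypedMatch<G> = string[] & {
  index: number;
  groups: { [K in G & string]: string | undefined };
};

// Local stand-in: wraps the operands in a named group and records the name.
function namedCapture<N extends string, P extends RegExp[]>(
  name: N,
  ...parts: P
): Typed<N | GroupsOf<P[number]>> {
  const body = parts.map((p) => p.source).join("");
  return new RegExp(`(?<${name}>${body})`) as Typed<N | GroupsOf<P[number]>>;
}

// Local stand-in: variadic alternation whose names are the union of the
// names of all operands (GroupsOf distributes over P[number]).
function either<P extends RegExp[]>(...parts: P): Typed<GroupsOf<P[number]>> {
  const alternatives = parts.map((p) => `(?:${p.source})`).join("|");
  return new RegExp(alternatives) as Typed<GroupsOf<P[number]>>;
}

// The cast is the only place where the types are trusted rather than checked.
function execTyped<G>(re: Typed<G>, input: string): TypedMatch<G> | null {
  return re.exec(input) as unknown as TypedMatch<G> | null;
}

const fooPart = namedCapture("foo", /fo+/);
const barPart = namedCapture("bar", /ba+r/);
const fooOrBar = either(fooPart, barPart); // Typed<"foo" | "bar">
const hit = execTyped(fooOrBar, "baaar");
// hit.groups.foo and hit.groups.bar are known keys;
// hit.groups.baz should be rejected by the compiler under these types.
```

Because the names are a union, merging is just `P[number]` run through a distributive conditional type, so no recursion over the argument list is needed. Counting positional captures is a different matter, since a union cannot record order or multiplicity.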

I haven't dabbled in advanced typings in a while, and even at my peak TS I don't think I could have written this off the top of my head. I may get to it at some point, but help would be welcome.

I suppose that a starting point would be to create an interface like this:

interface CmpRegExp<T, U> extends RegExp {
  exec(s: string):
    | null
    | (Omit<RegExpExecArray, "length" | keyof Array<any>> & T & U)
  // ...
}

then make the combinators type-aware: have capture() and namedCapture() grow the length of the RegExpIndices, and have namedCapture() add the name to the groups object (or give a type error when trying to reuse a name that's already defined).
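A rough sketch of that idea, under stated assumptions: the capture count is tracked as a tuple type that grows by one slot per group, and the names are tracked as an object type. `Pat`, `Cap`, `MatchOf`, `pat`, `cap`, `named` and `execPat` are all hypothetical names invented here, the helpers only handle a single nested operand, and nothing below reflects how compose-regexp is actually written.

```typescript
// One slot per capture group; a group that did not participate is undefined.
type Cap = string | undefined;

// Hypothetical phantom-typed pattern: C has one entry per capture group,
// G maps each group name to its value type.
type Pat<C extends Cap[], G> = RegExp & {
  readonly __caps: C;
  readonly __names: G;
};

// Result shape: the full match followed by one typed slot per capture.
type MatchOf<C extends Cap[], G> = [string, ...C] & { index: number; groups: G };

// Lift a plain RegExp that is assumed to contain no groups of its own.
const pat = (re: RegExp) => re as Pat<[], {}>;

// An anonymous capture prepends a slot: the outer group gets the lower index.
function cap<C extends Cap[], G>(inner: Pat<C, G>): Pat<[Cap, ...C], G> {
  return new RegExp(`(${inner.source})`) as Pat<[Cap, ...C], G>;
}

// A named capture prepends a slot and adds a key. If N is already a key of G,
// the parameter type collapses to `never`, so the call fails to type-check.
function named<N extends string, C extends Cap[], G>(
  name: N & (N extends keyof G ? never : unknown),
  inner: Pat<C, G>
): Pat<[Cap, ...C], G & Record<N, Cap>> {
  const source = `(?<${String(name)}>${inner.source})`;
  return new RegExp(source) as Pat<[Cap, ...C], G & Record<N, Cap>>;
}

// The cast is where the sketch trusts the bookkeeping above.
function execPat<C extends Cap[], G>(p: Pat<C, G>, input: string): MatchOf<C, G> | null {
  return p.exec(input) as unknown as MatchOf<C, G> | null;
}

const word = named("word", cap(pat(/\w+/))); // source: (?<word>(\w+))
const wordMatch = execPat(word, "hello");
// wordMatch is typed as a 3-element tuple with groups.word available.
// named("word", word) should be rejected: "word" is already a key of G.
```

Extending this to variadic combinators would mean concatenating the `C` tuples of every operand, which probably needs a recursive conditional type over the argument list.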

Another tricky issue is the possibility of rejecting invalid ref() at compile time.

sequence(capture("a"), ref(2)) should throw a type error.
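For what it's worth, the out-of-range case looks checkable. The sketch below is only an illustration under assumptions: `Rx`, `RefCheck`, `lit`, `group`, `backref` and `seq2` are hypothetical names, `seq2` is binary rather than variadic to keep it short, and none of it is compose-regexp code. The pattern type tracks the captures as a tuple and the referenced indices as a union of number literals, with 0 standing for "no reference".

```typescript
// Hypothetical phantom-typed pattern: C has one slot per capture group,
// R is the union of back-reference indices used (0 means "none").
type Rx<C extends unknown[], R extends number> = RegExp & {
  readonly __caps: C;
  readonly __refs: R;
};

// Partial<[a, b]>["length"] is 0 | 1 | 2, i.e. every index that is in range.
// The [R] wrapper stops the conditional from distributing over R's members.
// When the check fails, the caller is asked for an extra argument it cannot
// reasonably supply, which surfaces as a compile-time error.
type RefCheck<C extends unknown[], R extends number> =
  [R] extends [Partial<C>["length"]]
    ? []
    : [error: "ref() index exceeds the number of captures"];

// Lift a plain RegExp that is assumed to contain no groups or references.
const lit = (re: RegExp) => re as Rx<[], 0>;

function group<C extends unknown[], R extends number>(
  inner: Rx<C, R>
): Rx<[unknown, ...C], R> {
  return new RegExp(`(${inner.source})`) as Rx<[unknown, ...C], R>;
}

// Numeric back-reference; the index is kept as a literal type.
function backref<N extends number>(n: N): Rx<[], N> {
  return new RegExp(`\\${n}`) as Rx<[], N>;
}

// Binary sequence: concatenates the capture tuples, unions the references,
// then validates the references against the combined capture count.
function seq2<
  C1 extends unknown[], R1 extends number,
  C2 extends unknown[], R2 extends number
>(
  a: Rx<C1, R1>,
  b: Rx<C2, R2>,
  ..._check: RefCheck<[...C1, ...C2], R1 | R2>
): Rx<[...C1, ...C2], R1 | R2> {
  return new RegExp(a.source + b.source) as Rx<[...C1, ...C2], R1 | R2>;
}

const doubled = seq2(group(lit(/a/)), backref(1)); // source: (a)\1
// seq2(group(lit(/a/)), backref(2)) should be rejected: only one capture exists.
```

The check only compares each index with the total number of captures, so a forward reference such as `seq2(backref(1), group(lit(/a/)))` is still accepted. Nothing in it models the order or direction of matching.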

sequence(ref(1), capture("a")), while nonsensical (outside a lookbehind, the back-reference runs before its group has captured anything, so it only ever matches the empty string), is a valid JS RegExp. I don't think that TS can be made aware of the direction of matching (in lookbehind blocks) like compose-regexp can, so this looks like a dead end on the typing front.