metadevpro/ts-pegjs

Autogenerated types?

siefkenj opened this issue · 15 comments

I have been working on a project that automatically generates TypeScript types from a Peggy grammar.

Source is here: https://github.com/siefkenj/peggy-to-ts

Playground is here: https://siefkenj.github.io/peggy-to-ts

If you think this would be good to integrate into ts-pegjs, I would be happy to make a PR (but I'd need some guidance about exactly how to integrate it with ts-pegjs, since Peggy doesn't normally produce multiple output files...).

Yes please, this would be very useful for my TextMate parser.

Great idea! Please do. Feel free to ask whatever you need.

Param types for rule generator code can be switched from any to the correct type. Hopefully if we can also turn char scanning into string | void, we can get rid of any. (That type is technically a code smell)

Okay. I have some questions.

  1. It appears that you duplicate most of the bytecode generation from Peggy. Is there a reason you don't just strip the typescript annotations from each Peggy action, let Peggy generate the source, and then re-export the Peggy-generated parser with types?
  2. a. Would you be happy with a two-file solution where Peggy generates the JS and then it is re-exported in a typescript file with the proper types attached?
    b. Can a Peggy plugin even do this? I.e., is a Peggy plugin allowed to save separate files?
  3. I notice that ts-pegjs is written in JS instead of TS. Would you be opposed to a port?
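A minimal sketch of the two-file idea from questions 1 and 2a, written as a single file so it can run on its own. The names `generatedParser` and `StartRule` are hypothetical stand-ins: the first plays the role of the untyped JS module Peggy would emit, the second the type inferred for the grammar's start rule. This is not how ts-pegjs is implemented, only an illustration of the proposal.

```typescript
// Stand-in for the plain-JS parser that Peggy would generate.
// Its `parse` result is untyped, so it is modelled here as `any`.
const generatedParser = {
  parse(input: string): any {
    // Toy "grammar": a comma-separated list of numbers.
    return input.split(",").map(Number);
  },
};

// Hypothetical type that a type generator would produce for the start rule.
type StartRule = number[];

// The typed wrapper. In a real two-file setup this function would live in
// its own .ts file, import the generated JS module, and be exported.
function parse(input: string): StartRule {
  return generatedParser.parse(input);
}

const result = parse("1,2,3");
// `result` is number[], so array methods are type-checked.
const total = result.reduce((sum, n) => sum + n, 0);
```

The generated JavaScript stays untouched; only the thin wrapper carries type information.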

Hi @siefkenj:

  1. The plugin was created for pegjs some years ago. pegjs was not maintained properly, and Peggy was forked as a result. At that time, duplicating the generator was doable in terms of time/effort.

  2. a. Sounds sensible.

    b. Last time I checked, there was no support for generating multiple files in PegJS/Peggy. But the plugin can receive any extra options it needs and use the NodeJS API to write additional files.

  3. It was written as JS to make it as simple as possible to be consumed from pegjs/peggy. I am not opposed to a TS port as long as it is working.

Feel free to create PRs and let's explore it if you want to.
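The "extra options plus NodeJS API" suggestion could look roughly like the sketch below. It assumes the plugin shape Peggy documents (an object with `use(config, options)` that pushes a pass onto `config.passes.generate`), but the types are declared locally as simplified stand-ins rather than imported from Peggy. The option name `typesOutput`, the helper `makeTypesFilePlugin`, and the emitted type text are all hypothetical. The file writer is injected so the sketch has no imports; in a real plugin it could be Node's `fs.writeFileSync`.

```typescript
// Simplified local stand-ins for the parts of the plugin interface used here.
type Pass = (ast: unknown, options: Record<string, unknown>) => void;
interface PluginConfig {
  passes: { generate: Pass[] };
}

// Injected writer; e.g. (path, text) => fs.writeFileSync(path, text) in Node.
type WriteFile = (path: string, contents: string) => void;

function makeTypesFilePlugin(writeFile: WriteFile) {
  return {
    use(config: PluginConfig, options: Record<string, unknown>): void {
      // Add an extra pass that runs during code generation.
      config.passes.generate.push(() => {
        // `typesOutput` is a hypothetical extra option passed by the user.
        const path = options["typesOutput"];
        if (typeof path === "string") {
          // Placeholder content; a real plugin would emit inferred types.
          writeFile(path, "export type StartRule = unknown;\n");
        }
      });
    },
  };
}
```

Because the second file is written as a side effect of a pass, Peggy itself still only "produces" one output.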

nene commented

I think this might be a good time to rethink the architecture of ts-pegjs.

I've been looking into doing a PR to add Peggy 3.0 support, but so far I have mostly learned that the code generation in Peggy has changed quite a lot since 2.0. So much so that it's probably simpler to do a complete rewrite of ts-pegjs than to attempt merging in changes from Peggy 3.0.

Is there a way to convert the pegjs output to an estree & add types to that, then convert back to source?

nene commented

Should be possible with recast or jscodeshift.

Is there a requirement that ts-pegjs be runnable from the browser, or can it be a node-only application?

  1. ts-pegjs is fine to be a NodeJS cli tool.
  2. On the contrary, the output parser should run server-side (NodeJS, Deno) and client-side (browser).

(If ts-pegjs does end up getting browser support, we can make a playground 💯 - but that's secondary to the need for re-impl)

There are limits on the TS side to how advanced the generated types can get before they can no longer be typechecked. The main problem is that whenever you have type-level recursion that goes through a typeof from an action, and there are not enough interface boundaries, you get an infinite type instantiation error.

There are two solutions:

  • Avoid semantic actions in the grammar. The PEG language might be extended with "this rule creates an AST node" annotations, which turn out to give enough knowledge of the generated types to avoid infinite instantiation.
  • Add ability to explicitly annotate the grammar with types.

In my own parser generator I chose the first approach, because it gives all the AST types for free. There is even limited support for semantic actions by heavily abusing TS's type system (to the best of my knowledge, this is the only valid use case for @ts-ignore). This example might help anyone who's implementing autogenerated types too.
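To illustrate the second option (explicit annotations), here is a small sketch for a hypothetical recursive grammar `Value <- Number / List`, `List <- "[" Value* "]"`. The action functions and type names are invented for this example. The point is that the recursion runs through the named interface `ListNode`, which acts as an interface boundary, so deriving a type with `typeof` from an annotated action stays cheap.

```typescript
// Explicitly declared AST types: the recursive reference (ListNode -> Value
// -> ListNode) goes through a named interface instead of an inferred type.
interface ListNode {
  type: "list";
  items: Value[];
}
type Value = number | ListNode;

// Hypothetical semantic actions, annotated with the explicit types.
function numberAction(text: string): number {
  return parseInt(text, 10);
}
function listAction(items: Value[]): ListNode {
  return { type: "list", items };
}

// Deriving a rule's type from its action is unproblematic once the action
// has an explicit return type.
type ListResult = ReturnType<typeof listAction>;

// Nested lists type-check without any deep type instantiation.
const nested: ListResult = listAction([
  numberAction("1"),
  listAction([numberAction("2")]),
]);
```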

@pjmolina It appears that the current implementation of returnTypes effectively does nothing. That is, the return types are annotated, but the only exported type lists the parsed AST as any, which means any annotations on individual return types are effectively ignored.

Can you confirm my understanding is correct?

This sounds correct... For now! Even if your start rule has a defined return type, the parser still emits any.
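A small sketch of why an `any` result makes the per-rule annotations moot. The `parse` function and `NumberNode` type are hypothetical stand-ins, not ts-pegjs output.

```typescript
// Hypothetical annotated return type for the start rule.
interface NumberNode {
  type: "number";
  value: number;
}

// Stand-in for the exported parse function: internally the rule is
// annotated, but the public signature still returns `any`.
function parse(input: string): any {
  const node: NumberNode = { type: "number", value: Number(input) };
  return node;
}

const ast = parse("42");
// Compiles despite the misspelled property, because `ast` is `any`.
const misspelled = ast.valeu;

// Callers have to restate the type themselves to get any checking.
const typed: NumberNode = parse("42");
```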

As of #98, ts-pegjs now infers types using TypeScript (via the ts-morph library)!