Tact frontend API: Improvements for tooling

Question

Opened this issue 8 months ago · 6 comments

Answer 1 · 2024-05-01T12:04:28.000Z

Save types to AST or improve an access API to type information.

This would be insanely useful in tools like language servers!

In general, I agree with all those points, but implementing some of them may require us to build our own lexer → parser → semantic analysis pipeline, suitable for both the compiler and external tooling that depends on various steps in it. I'd say this depends on #286, but I may be wrong, and we could pull off such a feat without compromising Ohm's toolkit.

Answer 2 · 2024-05-03T14:11:31.000Z

I would also suggest creating named union types for AST entries used in fields. For example:

export type ASTContract = {
    kind: 'def_contract';
    origin: TypeOrigin;
    id: number;
    name: string;
    traits: ASTString[];
    attributes: ASTContractAttribute[];
    declarations: (ASTField | ASTFunction | ASTInitFunction | ASTReceive | ASTConstant)[];
    ref: ASTRef;
};

Could be rewritten as:

export type ASTContractDeclaration = (ASTField | ASTFunction | ASTInitFunction | ASTReceive | ASTConstant);
export type ASTContract = {
    kind: 'def_contract';
    origin: TypeOrigin;
    id: number;
    name: string;
    traits: ASTString[];
    attributes: ASTContractAttribute[];
    declarations: ASTContractDeclaration[];
    ref: ASTRef;
};

This will simplify the life of tooling developers by enabling them to reuse these type definitions from the compiler. Otherwise, I find myself copy-pasting these entries in my projects. Here is an example of a function with such a signature implemented in the static analyzer internals:

function getMethodInfo(
    decl: ASTField | ASTFunction | ASTInitFunction | ASTReceive | ASTConstant,
  ): [string | undefined, FunctionKind | undefined] {
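With a named union exported from the compiler, a signature like the one above could reference the alias instead of repeating the members, and narrowing on kind would keep working. A minimal sketch, assuming simplified stand-in node types (only kind and name are modeled, and the kind strings other than 'def_contract' are assumptions); declarationName is a hypothetical helper, not part of the compiler or the analyzer:

```typescript
// Simplified stand-ins for the compiler's AST types (assumed shapes).
type ASTField = { kind: "def_field"; name: string };
type ASTFunction = { kind: "def_function"; name: string };
type ASTInitFunction = { kind: "def_init_function" };
type ASTReceive = { kind: "def_receive" };
type ASTConstant = { kind: "def_constant"; name: string };

// The named union proposed above: declared once, reused by every consumer.
type ASTContractDeclaration =
  | ASTField
  | ASTFunction
  | ASTInitFunction
  | ASTReceive
  | ASTConstant;

// Hypothetical helper: returns the declared name, if the node kind has one.
function declarationName(decl: ASTContractDeclaration): string | undefined {
  switch (decl.kind) {
    case "def_field":
    case "def_function":
    case "def_constant":
      return decl.name; // narrowed to the members that carry a name
    case "def_init_function":
    case "def_receive":
      return undefined;
  }
}
```

If a new member is later added to the union, the exhaustive switch stops type-checking under strict settings, so tools get a compile-time signal rather than a silent mismatch with a copy-pasted type.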

Answer 3 · 2024-05-23T18:58:09.000Z

Added three more points:

Add more context to every internal Error thrown in compiler internals. This is crucial for debugging third-party tools.
Add AST iterators that perform functional map and fold over nested nodes of the AST. This will enable API users to inspect the AST in a more convenient way, for example, sorting nodes of a certain type.
Add an API that provides equivalence checks between AST nodes of the same type. This is needed, for example, in #335 to implement unit tests.
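To make the last two points concrete, here is a rough sketch of what a fold over nested nodes and an id-insensitive equivalence check might look like. It assumes a hypothetical three-node miniature AST; AstNode, foldAst, and eqNodes are illustrative names, not existing compiler APIs:

```typescript
// Hypothetical miniature AST; the real compiler's node types are much richer.
type AstNode =
  | { kind: "number"; id: number; value: number }
  | { kind: "id"; id: number; name: string }
  | { kind: "op_binary"; id: number; op: string; left: AstNode; right: AstNode };

// Direct children of a node, in source order.
function children(node: AstNode): AstNode[] {
  return node.kind === "op_binary" ? [node.left, node.right] : [];
}

// Pre-order fold: applies f to the node first, then threads the
// accumulator through each child subtree.
function foldAst<T>(node: AstNode, acc: T, f: (acc: T, n: AstNode) => T): T {
  let result = f(acc, node);
  for (const child of children(node)) {
    result = foldAst(child, result, f);
  }
  return result;
}

// Structural equivalence that deliberately ignores the id field.
function eqNodes(a: AstNode, b: AstNode): boolean {
  if (a.kind === "number" && b.kind === "number") return a.value === b.value;
  if (a.kind === "id" && b.kind === "id") return a.name === b.name;
  if (a.kind === "op_binary" && b.kind === "op_binary") {
    return a.op === b.op && eqNodes(a.left, b.left) && eqNodes(a.right, b.right);
  }
  return false;
}

// Two trees for `x + 1` that differ only in their ids.
const lhs: AstNode = {
  kind: "op_binary", id: 3, op: "+",
  left: { kind: "id", id: 1, name: "x" },
  right: { kind: "number", id: 2, value: 1 },
};
const rhs: AstNode = {
  kind: "op_binary", id: 13, op: "+",
  left: { kind: "id", id: 11, name: "x" },
  right: { kind: "number", id: 12, value: 1 },
};

// Collect identifier names and count nodes with the same fold.
const idNames = foldAst<string[]>(lhs, [], (acc, n) =>
  n.kind === "id" ? [...acc, n.name] : acc,
);
const nodeCount = foldAst(lhs, 0, (acc) => acc + 1);
const sameShape = eqNodes(lhs, rhs);
```

Ignoring id in the comparison is what makes such a check usable in unit tests, where two parses of the same source get different ids.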

Answer 4 · 2024-07-18T04:09:18.000Z

Added while working on tests for #559:

Refactoring: Extract methods from the build function (src/pipeline/build.ts) to make it more modular. We need to separate the build functionality from CLI parsing and use different methods to create context, compile, and precompile. This matters because it lets third-party tools hook into the compilation pipeline in the most flexible way. Perhaps the best way to achieve this is to create a Builder class with public methods defining the pipeline.
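One possible shape for such a class, as a sketch only. Every name below (BuildConfig, BuildContext, BuildResult, and the method names) is hypothetical and does not reflect the current contents of src/pipeline/build.ts; the method bodies are placeholders for the real stages:

```typescript
// All types and methods here are hypothetical, for illustration only.
type BuildConfig = { name: string; sources: Record<string, string> };
type BuildContext = { config: BuildConfig; log: string[] };
type BuildResult = { ok: boolean; artifacts: string[] };

class Builder {
  private readonly config: BuildConfig;

  constructor(config: BuildConfig) {
    this.config = config;
  }

  // Step 1: create a context from plain data, with no CLI parsing involved.
  createContext(): BuildContext {
    return { config: this.config, log: [] };
  }

  // Step 2: front-end work (parsing, type checking) would happen here.
  precompile(ctx: BuildContext): BuildContext {
    return { ...ctx, log: [...ctx.log, "precompile"] };
  }

  // Step 3: code generation would happen here.
  compile(ctx: BuildContext): BuildResult {
    const artifacts = Object.keys(ctx.config.sources).map(
      (file) => `${ctx.config.name}/${file}`,
    );
    return { ok: true, artifacts };
  }

  // Convenience wrapper that a CLI entry point could call.
  build(): BuildResult {
    return this.compile(this.precompile(this.createContext()));
  }
}

// A third-party tool can stop after precompile and inspect the context.
const builder = new Builder({
  name: "demo",
  sources: { "main.tact": "contract A {}" },
});
const analyzed = builder.precompile(builder.createContext());
const built = builder.compile(analyzed);
```

Keeping each stage a public method that takes and returns plain values is what would let an analyzer reuse the front end without running code generation.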

Answer 5 · 2024-12-30T09:43:02.000Z

Every point here makes a lot of sense, except for:

"Optionally, we should consider introducing mappings from ASTNode ID to ASTNode"

In fact, we should remove id from AST nodes. They are just disguised references, and there is neither GC nor type safety to ensure AST ids stored elsewhere won't become dangling references.

While we could patch createNode to ensure every id ends up in that Map, we would never know when an entry should be removed from it. Even though it would be possible to find any node by its id, we'd end up performing actions on nodes that are no longer relevant.
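The failure mode can be reproduced in a few lines. This is a sketch under stated assumptions: the createNode and nodesById below are hypothetical stand-ins written for the example, not the compiler's actual code, and the node shape is reduced to id, kind, and children:

```typescript
// Hypothetical stand-ins, not the compiler's real createNode or registry.
type AstNode = { id: number; kind: string; children: AstNode[] };

const nodesById = new Map<number, AstNode>();
let nextId = 0;

// Patched constructor: every node it creates is registered by id.
function createNode(kind: string, children: AstNode[] = []): AstNode {
  const node: AstNode = { id: nextId++, kind, children };
  nodesById.set(node.id, node);
  return node;
}

// True if a node with the given id can be reached from root.
function isReachable(root: AstNode, id: number): boolean {
  return root.id === id || root.children.some((c) => isReachable(c, id));
}

const oldBody = createNode("statement_block");
const fn = createNode("def_function", [oldBody]);

// Some pass replaces the body. Nothing tells the registry oldBody is dead.
fn.children = [createNode("statement_block")];

// The lookup still succeeds, but the node is no longer part of the tree.
const staleHit = nodesById.get(oldBody.id);
const stillInTree = isReachable(fn, oldBody.id);
```

Here staleHit is the old node while stillInTree is false, which is exactly the dangling-reference situation described above: the Map also keeps the dead node alive, so the garbage collector cannot reclaim it either.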

Answer 6 · 2025-01-02T09:09:23.000Z

"They are just disguised references, and there is neither GC nor type safety to ensure AST ids stored elsewhere won't become dangling references."

I would argue against any idea of mutating the AST. If we need to transform the AST, we should create a second AST to ensure a clean design for both the compiler and API users. Apart from that, the AST should stay available at all stages of compilation, since we need to access that information from different places; therefore, it will never be GCed.

Additionally, having unique IDs is essential for maintaining symbol tables that are useful at different stages of compilation, particularly for analysis and for accessing the AST from other IRs, which is what Misti does.
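As a rough illustration of that last point, here is a sketch of another IR referring back to an immutable AST by id. The BasicBlock type and the helper names are hypothetical and are not taken from Misti or the compiler; the only assumption that matters is that nodes are never mutated or replaced after creation:

```typescript
// Immutable AST nodes: readonly fields, never mutated after creation.
type AstNode = {
  readonly id: number;
  readonly kind: string;
  readonly text: string;
};

// Hypothetical IR node that refers to AST statements by id only.
type BasicBlock = { readonly idx: number; readonly stmtIds: readonly number[] };

const astNodes: readonly AstNode[] = [
  { id: 1, kind: "statement_let", text: "let a: Int = 1;" },
  { id: 2, kind: "statement_return", text: "return a;" },
];

// Side table keyed by id; stays valid because the AST is never rewritten.
const astStore = new Map<number, AstNode>(
  astNodes.map((n): [number, AstNode] => [n.id, n]),
);

// Resolve the AST statements behind an IR block.
function statementsOf(block: BasicBlock): AstNode[] {
  return block.stmtIds.map((id) => {
    const node = astStore.get(id);
    if (node === undefined) throw new Error(`unknown AST id ${id}`);
    return node;
  });
}

const entry: BasicBlock = { idx: 0, stmtIds: [1, 2] };
const entryKinds = statementsOf(entry).map((n) => n.kind);
```

Because a transformation would produce a second tree with fresh ids rather than edit this one, ids held by the IR cannot go stale in this design.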