`--isolatedDeclarations` for standalone DTS emit
mprobst opened this issue Β· 31 comments
Suggestion
TypeScript supports relying on type inference to produce (parts of) the API of a module. E.g. users can write (sorry, slightly contrived):
// in counter.ts:
import {Splitter} from 'textutils/splitter';
export function countParts(x: string) {
return new Splitter(x).splitWords().size();
}
Note how you need to specify the type of x
(otherwise it degenerates to any
), but can leave out the return type of lengthOf
if you like. TypeScript will infer the type, potentially using type information from the file's (transitive) dependencies.
This causes two problems.
Readability. It is difficult to understand what the return type of countParts
will be. This is purely a stylistic issue that could be fixed with a lint check (though there is some complexity e.g. due to typeof
).
Compilation performance/parallelism.
Imagine you're using project references, and you have a dependency structure:
app <-- counter <-- textutils/splitter
To compile, we need to first compile textutils/splitter
, wait for that to complete, e.g. 5s, then compile counter
, wait e.g. 3s, then compile app
(6s). Total compilation wall time is |app| + |counter| + |textutils/splitter|
, in our example 5s + 3s + 6s = 16 seconds
.
Now assume we could produce .d.ts
files without requiring transitive inputs. That'd mean we could, in parallel, produce the .d.ts
files for textutils/splitter
, counter
(and app
, though we don't need that). After that, we could, in parallel, type check and compile textutils/splitter
, counter
, and app
. Assuming sufficient available parallelism (which seems reasonable, given how common multicore CPUs are), total compilation wall time is the maximum time to extract .d.ts
files, plus the time for the slowest compile. Assuming .d.ts
extraction is purely syntactical, i.e. does not need type checking nor symbol resolution, it shouldn't add more overhead than a few hundred ms. Under these assumptions, the wall time to wait for the project to compile would be 500 ms + 6s = 6.5 seconds
, i.e. more than a x2 speedup.
The problem with that is that we cannot produce .d.ts
files without running full type checking, I believe purely due to type inference.
RFC thus: I wonder if this would sufficiently motivate the ability to restrict using type inference in exported API?
E.g. we could have a noTypeInferenceOnExports
compiler flag, that would allow TypeScript to parallelise emitting .d.ts
across project references, and then parallelize type checking.
The counter point is that projects that experience slow builds in edit refresh situations might instead want to turn off type checking entirely for their emit, at least on critical paths. However that means users do not see compilation results, and produces additional complexity (e.g. when and how to report type checking failures).
Impact
We've run some statistics internally at Google on build operations.
As one would expect, this change has little impact on most incremental "hot inner loop" builds, as those typically just re-type check a single file and produce no .d.ts
change at all, so caching saves us from long build chains. We're seeing ~20% wall time improvements in the 90th percentile across all builds involving TypeScript.
However the impact on slower builds is more substantial. For a sample "large" project that sees both slow individual compiles and a long dependency chain, we see ~50% improvement in the 90th percentile, and 75% in the 99th percentile (which is representative of CI style "cold" builds with little caching).
π Search Terms
performance compilation parallelism inference declarations
β Viability Checklist
My suggestion meets these guidelines:
- This wouldn't be a breaking change in existing TypeScript/JavaScript code
- This wouldn't change the runtime behavior of existing JavaScript code
- This could be implemented without emitting different JS based on the types of the expressions
- This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
- This feature would agree with the rest of TypeScript's Design Goals.
Even if you could force explicit type annotations on everything, I donβt think that would be enough to enable parallel compilation, since the structural typing means the compiler has to know what the imported types contain in order to check them.
Thanks for raising this issue. This is a feature/area I had been considering prototyping for a while. It sounds like we're thinking of the same thing but I'll elaborate here so you can verify.
Feature Description
The key goal is to allow a declaration *.d.ts
to be generated from a single *.ts
file, without the need to examine the dependency tree. This implies that any types needed from dependencies will be imported by the generated *.d.ts
. That's a major change from today's declaration generation that sometimes inlines (duplicates) those resolved types into the generated declaration rather than referencing them via imports.
This kind of standalone declaration generation isn't possible for the full set of TypeScript source files today. An example problematic case is when an exported type is computed based on types originated in dependencies. Declaration files can't express transitive concepts such as typeof a + typeof b
so the emitted declaration is always a resolved type today.
import { a, b } from "./dependency";
export const sum = a + b; // declaration emit will resolve this to a singular string or number
Potentially new declaration syntax could be added to mitigate this. But a general solution is to constrain the set of TypeScript that can be authored. Think of it as introducing stronger linting rules. When this need arose previously to permit standalone per-file TS->JS compilation, the isolatedModules
option was introduced to constrain the source, forcing the user to write more explicit code in some cases. So I think the natural option name for this new constrained mode applied to declaration generation would be isolatedDeclarations
Pros & Cons
As you say, this feature will permit parallelisation of declaration emit. It will also reduce the blast radius of declaration regeneration needed when editing a single file. Another less obvious benefit is that it will improve type-checking performance for consumers of otherwise bloated declaration files - by eliminating the (increasingly rare) edge cases where excessive inlining super-sizes the declaration files. And when declaration emit is decoupled from type-checking, it opens the door for high-performance declaration emit tools in other languages, e.g. Rust/Go/Zig.
A downside of forcing isolation of declaration files is that it may increase the count of file accesses for anyone consuming that declaration file set during type-checking. Per-file overheads are noticeable particularly on Windows and particularly when malware scanners intercept the file access. This can be mitigated by declaration bundling.
@robpalme yes, what you're writing here matches my understanding. And indeed, isolatedDeclarations
is a good name.
One thing I don't follow on: why do you think this will increase the number of file accesses needed during type checking? Shouldn't the compiler just read the exact same set of .d.ts
files as before?
One thing I don't follow on: why do you think this will increase the number of file accesses needed during type checking? Shouldn't the compiler just read the exact same set of .d.ts files as before?
My fault for not being clear. This proposal is really two things:
- Stricter linting errors
- Achieved by either
--isolatedDeclarations
intsc
, or perhaps externally via a standalone ESlint rule
- Achieved by either
- A new per-file declaration generator
- Achieved either in
tsc
, or perhaps externally via a new standalone tool
- Achieved either in
It is possible to deliver (1) without (2). But delivering (2) requires (1).
(2) may entail more file accesses during checking, as it guarantees no inlining of types. Last I checked, the checker is lazy and only loads imports that are truly needed to check something. Inlining, as opposed to referencing, will lead to more cases where that laziness pays off and means a dependency file does not need to be loaded. This is a minor/rare case that shouldn't really sway anything.
I would frame it this way:
isolatedModules
means a syntax-only tool can be guaranteed to do single-file transpilation correctlyisolatedDeclarations
means a syntax-only tool can be guaranteed to do single-file .d.ts emit correctly
The potential perf gains here are indeed extremely large if we think about non-error-checking scenarios.
We'll have to consider what the edge cases are where the syntactic rules might be sufficient to capture this invariant, but it's a very interesting proposal.
This looks like type-first mode of Flow. They obtained a great perf boost.
I started doing some experimentation with isolatedDeclarations
and here are my findings so far
One more pro that wasn't explicitly mentioned yet is that it would allow full emits for monorepos using project references even when there is a type error.
This is a significant benefit; I can't count the number of times I felt trolled by tsc
because I was running my unit tests against older emitted code because of an unused local variable error or some such folly. π§
This was pointed out to me in this tweet: https://twitter.com/robpalmer2/status/1521973705404043268?t=WUpGGePVAL0ZEjlJGfVc_g&s=19
I realized today that this is somewhat dependent on #13626 because these are the only declarations that do not permit an annotation, but permit arbitrary expressions. I'd be surprised if there wasn't someone who needed an export default
with this flag.
In theory we don't have to do #13626. we could say that this mode only works for export default function
and export default class
. If that's too heavy-handed, we could make a rule that under isolatedDeclarations
, in export default SomeExpr
, SomeExpr
must resolve to a local annotated variable, and tools would have to do a little bit of local resolution to copy over the annotation.
@DanielRosenwasser It will create a restriction for default exports. But there are probably other limitations as to what you can write in this mode.
For example there is no place to annotate a class that extends an expression either:
class Base {}
function id<T> (cls: T){ return cls;}
// No place to annotate id(Base)
class Derived extends id(Base) {}
I had a similar pain point where the 3rd party dependency didn't have its inner type exported causing TS cannot even compile under composite mode, and I proposed restricting the list of types generated in the .d.ts
file: #47111
It seems that they're highly related and can complement each other. Restricting what will be generated in .d.ts can probably improve the perf as well in the way that less things needs to be checked against a dependent module.
FWIW this issue came up a bit in the context of "type-mutating decorators" #51347
The issue was part of #49074 (TypeScript 4.8 Iteration Plan) and #50457 (TypeScript 4.9 Iteration Plan) but is not part of #51362 (TypeScript 5.0 Iteration Plan).
@DanielRosenwasser Do you still have plans to investigate this or did this drop from the roadmap for the foreseeable future?
Sorry for the long time since the last update. I have been partially working on this, but other duties have made progress a bit slow. This will become my main project starting next week, and I expect to continue working on it for at least the next couple of months.
So far I have a partial implementation of the flag in TypeScript which I have pushed here.
I have also been working on an independent implementation of declaration emit. This is based on the TypeScript emitter - the code is literally extracted. The idea with this external implementation is to:
- Allow us to find and remove any dependencies between declaration emit and type-checking
- Allow us to have a set of tests that we can run between versions of TS to see if declarations are still independent from type checking (avoid adding dependencies to type checking in isolated declarations mode in future versions of TypeScript)
- Assess the amount effort that needs to be put into implementing the new declaration emit
Some preliminary findings:
- Paths are rewritten for module specifiers in the existing declaration emit. I am investigating what specifically these are and why they are needed.
- When determining the optionality of a parameter the type checker is used, although I donβt think it is strictly necessary for it to be.
- Expando functions (functions that have extra properties added to them through assignments) canβt be allowed when using isolated declarations (explicit namespaces will be needed). These currently rely on inference but there is a workaround.
@dragomirtitian Do you mind sharing some docs about expando functions? Never heard about that.
( Search this notes page for "expando" to get some details )
For the latest status, including documentation on the consequential restrictions this flag places on source code, please see the longest description in world on @dragomirtitian's work-in-progress PR.
Another status report has been posted on the work-in-progress PR. It contains data collected after trying out this feature on production codebases by @dragomirtitian (Bloomberg) and @h-joo (Google) as well as some initial performance data collected from using the feature to unlock parallel builds.
I'm probably the least qualified person here, but I think this is not the right way to approach this problem.
What if we evaluate inferred types from the source file when they're being created, and write the declaration file right next the source on the disk so later can be used in the compiler for type checking instead of running full control flow analysis. This also opens parallelism possibilities. Something like how the OCaml compiler works.
If you ask folks what's the lest part of golang, it'll be most likely writing verbose types. Of course it allows a super fast type checking but in TS types are not as simple as in go. TS types can be very complex, and hand writing them is just a burden.
We should require folks to write less verbose code, and opt in for more inference, then the TypeScript will be loved by all!
What if we evaluate inferred types from the source file when they're being created, and write the declaration file right next the source on the disk so later can be used in the compiler for type checking instead of running full control flow analysis.
Maybe I am missing something, but this is how declaration emit works today.
When you write --declaration
, it writes .d.ts
files to disk which can then be used by other projects. It requires an unbounded amount of type-checking work to do that for a single file. The --incremental
flag keeps track of this at an even more granular level so that fewer files ever need to be re-checked and re-emitted - but again, this is possibly-unbounded work. What this all means at a project boundary is that all dependencies need to be fully type-checked and emitted to declaration files before initiating a compilation.
With isolated declaration emit, you can emit .d.ts
files with no type-checking at all because the declarations either have "trivial" initializers or are fully annotated. TypeScript can then spend less time emitting .d.ts
files, and because .d.ts
emit does not require any sort of type-checking, it removes the need to check in the order of the dependency graph of your projects. What that means is that you could execute TypeScript in parallel across projects
With isolated declaration emit, you can emit .d.ts files with no type-checking at all because the declarations either have "trivial" initializers or are fully annotated. TypeScript can then spend less time emitting .d.ts files, and because .d.ts emit does not require any sort of type-checking, it removes the need to check in the order of the dependency graph of your projects. What that means is that you could execute TypeScript in parallel across projects
Maybe I'm misunderstanding the problem. What's the difference between having the types hand written, or generated by the compiler? If we could generate the same declaration that this proposal is asking for by the compiler, shouldn't it solve the problem?
Apologies in advanced the thread is too long to look for answers. For my own sakes, how will this proposal deal with global types, or import declarations such as The PR explains everything.import('some-module').loadConfigs
?
Update:
From what I understand the TypeScript compiler is really capable, it gives a perfect analysis when types don't match, hell it gives specific errors including all the details of what's the issue. Imagine it generates an isolated declaration file for a single file today, it will continue to work, and boost performance gradually by doing that. This even keeps it backward compatible. Of course it won't give the same performance boost working with existing node_module/@types but that's the same for this proposal, and folks can gradually update their packages.
Time will tell but I think it will take lots of time, and efforts for all TS users to adapt this proposal. It will reduce DX, spiral controversies, and spread confusions by generating errors for existing types in the community.
Fully annotating is an opt-in for faster build performance and easier integration with other build tools like bundlers (which could themselves also create declaration file bundles!), so it's on a team-by-team basis to decide whether or not it's a worthwhile tradeoff.
If we could generate the same declaration that this proposal is asking for by the compiler, shouldn't it solve the problem?
It wasn't clear to me, but let me take a different stab at this. It sounds like what you're asking is "why can't the compiler just separately infer these things and print them out?". For a codebase where B
and C
depends on A
...
flowchart BT
B -->|Depends On| A
C -->|Depends On| A
could one perform declaration emit with the compiler on A
while checking A
in parallel, and then eagerly check and perform .d.ts
emit on B
and C
while A
might still be getting type-checked?
I suppose technically one could! It's not entirely necessary for the compiler to perform a full type check to perform declaration emit if you perform it separately, though in the worst (maybe somewhat-uncommon) case, you could still end up doing a full type-check across the program to perform declaration emit.
It doesn't fully solve the problem of enabling a fully parallel type-checking build across projects - but maybe that is a good middle-ground for some people, especially if they're willing to mostly annotate across projects.
I think this sort of build approach is something that technically could be done today with our APIs.
I suppose technically one could! It's not entirely necessary for the compiler to perform a full type check to perform declaration emit if you perform it separately, though in the worst (maybe somewhat-uncommon) case, you could still end up doing a full type-check across the program to perform declaration emit.
Interesting. I'm wondering how much performance do we gain if TSC implements this with the incremental builds. This is quite important as it has a minimal impact on the existing source codes.
Referring to this:
The problem with that is that we cannot produce .d.ts files without running full type checking, I believe purely due to type inference.
This can be done separately outside of this proposal. If the problem of parallelizing the type checker is not having exported inferred types, can't we solve this in a more elegant way to avoid impacting the developer experience?
First but probably the most naive solution is to keep the .d.ts
files along with the source, that will allow a static type check for the compiler from the start while the declaration emitter is verifying/correct, and update them.
Another solution would be using an auto-inferred type annotation syntax.
If TS offered producing auto-inferred types for exported declarations, it would address the DX issues of this proposal. Assuming if we have a function a()
that has no explicit return type annotation while being created, TypeScript could place an auto-generated inferred type, and annotate it on file saves. Using a syntax to differentiate between an explicit type annotation vs an auto-generated one allow the tooling to deliver the best experience while reasoning with the code. This should have a minimum performance penalty as users mostly change a hand full of files at once, and as mentioned the declaration emit can be done in parallel.
An example scenario that might help visualize the flow (syntax aside):
// step #1 - user saves
export function A(): ? {
return 'validation' as const;
}
// step #2 - file saved
export function A(): ?'validation' {
return 'validation' as const;
}
// step #3 - user saves
import {x} from 'external-validation';
export function a<T extends string>(prop: T): ?'validation' {
return {
validation: x(prop),
};
}
// step #4 - file saved
import {x} from 'validation'; // declared function x<T>(value: T): ComplexType<T>
export function a<T extends string>(prop: T): ?{ validation: ComplexType<T>} {
return {
validation: x(prop),
};
}
Isolated Declarations with Unified Emit
a.k.a Isolated Declarations (Take Two)
Here's a brief update on where we are with the project after taking a change of approach at the end of January.
Previous Approach
The implementation of Isolated Declarations in #53463 intentionally avoided changing existing declaration emit. Keeping the new emit independent seemed like a good way to de-risk introduction of the feature, but led to various points of unfortunate duplication. This was most visible in the high volume of duplicate tests that dominated the size of the PR.
Current Approach
The feedback was that TypeScript's existing declaration emit should be enhanced to match the new emit produced by an isolated emitter where possible. So we are enhancing the classic declaration emit process to attempt a more verbatim extraction of the type declarations from the source files, and only falling back to multi-file crawling via the checker when isolated extraction fails. Crucially it means there will only be one declaration emit for a given source file. This simplifies the implementation, significantly reduces the testing burden, and allows third-party declaration emitters to use TypeScript's declaration emit as the reference target they need to match.
Once this in-place declaration emit upgrade is complete, we shall add the --isolatedDeclarations
flag to trigger error checks for all the cases where the single-file extraction failed. The declaration emitter will error instead of falling back to checker-based emit. This error will prompt the user to explicitly annotate their source file with quick-fix suggestions. Whilst it may seem untidy for checks to live outside the checker, this is a continuation of the existing pattern for declaration emit blockers to originate from the declaration emitter. This optimizes for implementation simplicity.
Delivery Plan
There are a series of changes to unify declaration emit labeled [ID-Prep]
that mostly build on each other.
- 1. Literal Types Issue1 Issue2 Issue3 PR
- 2. Triple slash references Issue PR
- 3. Preserve type for fields/parameter with implicit any Issue PR
- 4. Preserve type from type assertions Issue PR
- 5. Preserve type from function expressions Issue PR
- 6. Preserve type from const array literals Issue TODO PR TODO
- 7. Preserve type from object literals Issue TODO PR TODO
- N.
--isolatedDeclarations
Preview branch of final state PR TODO
@robpalme thanks for all the work you've put into this. Do you folks have any idea if we can auto-fix returned inferred types in a project using a script?
100% of errors reported by --isolatedDeclarations
have automatic quickfix suggestions thanks to @h-joo's work. That was a guiding principle. So naturally that includes the missing return type errors.
@dragomirtitian created an interactive CLI script (not public yet) to convert a whole codebase using the quickfixes. Where more than one quickfix is offered, it allows the user to select which one to apply. Sometimes you need a human with taste to pick the best choice.
https://github.com/microsoft/ts-fix should also work.
https://github.com/microsoft/ts-fix should also work.
@jakebailey, Tried to build, and run that locally with the isolated declaration branch, and the whole process of building them locally, and make ts-fix
use the local custom version of TypeScript is just not as straightforward as I thought.
ts-fix
ended up saying there is no error to fix even though when I use the local built version of isolated declaration I see 2k errors. I must have made a mistake linking the typescript version for ts-fix
, but overall a more straightforward approach would be appreciated.
β ts-fix git:(main) node dist/cli.js -e 9007 -p /home/web/packages/tsconfig.json
The project is being created...
Using TypeScript 5.5.0-dev
No diagnostics found with code 9007
Found 0 codefixes
No changes remaining for ts-fix
Done
Thanks to all the folks who worked on this - but a special shout-out to @dragomirtitian who put in a ton into making this happen in TypeScript 5.5! πππ