Proposal for generators design
JsonFreeman opened this issue · 107 comments
A generator is a syntactic way to declare a function that can yield. Yielding will give a value to the caller of the next() method of the generator, and will suspend execution at the yield point. A generator also supports yield *, which means that it will delegate to another generator and yield the results that the inner generator yields. yield and yield * are also bi-directional. A value can flow in as well as out.
Like an iterator, the thing returned by the next method has a done property and a value property. Yielding sets done to false, and returning sets done to true.
A generator is also iterable. You can iterate over the yielded values of the generator, using for-of, spread or array destructuring. However, only yielded values come out when you use a generator in this way. Returned values are never exposed. As a result, this proposal only considers the value type of next() when the done property is false, since those are the ones that will normally be observed.
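The behavior described above can be sketched at runtime as follows. This is only an illustration: countThenFinish is a hypothetical generator, and the annotation assumes a recent TypeScript compiler whose built-in Generator type takes three type arguments, which may differ from the types discussed in this proposal.

```typescript
// Hypothetical generator used only for illustration.
function* countThenFinish(): Generator<number, string, undefined> {
    yield 1;
    yield 2;
    return "finished"; // visible only through next(), never through for-of or spread
}

const it = countThenFinish();
const first = it.next();  // yielded: done is false
const second = it.next(); // yielded: done is false
const last = it.next();   // returned: done is true, value is "finished"

// Iteration constructs observe only the yielded values.
const viaSpread = [...countThenFinish()];
```

Here viaSpread contains only 1 and 2; the returned string is dropped, which is why the proposal focuses on the value type when done is false.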
Basic support for generators
Type annotation on a generator
A generator function can have a return type annotation, just like a function. The annotation represents the type of the generator returned by the function. Here is an example:
function *g(): Iterable<string> {
    for (var i = 0; i < 100; i++) {
        yield ""; // string is assignable to string
    }
    yield * otherStringGenerator(); // otherStringGenerator must be iterable and element type assignable to string
}
Here are the rules:
- The type annotation must be assignable to Iterable<any>.
  - This has been revised: IterableIterator<any> must be assignable to the type annotation instead.
- The operand of every yield expression (if present) must be assignable to the element type of the generator (string in this case).
- The operand of every yield * expression must be assignable to Iterable<any>.
- The element type of the operand of every yield * expression must be assignable to the element type of the generator (string is assignable to string).
- The operand of a yield expression (if present) is contextually typed by the element type of the generator (string).
- The operand of a yield * expression is contextually typed by the type of the generator (Iterable<string>).
- A yield expression has type any.
- A yield * expression has type any.
- Originally: the generator is allowed to have return expressions as well, but they are ignored for the purposes of type checking the generator type. Revised: the generator cannot have return expressions.
  - Open question: Do we want to give an error for a return expression that is not assignable to the element type? If so, we would also contextually type it by the element type.
  - Answer: we will give an error on all return expressions in a generator. Consider relaxing this later.
- Open question: Should we allow void generators?
  - Answer: no
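A small runnable sketch of an annotated generator that satisfies these rules, assuming a recent TypeScript compiler. otherStringGenerator is given a hypothetical body here because the proposal does not define it.

```typescript
// Hypothetical definition; the proposal only names this function.
function* otherStringGenerator(): IterableIterator<string> {
    yield "a";
    yield "b";
}

// The annotation is one that IterableIterator<any> is assignable to.
function* g(): IterableIterator<string> {
    yield "x";                      // operand is assignable to the element type (string)
    yield* otherStringGenerator();  // operand is iterable with element type string
}

const all = Array.from(g()); // ["x", "a", "b"]
```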
Inferring the type of a generator
A generator function with no type annotation can have the type annotation inferred. So in the following case, the type will be inferred from the yield statements:
function *g() {
    for (var i = 0; i < 100; i++) {
        yield ""; // infer string
    }
    yield * otherStringGenerator(); // infer element type of otherStringGenerator
}
- Rather than inferring Iterable, we will infer IterableIterator, with some element type. The reason is that someone can call next directly on the generator without first getting its iterator. A generator is in fact an iterator as well as an iterable.
- The element type is the common supertype of all the yield operands and the element types of all the yield * operands.
- It is an error if there is no common supertype.
- As before, the operand of every yield * expression must be assignable to Iterable<any>.
- yield and yield * expressions again have type any.
- If the generator is contextually typed, the operands of yield expressions are contextually typed by the element type of the contextual type.
- If the generator is contextually typed, the operands of yield * expressions are contextually typed by the contextual type.
- Originally: return expressions are allowed, but not used for inferring the element type. Revised: return expressions are not allowed. Consider relaxing this later, particularly if there is no type annotation.
  - Open question: Should we give an error for return expressions not assignable to element type (same as the question above)
  - Answer: no return expressions.
- If there are no yield operands and no yield * expressions, what should the element type be?
  - Answer: implicit any
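The point that a generator is an iterator as well as an iterable can be checked at runtime. A minimal sketch, with letters as a hypothetical unannotated generator:

```typescript
// Hypothetical generator with no return type annotation.
function* letters() {
    yield "a";
    yield "b";
}

const gen = letters();

// A generator object returns itself from Symbol.iterator,
// so it can be consumed either way.
const sameObject = gen[Symbol.iterator]() === gen;

const head = gen.next().value; // direct call to next, no cast needed
const rest = [...gen];         // iteration continues from where next left off
```

Because next was called once before spreading, rest holds only the remaining value.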
The * type constructor
Since the Iterable type will be used a lot, it is a good opportunity to add a syntactic form for iterable types. We will use T* to mean Iterable<T>, much the same as T[] is Array<T>. It does not do anything special, it's just a shorthand. It will have the same grammatical precedence as [].
Question: Should it be an error to use the * type if you are compiling below ES6?
The good thing about this design is that it is super easy to create an iterable by declaring a generator function. And it is super easy to consume it like you would any other type of iterable.
function *g(limit) {
    for (var i = 0; i < limit; i++) {
        yield i;
    }
}
for (let i of g(100)) {
    console.log(i);
}
var array = [...g(50)];
var [first, second, ...rest] = g(100);
Drawbacks of this basic design
1. The type returned by a call to next is not always correct if the generator has a return expression.
function *g() {
    yield 0;
    return "";
}
var instance = g();
var x = instance.next().value; // x is number, correct
var x2 = instance.next().value; // x2 is given type number, but it's actually a string!
This implies that maybe we should give an error when return expressions are not assignable to the element type. Though if we do, there is no way out.
2. The types of yield and yield * expressions are just any. Many users will not care about these, but the type of the yield expression is useful if, for example, you are implementing await on top of yield.
3. If you type your generator with the * type, it does not allow someone to call next directly on the generator. Instead they must cast the generator or get the iterator from the generator.
function *g(): number* {
    yield 0;
}
var gen = g();
gen.next(); // Error, but allowed in ES6 (preferred in fact)
(<IterableIterator<number>>gen).next(); // works, but really ugly
gen[Symbol.iterator]().next(); // works, but pretty ugly as well
To clarify, issue 3 is not an issue for for-of, spread, and destructuring. It is only an issue for direct calls to next. The good thing is that you can get around this by either leaving off the type annotation from the generator, or by typing it as an IterableIterator.
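Since the T* shorthand is only proposed here, the same situation can be sketched with the Iterable<number> annotation it would stand for. This assumes a recent TypeScript compiler; the failing call is left as a comment so the block still compiles.

```typescript
function* g(): Iterable<number> {
    yield 0;
}

const gen = g();
// gen.next(); // compile error: Iterable<number> has no next method

// Workaround 1: get the iterator from the iterable.
const viaIterator = gen[Symbol.iterator]().next();

// Workaround 2: cast to IterableIterator.
const viaCast = (g() as IterableIterator<number>).next();
```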
Advanced additions to proposal
To help alleviate issue 2, we can introduce a nominal Generator type (already in es6.d.ts today). It is an interface, but the compiler would have a special understanding of its type arguments. It would look something like this:
interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield /*| TReturn*/> {
    next(n: TNext): IteratorResult<TYield /*|TReturn*/>;
    // throw and return methods elided
}
Notice that TReturn is not used in the type, but it will have special meaning if you are using something that is nominally a Generator. Use of the Generator type annotation is purely optional. The reason that we need to omit TReturn in the next method is so that Generator can be assignable to IterableIterator<TYield>. Note that this means issue 1 still remains.
- The type of a yield expression will be the type of TNext
function *g(): Generator<number, any, string> {
    var x = yield 0; // x has type string
}
- If the user does not specify the Generator type annotation, then consuming a yield expression as an expression will be an implicit any. Yield expression statements will be unaffected.
- For a return expression not assignable to the yield type of the generator, we can give an error (require a type annotation) or we can infer Generator<TYield, TReturn, any>?
function *g() {
    yield 0;
    return ""; // Error or infer TReturn as string
}
Once we have TReturn in place, the following rules are added:
- If the operand of yield * is a Generator, then the yield * expression has the type TReturn (the second type argument of that generator).
- If the operand of a yield * is a Generator, and the yield * expression is inside a Generator, TNext of the outer generator must be assignable to TNext of the inner one.
function *g1(): Generator<any, any, string> {
    var t = yield * g2(); // Error that string is not assignable to number
}
function *g2(): Generator<any, any, number> {
    var s = yield 0;
}
- If the operand of yield * is not a Generator, and the yield * is used as an expression, it will be an implicit any.
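The TNext and TReturn rules above can be sketched as follows. This assumes a recent TypeScript compiler whose built-in Generator type takes the same three type arguments; it may not match the interface sketched in this proposal. inner and outer are hypothetical names.

```typescript
// TNext is boolean, so the yield expression has type boolean.
function* inner(): Generator<number, string, boolean> {
    const flag = yield 1;
    return flag ? "yes" : "no"; // TReturn is string
}

// TNext of outer (boolean) is assignable to TNext of inner (boolean).
function* outer(): Generator<number, number, boolean> {
    const text = yield* inner(); // yield* evaluates to inner's TReturn
    return text.length;
}

const o = outer();
const started = o.next();   // runs to the yield inside inner
const end = o.next(true);   // resumes inner, which returns "yes"; outer returns 3
```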
Ok, now for issue 1, the incorrectness of next. There is no great way to do this. But one idea, courtesy of @CyrusNajmabadi, is to use TReturn in the body of the Generator interface, so that it looks like this:
interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield> {
    next(n: TNext): IteratorResult<TYield | TReturn>;
    // throw and return methods elided
}
As it is, Generator will not be assignable to IterableIterator<TYield>. To make it assignable, we would change assignability so that every time we assign Generator<TYield, TReturn, TNext> to something, assignability changes this to Generator<TYield, any, TNext> for the purposes of the assignment. This is very easy to do in the compiler.
When we do this, we get the following result:
function *g() {
    yield 0;
    return "";
}
var g1 = g();
var x1 = g1.next().value; // number | string (was number with old typing)
var x2 = g1.next().value; // number | string (was number with old typing, and should be string)
var g2: Iterator<number> = g(); // Assignment is allowed by special rule!
var x3 = g2.next().value; // number, correct
var x4 = g2.next().value; // number, should be string
So you lose the correctness of next when you subsume the generator into an iterable/iterator. But you at least get general correctness when you are using it raw, as a generator.
Additionally, operators like for-of, spread, and destructuring would just get TYield, and would be unaffected by this addition, including if they are done on a Generator.
Thank you to everyone who helped come up with these ideas.
I've updated the proposal with the results of further discussion. There have only been a few minor changes:
- Answers to open questions in the basic proposal.
- Change the rule about what return type annotations are allowed on a generator function. IterableIterator<any> must be assignable to the return type annotation.
- Return expressions are not allowed in generators. We can consider relaxing this later if the need arises.
For the sake of completeness, I think it would be extremely helpful to actually state the current declarations of the types named here:
interface IteratorResult<T> {
    done: boolean;
    value?: T;
}
interface Iterator<T> {
    next(value?: any): IteratorResult<T>;
    return?(value?: any): IteratorResult<T>;
    throw?(e?: any): IteratorResult<T>;
}
interface Iterable<T> {
    [Symbol.iterator](): Iterator<T>;
}
interface IterableIterator<T> extends Iterator<T> {
    [Symbol.iterator](): IterableIterator<T>;
}
interface GeneratorFunction extends Function {
}
interface GeneratorFunctionConstructor {
    /**
     * Creates a new Generator function.
     * @param args A list of arguments the function accepts.
     */
    new (...args: string[]): GeneratorFunction;
    (...args: string[]): GeneratorFunction;
    prototype: GeneratorFunction;
}
declare var GeneratorFunction: GeneratorFunctionConstructor;
interface Generator<T> extends IterableIterator<T> {
    next(value?: any): IteratorResult<T>;
    throw(exception: any): IteratorResult<T>;
    return(value: T): IteratorResult<T>;
    [Symbol.iterator](): Generator<T>;
    [Symbol.toStringTag]: string;
}
Looks good!
Got a little lost in the first post, but I'm going to write what I understood, and you guys can correct me if I'm wrong:
function *g () {
    var result: TNext = yield <TYield>mything()
}
- g cannot contain the statement return.
- All yield keywords must be treated as the same type (called TNext) that can be a union.
- All calls to a ginst (an instance of g) of the form ginst.next(...) must pass a parameter of type TNext (assuming that's only if TNext is not null, I don't know if TNext can be null).
- Any value on the right of the yield keyword must be of type TYield, and if omitted is treated as the value undefined.
- An instance of g can be typed as follows: var ginst: TYield* but then you must cast to an IterableIterator (or something similar) before calling ginst.next (just a note here - yuck?)
Is there anything important that I missed here?
Request:
A nicer way of defining generator types e.g. for a generator,
function* g(value: number) {
    while (true) {
        value += yield value;
    }
}
something like:
var ginst: GeneratorInstance<number, number>
and
var gtype: *g(start: number)=>GeneratorInstance<number, number>;
For the following code:
ginst = g(0);
ginst.next(2);
gtype = g;
👍 for generators
edit: fixed putting *'s in all the wrong places.
... Also the lack of a return statement annoys me, I think it should be forced to have the same type as yield, and if it's a different type (and yield is being implicitly typed) the return type should force a change to the implicitly derived type for yield.
To summarise; In a generator function return is treated identically to yield.
This way I can have my generators actually end on a value that's not forced to be undefined (by Typescript).
@Griffork from what I understand, you can have return statements, just not return expressions - specifically, you can't return a value, but you can bail out from within the generator at any point.
This probably doesn't help your frustration in the return type being ignored; however, it would certainly help to get some realistic use cases for what exactly you'd like to return when a generator has terminated.
@DanielRosenwasser not sure I understand.
I guess what you're calling a return expression is: return true; ?
If that is the case, then how is a return statement different to a return expression?
Here's an example of the type of generator I was thinking of when I voiced my discomfort:
function* g (command) { // parameter renamed from "case", which is a reserved word
    while (true) {
        switch (command) {
            case "dowork1":
                //do stuff
                command = yield "OPERATIONAL - OK";
                break;
            case "dowork2":
                //do stuff
                command = yield "OPERATIONAL - OK";
                break;
            case "shutdown":
                //do stuff
                return "COMPLETE";
        }
    }
}
Where it may execute an arbitrary number of times, but at some point it's "completed" and it notifies its caller that it's done.
My concern (which I have not yet researched) is that without the return statement, there might be garbage-collection problems on some systems (particularly since the whole function-state has to be suspended and resumed on a yield), which is bad if you're spawning a lot of similarly-structured generators/iterators.
It also makes the function read a lot more clearly in my opinion.
> I guess what you're calling a return expression is: return true; ?
That is a return statement, for which the return expression is true.
In other words, a return expression is the expression being returned in a return statement.
> Where it may execute an arbitrary number of times, but at some point it's "completed" and it notifies its caller that it's done.
From what I understand of your example, you return "COMPLETE" to indicate that the generator is done, which I don't see as any more useful than the done property on the iterator result. We need some more compelling examples.
Though, now that I think about it, if there are multiple ways to terminate (i.e. shutdown or failure), that's when the returned value in a state-machine-style generator would be useful.
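A sketch of such a state-machine-style generator with more than one way to terminate. The names worker, Command and Outcome are hypothetical, and the annotation assumes a recent TypeScript compiler with a three-argument Generator type.

```typescript
type Command = "work" | "shutdown" | "fail"; // hypothetical input commands
type Outcome = "COMPLETE" | "FAILED";        // hypothetical termination reasons

function* worker(): Generator<string, Outcome, Command> {
    let command: Command = yield "READY";
    while (true) {
        if (command === "shutdown") return "COMPLETE";
        if (command === "fail") return "FAILED";
        // do the work, then wait for the next command
        command = yield "OPERATIONAL - OK";
    }
}

const w = worker();
w.next();                        // runs to the first yield ("READY")
const working = w.next("work");  // still running
const stopped = w.next("fail");  // done, and the returned value says why
```

Here done alone tells the caller that the worker stopped, while the returned value distinguishes a clean shutdown from a failure.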
@DanielRosenwasser got it, thanks for the clarification :).
I'd argue that a correct implementation would allow return expressions, and type them distinctly from yield expressions.
Generators are commonly used in asynchronous task runners, such as co. Here is an example:
var co = require('co');
var Promise = require('bluebird');
// Return a promise that resolves to `result` after `delay` milliseconds
function asyncOp(delay, result) {
    return new Promise(function (resolve) {
        setTimeout(function () { resolve(result); }, delay);
    });
}
// Run a task asynchronously
co(function* () {
    var a = yield asyncOp(500, 'A');
    var ab = yield asyncOp(500, a + 'B');
    var abc = yield asyncOp(500, ab + 'C');
    return abc;
})
    .then(console.log)
    .catch(console.log);
The above program prints 'ABC' after a 1.5 second pause.
The yield expressions are all promises. The task runner awaits the result of each yielded promise and resumes the generator with the resolved value.
The return expression is used by the task runner to resolve the promise associated with the task itself.
In this use case, yield and return expressions are (a) equally essential, and (b) have unrelated types that ideally would be kept separate. In the example, TYield is Promise<string> and TReturn is string. There is no reason why they would be conflated into one type in a task runner.
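To make the roles of TYield and TReturn concrete, here is a minimal sketch of a task runner. The run function is hypothetical and is not co's actual implementation; among other simplifications, a rejected promise rejects the whole task instead of being thrown back into the generator. It assumes a recent TypeScript compiler with a three-argument Generator type and the built-in Promise.

```typescript
// Hypothetical minimal runner, for illustration only.
function run<TReturn>(
    task: () => Generator<Promise<unknown>, TReturn, unknown>
): Promise<TReturn> {
    return new Promise<TReturn>((resolve, reject) => {
        const gen = task();
        function step(input?: unknown): void {
            try {
                const result = gen.next(input);
                if (result.done) {
                    resolve(result.value);            // TReturn resolves the task
                } else {
                    result.value.then(step, reject);  // TYield is awaited, then sent back in
                }
            } catch (err) {
                reject(err);
            }
        }
        step();
    });
}

function asyncOp(delay: number, result: string): Promise<string> {
    return new Promise((resolve) => setTimeout(() => resolve(result), delay));
}

const finished = run(function* (): Generator<Promise<string>, string, unknown> {
    const a = (yield asyncOp(10, "A")) as string;
    const ab = (yield asyncOp(10, a + "B")) as string;
    return ab + "C";
});
```

The casts on the yield expressions reflect the gap this thread is discussing: the runner knows each resumed value is the resolved string, but the generator's type alone cannot express that.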
@yortus I'm not sure what you're asking for is at all possible, or if it makes any sense, I'll try to explain where I'm confused.
The only way to start or resume a generator is the generator's .next function. This function takes a single argument (which is supplied in place of the yield expression) and returns a single value (which is the value to the right of the yield expression).
The following Javascript:
function* g() {
    var a = yield "a";
    var b = yield a + "b";
    var c = yield b + "bc";
    return 0;
}
var ginst = g();
// next must be called on the instance, and the yielded value is in .value
console.log(ginst.next().value + ginst.next("a").value + ginst.next("a").value);
return ginst.next("").value;
Is the equivalent to
console.log(("a") + ("a" + "b") + ("a" + "bc"));
return 0;
But what happens if I try:
var done = false;
var value;
while (!done) {
    var result = ginst.next(value);
    value = result.value;
    done = result.done; // without this update the loop never terminates
    console.log(value);
}
I get:
"a"
"ab"
"abbc"
0
The last one is a number, meaning if ginst.next is to be called in a loop, the return type must be string|number or it may be incorrect.
It's important to note here that the proposal that yield and return are treated identically will work for co's consumption, and for Promises. If it will help I can write some example implementations.
Like that last suggestion:
interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield> {
    next(n: TNext): IteratorResult<TYield | TReturn>;
    // throw and return methods elided
}
Seems ok to lose the correctness of next when you subsume the generator into an iterable/iterator.
Does that solve drawback #3?
Not sure if I like T*, this looks clearer:
function *g(): Generator<number, string, any> {
    yield 0;
    return "";
}
var a: Iterable<number|string> = g();
// lose correctness
var b: Iterable<number> = g();
// consider using *T for better symmetry instead of T*
var c: *number = g();
@jbondc *T has better symmetry, but can be confusing because *T doesn't denote a generator here, it denotes an iterable, which, while it can be the same thing, can also not be the same thing.
If you read * as 'many values' from thing, it works well for generators and iterators. Likely T* bothers me because it looks like a pointer if you write string*
Another example of using generators to support asynchronous control flow. This is working code, runnable in current io.js. There are some comments showing the runtime types of TYield and TReturn. When generators are used in this way, these types tend to be unrelated to each other. The most useful type to have inferred in this example is probably the TReturn type.
var co = require('co');
var Promise = require('bluebird');
var fs = Promise.promisifyAll(require('fs'));
var path = require('path');
// bulkStat: (dirpath: string) => Promise<{ [filepath: string]: fs.Stats; }>
var bulkStat = co.wrap(function* (dirpath) {
    // filenames: string[], TYield = Promise<string[]>
    var filenames = yield fs.readdirAsync(dirpath);
    var filepaths = filenames.map(function (filename) {
        return path.join(dirpath, filename);
    });
    // stats: Array<fs.Stats>, TYield = Array<Promise<fs.Stats>>
    var stats = yield filepaths.map(function (filepath) {
        return fs.statAsync(filepath);
    });
    // result: { [filepath: string]: fs.Stats; }
    var result = filepaths.reduce(function (result, filepath, i) {
        result[filepath] = stats[i];
        return result;
    }, {});
    // TReturn = { [filepath: string]: fs.Stats; }
    return result;
});
bulkStat(__dirname)
    .then(function (stats) {
        console.log(`This file is ${stats[__filename].size} bytes long.`);
    })
    .catch(console.log);
// console output:
// This file is 1097 bytes long.
The function bulkStat stats all the files in the specified directory and returns a promise of an object that maps file paths to their stats.
Note that the TReturn type is unrelated to either of the TYield types, and the two TYield types are unrelated to each other.
- g can contain return statements, but they cannot return a value, as explained by @DanielRosenwasser.
- Yes, if you type the generator with a *, you'll have to cast to IterableIterator to call next directly. If you leave off the type, it will be inferred as IterableIterator. And you can certainly type it as IterableIterator to begin with.
- Sounds like what you are asking for is a type that is Nextable. Namely, a type that specifies the in-type as well as the out-type. That seems reasonable. The one caveat is that most consumers (for-of, spread, destructuring assignments) will never be passing a value to next. Would you recommend that if a generator's next requires a value to be sent in, then it is an error to use for-of on the generator? In other words, we would only allow these two-way generators to be consumed by calling next directly, or by yield*.
- Regarding return values: The problem with basing the element type on return expressions is that most iterations (for-of, spread, etc) will never observe the value returned by the return statement. So if the generator yields one type, but returns another type, we don't want to pollute the element type with the return, when 90% of users will never even see the return. What use cases do you have in mind for the return value? Keep in mind that the return value can only be observed by calling next directly, or by using yield*.
@yortus
I agree that the primary case for passing in a value to next is async frameworks, since you want to pass the value the awaited Promise was resolved with. And I see your point about the return value being used to signify the fulfilled value of the Promise being created. I suppose the limitation of the basic proposal is that while it is great at typing generators as an implementation of an iterable, it does not give strong treatment to using generators as async state machines. Suppose we relaxed the restriction on return values, and the type system just ignored them. Would that be acceptable? We would allow everything that is required to write your async state machines, but there would be a lot of any types floating around. Presumably this is a pretty advanced use case.
Without dependent types, it becomes very hard to hold onto TReturn without having it pollute TYield. Ideally, we would have one type associated with done: false and another with done: true. But without that facility, there is really no good place to represent TYield and TReturn separately in the type structure.
@jbondc, I understand your syntactic concern with * looking like a pointer. But I have to agree with @Griffork that *T will be more confusing, because it seems to be intimately tied to generators. And in fact, this type needn't be used with generators. It is just sugar for an Iterable.
Replying on phone, bear with me...
@JsonFreeman oh, good point. I stopped monitoring the straw man before for... of was finalised. The use case that I currently have for return is the state machine example above when you consider that you can also return "ERROR".
On another note, does done = true on error?
Yes, I plan to do some funcy promise-like stuff with a next-able state based generator.
And I like your suggestion that generators that take a value should error in a for-of.
Being able to detect type depending on the value of done sounds good, but I'm not sure how possible that is, as it would be easy to break.
The only way I can see @yortus's example working is if he explicitly passed typing information to co and co used that to type the return function. Either way I don't think it's possible for Typescript to provide what you're asking for, unless someone can give me a working example of how it would be implemented.
@JsonFreeman would it be possible to opt in/out of returning a value?
I don't know where your facts about the typical usage of generators come from. An article like that would be useful to read; would I be able to get a link?
From what you're saying it sounds like most users are liable to use both yield and return to return values from their generator but they don't want to know about the value returned by return.
Or are you trying to say that most users don't use return (I imagine if you're not using return in the generator, it's not going to pollute the yielded value).
- If I provide a way to declare a generator that requires something to be sent in, I could make it an error to consume it with for-of, spread, etc. The problem that remains is the first call to next, which should not take a value. Most consumers will call next() instead of next(undefined) on the first call, so it seems silly to require them to pass a dummy argument. So I could not do this by giving next a required parameter. Given that constraint, I'm not sure how we could distinguish between a generator that can be consumed with for-of and a generator that cannot.
- When you ask if done = true on error, not sure what you mean by "on error".
- Regarding opting in/out of returning a value: I think what you're asking for is to make it legal to return a value, so just remove the error, correct?
- I would not call our assumptions about typical usage "facts". They are more just conjectures at this point because the feature is so new.
- I imagine that most users who are returning a value are doing it by mistake, because they do not realize that the last value cannot be observed by most iteration constructs. Either that, or they do not care to discard it. However, I realize there are some users who are implementing advanced mechanics like async, and who control both the generator and its consumption (similar to @yortus's scenario). And those users have a legitimate reason to return a value. I don't think there are many such users though. It highly depends on whether the generator is supposed to be an iterator, or something more advanced than that.
- The fact remains that if we give a way of enforcing the type of the return values, it becomes very difficult to separate it from the yield type.
It would essentially involve hacking the assignability rules to make sure a generator that returns something is assignable to an iterable when you ignore that return value. Doable, but kind of a hack.
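The point about the first call to next can be seen at runtime: a value passed to the first call has no suspended yield to receive it, so it is discarded. A minimal sketch, with echo as a hypothetical two-way generator and a recent TypeScript compiler assumed:

```typescript
// Hypothetical two-way generator: TNext is string.
function* echo(): Generator<string, void, string> {
    let received = yield "ready";
    while (true) {
        received = yield "got " + received;
    }
}

const e = echo();
const a = e.next("ignored").value; // "ready": the argument to the first call is dropped
const b = e.next("hello").value;   // "got hello": later arguments reach the yield
```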
Oh, ok.
@JsonFreeman when I was first looking up generators, the amount of threads/blogs/posts I found that wanted to use it in a promise fashion vs an iterable was about 10:1. That's why I was asking you for your source. I don't think that the idea that most users will want to use it as an iterable is valid, although it will still be very prevalent, using the generator for promises looks like it will be about equally prevalent if not more.
I see what you mean about the problems with making generators sometimes not iterable. If it's going to be a hack, either don't do it or don't do it yet, leave it to the user and if it's a big problem later you can re-evaluate the decision.
As for opting in/out of returning a value, yes. When I first wrote that I was thinking of something else, but that idea was bad and this one is better.
Again, I don't think you can separate the return type from the yield type due to the way generators are used (although I agree it would be useful, JavaScript's implementation does not make this doable).
@Griffork here is an in-depth article describing many uses and details of generators. TL;DR: the two main use cases so far are (1) implementing iterables and (2) blocking on asynchronous function calls.
@JsonFreeman having TReturn = any always would be a good start. Not allowing return expressions at all would rule out many valid uses of generators. You describe the async framework scenario as 'advanced'. Perhaps so, but in nodeland with its many async APIs, it's already a widespread idiom that works today and is growing in popularity. co has a lot of GitHub stars, a lot of dependents, and a lot of variants.
Interestingly, when crafting generators to pass to co, one cares more about the TResult type and the types returned by yield expressions, whilst the TYield type is not so important.
Side note: the proposal for async functions (#1664) mentions using generators in the transform for representing async functions in ES6 targets. Return expressions are needed there, in fact the proposal shows one in its example code. It would be funny if tsc emitted generators with return expressions as its 'idiomatic ES6' for async functions, but rejected them as invalid on the input side.
@JsonFreeman #2936 mentions singleton types are getting the green light. At least for string literal types. If there was also a boolean literal type, then the next function could return something like { done: false; value?: TYield; } | { done: true; value?: TReturn; }. Then type guards could distinguish the two cases.
I'm just thinking out loud here, so not sure if that would make anything easier, even if it did exist.
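A sketch of that idea, assuming a compiler that supports boolean literal types and narrowing on a discriminant property. StepResult and describe are hypothetical names, and value is made required here (the suggestion above has it optional) to keep the example short.

```typescript
// Hypothetical result type split on the done flag.
type StepResult<TYield, TReturn> =
    | { done: false; value: TYield }
    | { done: true; value: TReturn };

function describe(step: StepResult<number, string>): string {
    if (step.done) {
        // Narrowed to the done: true case, so value is a string here.
        return "returned " + step.value.toUpperCase();
    }
    // Narrowed to the done: false case, so value is a number here.
    return "yielded " + step.value.toFixed(1);
}

const yielded = describe({ done: false, value: 1 });
const returned = describe({ done: true, value: "ok" });
```

Checking done acts as the type guard, so TYield and TReturn stay separate without one polluting the other.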
@Griffork and @yortus, thank you for your points. It sounds like we are leaning towards the solution of the "next" parameter and the return values having type any, but allowing generator authors to return a value. The return type of next will take into account TYield but not TReturn. Would you agree that that solution is a good way to start?
@yortus, as for singleton types, let's see how it goes for strings, and then we can evaluate it for booleans. At that point it would be clearer whether it would help split up TYield and TReturn, but I imagine that it could be just what we need here.
@JsonFreeman sure, I'd be happy with that.
At least then there will be the opportunity to gather feedback from Typescript users instead of relying on speculation (particularly my own).
Thank you for listening, this has been one of the most enjoyable discussions I've had on a Typescript issue :-).
@JsonFreeman sounds good.
Another minor point:
interface Generator<T> extends IterableIterator<T> {
    next(value?: any): IteratorResult<T>;
    throw(exception: any): IteratorResult<T>;
    return(value: T): IteratorResult<T>; // <--- value should not be constrained to T
    [Symbol.iterator](): Generator<T>;
    [Symbol.toStringTag]: string;
}
That's copied from above. Shouldn't the return method be return(value?: any): IteratorResult<T>; ? Calling this method causes the generator to resume and immediately execute return value; . There is no link between the type of value and the T type which is the type of the yield expressions in the generator.
Great, thanks guys!
@yortus, I am actually not sure there is much value in defining the Generator type yet. I'd sooner remove it now, and add it back later if we want to leverage it to support the return value and next value.
But to your point about the return method, yeah I think you're right. I guess you could also define it as
return<U>(value: U): IteratorResult<T | U>;
Meaning it would return something of the yield type, or the thing you passed in. The yield type would only be returned in pathological cases like this:
function* g() {
    try {
        yield 0; // suspended here, and user calls return("hello");
    }
    finally {
        yield 1; // return gets intercepted by this yield expression
    }
}
But I realize that this is a ridiculous reason to include T in there.
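For what it's worth, the pathological case can be run directly. A sketch assuming a recent TypeScript compiler with a three-argument Generator type; the trailing return statement is added only so the annotated return type is satisfied.

```typescript
function* g(): Generator<number, string, undefined> {
    try {
        yield 0; // suspended here when return("hello") is called
    } finally {
        yield 1; // the pending return is intercepted by this yield
    }
    return "natural end"; // not reached in this scenario
}

const gen = g();
gen.next();                              // suspends at yield 0
const intercepted = gen.return("hello"); // finally runs and yields 1 instead of finishing
const completed = gen.next();            // finally finishes, the pending return completes
```

So return can hand back a value of the yield type first, and only the following step carries the value that was passed in.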
@Griffork At least then there will be the opportunity to gather feedback from TypeScript users instead of relying on speculation (particularly my own).
This (feedback driven changes) is definitely our preferred methodology but keep in mind the problem is we can relax a restriction later without breaking people but cannot do the reverse. So defaulting to typing something as any
while permissive also means if we realize it's wrong later (whether based on our own exploration or feedback from others) then the change is much more painful than had we taken the more conservative approach. This is not to say we're just always defaulting to the most conservative option in the face of any uncertainty but it is definitely a large factor when considering which side to come down on when we want to give ourselves room to change/adapt in the future.
@danquirk I understand your concern.
My standing comes from the fact that there are already libraries that require that the generator's return
function works. And I am planning to design a system that requires that the generator's return
function is available, if it is not available my planned library cannot work (not even with a yield replacement).
So yes - I understand that you guys don't want to commit to something that in the future you won't be able to work with, but you must understand that users of TypeScript will require this functionality, and that it may not be as small of a percentage as you may think.
It is almost tempting to return to vanilla Javascript just for the generator support, however the large project that I'm embarking on will suffer from it in the long run.
defaulting to typing something as any while permissive also means if we realize it's wrong later (whether based on our own exploration or feedback from others) then the change is much more painful than had we taken the more conservative approach.
@danquirk that's a valid point and should perhaps rule out the TReturn = any approach.
However if the current proposal to disallow return expressions stands, that will also be pretty painful for people thinking TypeScript supports ES6 generators and reaching for their favourite async control flow library. Perhaps in this case, the proposal/feature should be renamed on the 1.6 roadmap to better qualify it - something like 'Iterable Generators' or 'partial generator support'. As @Griffork points out, async control flow is a fairly major use-case of generators in current ES6 code out there.
As an alternative to TReturn = any, what would happen if the first stage proposal was to accurately infer both the TYield and TReturn types, and accept for now the inconveniences associated with next() returning { done: boolean; value?: TYield | TReturn }? (NB: I'm assuming this is feasible in the compiler, but don't know enough about it to judge.)
This has inconveniences for iterators, but at least it would be correct type-wise and therefore avoid the future-proofing problem @danquirk mentions. The inconveniences could be addressed with syntax or compiler sugar at a later point. But at least generators would be full ES6 generators.
To clarify one thing. I didn't actually mean that we would infer any
from the return expressions. I just meant that we would ignore them instead of giving an error.
Inferring TYield | TReturn for the value means that neither of our two use cases (iteration and async) is pleasant or ideal. Until we can model the done property correctly as a literal type, I'd rather make one use case pleasant, and the other possible. This may sound short-sighted, but I think we can largely avoid breaks later, if we make the stronger type change opt-in. Does that make sense?
I think we can largely avoid breaks later, if we make the stronger type change opt-in.
@JsonFreeman would you mind clarifying what this means in practice?
Sure. My statement presupposes that we will at some point have boolean literal types. The idea is that right now, for a generator, we will infer the return type to be IterableIterator<TYield>
. The return type of next
will be
{
done: boolean;
value: TYield; // not TReturn
}
Then later, let's say we have the opportunity to switch to boolean literal types. At this point, we would not change our inference. We would continue to infer IterableIterator<TYield>
. But the user would have the opportunity to change their return type annotation to a stronger type that does use boolean literals. So we might provide a type Generator<TYield, TReturn>
, whose next method returns
{ done: false; value: TYield } | { done: true; value: TReturn }
The one caveat is that I'm not sure this would be assignable to IterableIterator<TYield>
. But we don't even need it to be. Because if you are typing your generator this way, you are willing to give up its iterability. How does that sound?
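To make the literal-typed shape concrete, here is a minimal sketch of how a consumer could narrow on done once boolean literal types exist. StepResult and drain are hypothetical names, not part of the proposal.

```typescript
// Hypothetical names: StepResult and drain are not part of the proposal.
type StepResult<TYield, TReturn> =
    | { done: false; value: TYield }
    | { done: true; value: TReturn };

// Pulls results until done is true, keeping yielded and returned values apart.
function drain<TYield, TReturn>(
    next: () => StepResult<TYield, TReturn>
): { yielded: TYield[]; returned: TReturn } {
    const yielded: TYield[] = [];
    while (true) {
        const step = next();
        if (step.done) {
            // Narrowed to the `done: true` member, so value is TReturn.
            return { yielded, returned: step.value };
        }
        // Narrowed to the `done: false` member, so value is TYield.
        yielded.push(step.value);
    }
}

// A hand-written source that yields 0, 1, 2 and then returns "completed".
let counter = 0;
const drained = drain<number, string>(() =>
    counter < 3
        ? { done: false, value: counter++ }
        : { done: true, value: "completed" }
);
console.log(drained.yielded, drained.returned);
```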
If you can make it work it sounds good.
I did not think
{ done: false; value: TYield } | { done: true; value: TReturn }
was stricter than, or assignable to,
{
done: boolean;
value: TYield;
}
Or will the former be only with return and the latter be only without?
It is not assignable. The idea was that by default, we would infer the latter as the type (and ignore return values), but we'd allow you to specify the former.
However, there is a way to make it assignable if we alter the type of an Iterator so that its .next method returns
{ done: false; value: TYield } | { done: true; value: any }
Now the former type from your comment is assignable to this one. The consequence is that calling next
directly on an Iterator<string>
will give you any
, but if you can establish that done
is false, you will get string
. And all the syntactic forms that consume iterators will assume done
is false.
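A sketch of that assignability claim, using hypothetical type names (PreciseResult, LooseResult and the two Source interfaces are stand-ins, not proposed declarations). The precise shape is accepted wherever the loose one is expected, and a consumer that only looks at the done: false case still sees TYield.

```typescript
// Hypothetical names for the two result shapes discussed above.
type PreciseResult<TYield, TReturn> =
    | { done: false; value: TYield }
    | { done: true; value: TReturn };
type LooseResult<TYield> =
    | { done: false; value: TYield }
    | { done: true; value: any };

interface PreciseSource<TYield, TReturn> { next(): PreciseResult<TYield, TReturn>; }
interface LooseSource<TYield> { next(): LooseResult<TYield>; }

// Consumes only the yielded values, the way for-of or spread would.
function collect<TYield>(source: LooseSource<TYield>): TYield[] {
    const out: TYield[] = [];
    while (true) {
        const r = source.next();
        if (r.done) break;  // the returned value is ignored
        out.push(r.value);  // r.value is TYield here
    }
    return out;
}

let cursor = 0;
const precise: PreciseSource<number, string> = {
    next: () =>
        cursor < 2
            ? { done: false, value: cursor++ }
            : { done: true, value: "completed" },
};

const loose: LooseSource<number> = precise; // assignable without a cast
const collected = collect(loose);
console.log(collected);
```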
Oh, ok. I was more thinking that people who get used to TYield being the only return type from next() may be in for a nasty surprise when it changes.
Right, I would not want to change it on them unless they change their type annotation.
So people who want to use return on a generator would need to type the generator to do so and wouldn't be able to get it implicitly?
If they want the type of their return expressions to be tracked, yes. Otherwise, the type system will just ignore it.
Well, I suspect those who want to use promises aren't going to want to cast every generator they write (and neither am I for my library).
If that is the case I guess I'll be looking into other languages that support generators better.
I probably won't ever use the generator as an iterator.
You wouldn't have to cast it, you would just have to supply a return type annotation on your generator when you defined it.
Would it be better to accept a breaking change later, and change the inference behavior with respect to return expressions? Namely, have them be part of the inference, once we have boolean literal types?
That's your call, not mine, but since I seem to be repeatedly misunderstanding you, can you provide an example of a generator with that return type, and example usage of it using next?
Ok, let's suppose that we did not change the inference behavior upon the introduction of boolean literal types. Then you could only reasonably use the following generator as an iterator:
function* g() {
yield 0;
return "completed";
}
for (let x of g()) {
// here x has type number, as it should
}
var inst = g();
while (true) {
let next = inst.next();
if (next.done) {
let vReturn = next.value; // number, but should be string
break; // the generator has finished
}
else {
let vYield = next.value; // number, as expected
}
}
Now with the type annotation
function* g(): Generator<number, string, any> { // Note the new type annotation
yield 0;
return "completed";
}
for (let x of g()) {
// x still has type number
}
var inst = g();
while (true) {
let next = inst.next();
if (next.done) {
let vReturn = next.value; // string, now correct
break; // the generator has finished
}
else {
let vYield = next.value; // number, as expected
}
}
So here are the options:
- Infer IterableIterator now, and don't change that inference later (when we have boolean literals). But at that time, allow the user to supply a type annotation that makes the typing more precise. Now, everything you'd want to do is possible, but may require a type annotation (not a cast) to work correctly.
- Infer IterableIterator now, and change inference to infer the more precise type later. This means that at that point, there will be no effort on the user to make generators work in the way that you are asking for. But it would break consumers who are using your generator as an iterator, and now suddenly can't.
- Do nothing now, and wait until we have boolean literal types to implement generators, so that we can have the correct typing right off the bat. This means we have no breaks, but we delay generators until literal types are done.
- Make generators that have return expressions not iterable by the yield type. Essentially this means that instead of using TYield, we would just union TYield and TReturn so that it would not be pleasant to iterate over a generator that has a return expression.
- Do what I suggested in my advanced additions to the proposal. This means that we do some hackery in the type system to make Generator<TYield, TReturn> assignable to Iterable<TYield> (they would be pleasant to iterate over), but the Generator type would be somewhat of an oddball in the type system, and the compiler would pay special attention to the type arguments of the Generator type. This is a hack in the type system, but it essentially produces all the semantics that we want up front without requiring boolean literal types. We could later replace this with boolean literal types if/when they come online.
Option 5 is something I'm certainly willing to try out if you are interested in seeing what this would look like.
How will option 5 help the async use case? Even if there is a Generator<TYield, TReturn>
type, the TReturn
type won't appear in any of its members (until boolean literal types come along). That is, next()
will still return { done: boolean; value: TYield; }
for the time being.
Has option 3 been given serious consideration? It seems the only way to expose the TReturn
type to consumers of the Generator<TYield, TReturn>
interface. Does anyone on the team know how hard it would be to get boolean literal types into the compiler, so generators could be implemented fully with no hackery and no picking winners (ie out of iteration and async)?
Sorry if I wasn't clear on option 5. I meant that next would actually return { done: boolean; value: TYield | TReturn }
. So calling next on something of type Generator
would give you the right thing, but we'd still have separate access to the two types if you are using the nominal type Generator.
Option 3 has not been seriously considered yet, but maybe worth more discussion. It is also possible to go with option 5 temporarily until boolean literal types come along, at which point we'd be able to remove the hack introduced by option 5.
One more option: We could introduce a minimal form of boolean literal types early, without exposing many of the features of literal types, but use them as a way to track done-ness of the generator. This would allow us to keep the types separate. But I hesitate to suggest this because I'm not sure what the implications of adding the full literal type feature will be, given certain assumptions we might make about this initial implementation.
From the proposal above:
The element type is the common supertype of all the yield operands and the element types of all the yield * operands.
It is an error if there is no common supertype.
Isn't this also going to break the async use case? I gave a working example above where two yield expressions have types Promise<string[]> and Array<Promise<fs.Stats>>.
The current proposal would make this an error if I'm not mistaken. But it's perfectly valid and normal in the async use case for yield expressions to have no common supertype.
What if TYield
was the union type of the yield expression types? Wouldn't that work out of the box for both use cases (iteration and async)? I suppose in the iteration case, it just wouldn't catch some programmer errors (ie, if they yield two different types in the same generator).
Yes, you are correct. It would break. I can change that so it uses the union type. This was more just for parity with the return expressions in a normal function, but as you point out, there is a meaningful difference here.
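A minimal sketch of what the union rule permits: a generator whose yield operands share no common supertype, consumed as a sequence of the union type. The name mixedYields is hypothetical, and the explicit annotation on the consumer is only there to state the expected element type.

```typescript
// `mixedYields` is a hypothetical generator that yields two unrelated types.
function* mixedYields() {
    yield 1;
    yield "two";
}

// Each consumed element is typed as a union of the yield operand types.
const mixedValues: Array<number | string> = Array.from(mixedYields());
const kinds = mixedValues.map(v => typeof v);
console.log(mixedValues, kinds);
```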
@yortus for 5, next() would return { done: boolean; value: TYield|TReturn; } initially and would be updated when boolean literals become a thing, but for-of would only return TYield.
3,4 or 5 sound good. Honestly. As much as I want to use generators now, I'd rather wait for proper support than to rush them and seriously gimp them.
4 seems like something that would be good to do some research for, as it seems like it could be useful even if 3 or 5 are chosen.
In fact, 3,4 and 5 are not mutually exclusive and should all be carefully considered (since if you wait for 3, 5 would still be good for having a more correct type for for-of).
Yes, as I mentioned, it is possible to do 5 now, and then when boolean literal types come online, remove the hack and use those.
Oh derp!
For some reason my brain decided that when we had the boolean literal support for-of would get TYield|TReturn
. Sorry about that.
Maybe even eventually support 4 behind a flag for those people who like those things (like noImplicitAny).
Anyway, 3 and/or 5 sound the best with 4 being really good for some use-cases.
One more note about this option 5, just in case it was not clear from before. You would only have individual access to TYield
and TReturn
if you are using the named type Generator
. If you assign it to something with a different nominal type, you will just have TYield | TReturn
.
Sounds like option 5 followed asap by option 3 is the closest idea so far to a practical implementation that isn't too biased against any use case.
So taking this snippet of an async example:
function* genfunc() { // (1)
yield Promise.delay(1000);
var result = yield Promise.resolve(42); // (2)
return result;
}
co(genfunc)
.then(result => { // (3)
console.log(result);
});
How would we go about typing this? From my understanding of the proposal:
- result at (2) would be inferred as any because the proposal states that 'A yield expression has type any.'
- genfunc would therefore have its type inferred as IterableIterator<any>, since TYield|TReturn = Promise<number>|any = any, which is not desirable.
Suppose we explicitly type the generator function like so: genfunc: () => Generator<Promise<number>, number>
and suppose co
looks like this (cut down):
function co<TReturn>(genfunc: () => Generator<Promise<any>, TReturn>) {
return new Promise(resolve => { // (4)
var genobj = genfunc(); // (5)
function resume(value) {
var next = genobj.next(value);
if (next.done) {
resolve(next.value); // (6)
} else {
next.value.then(resume); // (7)
}
}
resume();
});
}
Then inside co under option 5:
- genobj at (5) is inferred as Generator<Promise<any>, number>
- next.value at (6) and (7) is inferred as Promise<any> | number
- (7) won't compile and must be changed to (<Promise<any>> next.value).then(resume);
- (6) will cause resolve at (4) to be inferred as (result: Promise<any> | number) => void, which will cause the co(...) expression to be inferred as Promise<Promise<any> | number>, which is not desired.
- if (6) is changed to resolve(<TReturn> next.value);, then resolve at (4) is inferred as (result: number) => void, and the co(...) expression is inferred as Promise<number>, which is correct.
Alternatively inside co under option 3:
- genobj at (5) is inferred as Generator<Promise<any>, number>
- next.value at (6) is inferred as number, assuming if (next.done) {...} is a type guard for the next.done boolean literal that narrows the result of next() to be {done: true; value: TReturn; }.
- next.value at (7) is inferred as Promise<any>, assuming } else { is a type guard for the next.done boolean literal that narrows the result of next() to be {done: false; value: TYield; }.
- (6) will cause resolve at (4) to be inferred as (result: number) => void, which will cause the co(...) expression to be inferred as Promise<number>, which is correct.
In summary for the async use case:
- under either option 3 or 5, the generator function must be explicitly typed.
- under option 3, everything else just works.
- under option 5, inside the async runner, casts are needed on the result of calling genobj.next() in both the done: true and the done: false case, then everything works.
Have I understood the proposal and the options correctly? And is there any way we could avoid having to explicitly type the generator function?
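For comparison, here is a sketch of how the option 3 analysis plays out when the result of next() is a union discriminated on done. It assumes the Generator<T, TReturn, TNext> and IteratorResult<T, TReturn> lib types of later TypeScript versions; runAsync is a hypothetical cut-down runner, not co itself, and it omits handling of errors thrown inside the generator.

```typescript
// Assumption: the built-in lib types Generator<T, TReturn, TNext> and
// IteratorResult<T, TReturn> are available, and IteratorResult narrows on `done`.
// `runAsync` is a hypothetical cut-down runner, not co itself.
function runAsync<TReturn>(
    genfunc: () => Generator<Promise<unknown>, TReturn, unknown>
): Promise<TReturn> {
    return new Promise<TReturn>((resolve, reject) => {
        const genobj = genfunc();
        function resume(value?: unknown): void {
            const next = genobj.next(value);
            if (next.done) {
                resolve(next.value);             // narrowed to TReturn, no cast
            } else {
                next.value.then(resume, reject); // narrowed to Promise<unknown>, no cast
            }
        }
        resume();
    });
}

const answer: Promise<number> = runAsync(function* () {
    yield Promise.resolve("warm-up");
    // The yield expression itself is still untyped (unknown), so a cast is needed here.
    const result = (yield Promise.resolve(42)) as number;
    return result;
});
answer.then(value => console.log(value));
```

The runner needs no casts, but the generator body still does, which is the remaining gap for the async use case.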
Yes, you are correct. It would break. I can change that so it uses the union type. This was more just for parity with the return expressions in a normal function, but as you point out, there is a meaningful difference here.
@JsonFreeman Offtopic but I don't really get why return expressions need a common supertype either, rather than being a union type. Is that just backward-compatibility baggage or am I missing something?
@yortus I have so far understood 3 and 5 to behave the way you described.
Actually, re-reading @JsonFreeman's last comment again, I think he was planning on hacking in the type-guard based on boolean literal until 3 is available. Meaning 5 behaves identically to 3.
I've been playing around with some alterations to the proposal and would like to submit some ideas for discussion.
EDIT: Removed TIteratorResult
and used TYield
and TResult
in line with current proposal. Sorry for any confusion!
Alternative Generator and IteratorResult definitions
Suppose the Generator
interface is defined like this:
interface Generator<TYield, TReturn, TYieldMapping extends (expr?: any) => any> {
next(value?: any): IteratorResult<TYield, TReturn>;
throw(error: any): IteratorResult<TYield, TReturn>;
return(result?: any): IteratorResult<TYield, TReturn>;
[Symbol.iterator](): Generator<TYield, TReturn, TYieldMapping>;
[Symbol.toStringTag]: string;
}
TYield
and TReturn
are tracked in the same way as the current proposal. IteratorResult
may be defined initially as:
interface IteratorResult<TYield, TReturn> {
done: boolean;
value?: TYield | TReturn;
}
When option 3 is possible, and if generic type aliases also make it into the compiler (greenlighted according to #2936), this could be tightened up to:
// NB: uses generic type alias and boolean singleton types (hopefully both coming to tsc vNext)
type IteratorResult<TYield, TReturn>
= { done: false; value?: TYield; }
| { done: true; value?: TReturn; };
As for TYieldMapping
, this tracks the type mapping from yield
operands to yield
expressions in the generator, which generally makes sense for both the iteration and async use cases. This allows yield expressions to have their types accurately inferred, rather than always being any
.
Inferring the type of a generator
When no type annotation is given for a generator function, the compiler infers:
- TYield is the union type of all the yield operand types present in the generator function
- TResult is the best common type (or union type?) of all the return expression types present in the generator function.
- TYieldMapping = (expr?: any) => any
Suppose we have this example:
function* genfunc1() {
yield Promise.delay(1000);
var result = yield Promise.resolve(42);
return result;
}
Then the compiler infers:
TYield = Promise<number>
TResult = any
TYieldMapping = (expr?: any) => any
Two more examples:
function* genfunc2() {
yield 1;
yield 2;
yield 3;
}
function* genfunc3() {
yield Promise.delay(1000);
var result: number = yield Promise.resolve(42); // NB: annotated result
return result;
}
For genfunc2
the compiler infers:
TYield = number
TResult = any
TYieldMapping = (expr?: any) => any
For genfunc3
the compiler infers:
TYield = Promise<number>
TResult = number
TYieldMapping = (expr?: any) => any
These all match the inference ability of the current proposal and have the same characteristics under option 3 and 5 as discussed in preceding comments.
Contextually typing a generator
When a type annotation is provided for a generator function, the compiler uses the following rules:
- the operand of a yield expression (if present) is contextually typed by TYield
- the operand of a return expression (if present) is contextually typed by TReturn
- the type of each yield expression is inferred using TYieldMapping.
For example, suppose genfunc1
above is contextually typed like this:
var genfunc1: () => Generator<Promise<any>, any, <U>(expr: Promise<U>) => U>;
genfunc1 = function* () {
yield Promise.delay(1000);
var result = yield Promise.resolve(42);
return result; // (3)
}
Then the compiler infers result has type number because TYieldMapping maps Promise<number> to number. However this inference is lost at (3) because TResult is contextually typed as any.
More realistically, an async runner like co
would be defined something like:
interface CoYieldable {
<T>(expr: Promise<T>): T;
<T>(expr: Array<Promise<T>>): Array<T>;
// ... other yieldables ...
}
function co<TReturn>(genfunc: () => Generator<any, TReturn, CoYieldable>) {
...
}
Then the async runner can be used like this:
var promise = co(function* () { // promise inferred as Promise<number>
yield Promise.delay(1000);
var result = yield Promise.resolve(42); // result inferred as number
return result;
});
Due to the co
function's contextual typing of the generator function, all the yield
and return
expressions have their types accurately inferred with no annotations needed. This seems to be an improvement on the current proposal.
Iterables
I've focused more on the async use case above, but I believe the iterable use case is just as well catered for with this altered proposal as it is with the current proposal. That is, providing assignability rules between Generator
and IterableIterator
are worked out, iteration should be straightforward.
@yortus your post refers to TYieldOp
in one of your examples, but you don't elaborate on what this is, is it meant to read TYieldMapping
?
@Griffork I am not planning to add the type guard. I meant that I was thinking of adding boolean literal types internally in the compiler to help track the yield and return types separately. But thinking about it more, there would be too little benefit to doing that. I'd sooner implement option 5 by just holding on to the two types as type arguments to the nominal Generator type.
@yortus Regarding common supertype for return expressions, I don't actually know the reason for that. I know that we've discussed changing it, and if the function is contextually typed, it is actually allowed to be the union type. Maybe @RyanCavanaugh would know more, but I would ask that this topic be on a separate thread.
Your analysis in #2873 (comment) is pretty much correct. One question is, in the absence of a type annotation, would we infer IterableIterator<any>
, or Generator<Promise<number>, any>
? We could say that if you have at least one return expression, we switch to the more powerful Generator
type.
But your main points in that post are correct. The type guards on the done property would only work if we have boolean literal types. And to type the yield expressions, the user needs to specify the type explicitly, like Generator<Promise<number>, number, number>
.
Your following post is interesting. You want the yield expression type to depend on the yield operand type. Do you think it is necessary to infer a potentially different type for every yield expression, instead of specifying one type for all the yield expressions in the generator body? I suppose one thing I find awkward here is that you are trying to express a function on types as a function type. A function type in our type system has thus far represented only types that are inhabited by functions. They are never applied by the type system the way that generic types are. Your proposal also entails that we would perform type argument inference and overload resolution with all the call signatures in the type argument you pass as a mapper.
Then the compiler infers result has type number because TYieldMapping maps Promise<number> to number
This concerns me (still crunching my way through your post 😛), since a yield statement may be resumed with any value, e.g.
function *generator_example() {
var test_var = yield "Hello!"; //once called with next(3), test_var becomes a number.
return test_var;
}
var giter = generator_example();
var hello = giter.next();
console.log("Hello", giter.next(3));
Here's the other thing about the YieldMapper. If you are inferring all the yield expressions to be different types, then when the consumer calls next, you would presumably want to check each call to next with the appropriate type given the yield expression the generator is currently suspended on. But of course, you would not statically know which yield expression a particular call to next corresponds to. And without that checking, I would argue there is really not much value to inferring the yield expression.
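The checking problem can be seen in a short sketch. It assumes the three-parameter Generator<T, TReturn, TNext> lib type of later TypeScript versions, and interview is a hypothetical generator whose two yield expressions expect different types. A single next-value type has to cover both, so the compiler cannot tell which call to next feeds which yield.

```typescript
// Assumption: the built-in lib type Generator<T, TReturn, TNext> is available.
// `interview` is a hypothetical generator whose two yields expect different types.
function* interview(): Generator<string, string, string | number> {
    const who = (yield "name?") as string; // this yield expects a string
    const age = (yield "age?") as number;  // this yield expects a number
    return `${who} is ${age}`;
}

const session = interview();
const q1 = session.next();         // runs to the first yield
const q2 = session.next("Ada");    // answers "name?", runs to the second yield
const summary = session.next(36);  // answers "age?", runs to completion

// This type-checks equally well, although the values arrive at the wrong yields.
const swapped = interview();
swapped.next();
swapped.next(36);
const confused = swapped.next("Ada");

console.log(q1.value, q2.value, summary.value, confused.value);
```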
Correct me if I'm wrong, but I think typing each yield differently (within the function) and having that affect next()'s signature is not going to work (although it would be interesting, it'd be impossible to resolve if the generator or next are aliased). Also since you could call the same function twice and pass in two different types to it.
I think yield should be type (unless the programmer specifies somehow in the generator's signature otherwise) and would have to be cast for typing.
Here's the other thing about the YieldMapper. [....] you would not statically know which yield expression a particular call to next corresponds to. And without that checking, I would argue there is really not much value to inferring the yield expression.
Right, the YieldMapper is not useful to the caller of next(). TYield and TReturn are the useful constraints there, same as with the current proposal. The YieldMapper is useful for typing yield expressions inside the generator function body, which is a godsend for the async use case.
Do you think it is necessary to infer a potentially different type for every yield expression, instead of specifying one type for all the yield expressions in the generator body?
This is absolutely what is wanted in the async use case. See my examples earlier in the thread. In the async use case, each yield expression's type is not related to the others, but it is related to the type of that yield's operand. And it doesn't hurt the iterable use case at all, which is just a sort of degenerate case with only one expression type (any).
I suppose one thing I find awkward here is that you are trying to express a function on types as a function type. A function type in our type system has thus far represented only types that are inhabited by functions. They are never applied by the type system the way that generic types are.
Can't the compiler's function overload resolution already do this? I had it in mind that it would use that exact same functionality already in the compiler for function overload resolution.
Your proposal also entails that we would perform type argument inference and overload resolution with all the call signatures in the type argument you pass as a mapper.
Right. Is that a problem? I see it as covered by 4.12.1 Overload Resolution in the spec.
This concerns me (still crunching my way through your post 😛), since a yield statement may be resumed with any value
If no YieldMapper is given, this will still be the case. If a YieldMapper is provided for contextual typing, then you are effectively instructing the compiler to enforce a constraint on what yield can receive and return, which is useful in the async use case, and is not needed in the iterable use case.
I think typing each yield differently (within the function) and having that affect next()'s signature is not going to work
It wouldn't have any effect on next()
's signature.
I think yield should be type
If I understand you, it is typed - see the Generator
interface.
Here's the other thing about the YieldMapper. If you are inferring all the yield expressions to be different types, then when the consumer calls next, you would presumably want to check each call to next with the appropriate type given the yield expression the generator is currently suspended on.
Again, the YieldMapper is not useful for the consumer of the Generator
interface. That side of things is catered for by TYield
and TReturn
in the same way as the current proposal.
The YieldMapper, which defaults to (expr?: any) => any
, effectively brings us back to the current proposal if it is not provided.
But if it is provided for contextual typing, then each yield expression can have its type accurately inferred using the YieldMapper. This is designed to fill a hole in the current proposal when it comes to the async use case. It means everything in the generator is correctly typed with no annotations needed. Without it, every single yield expression whose value is subsequently used will need to be manually annotated, or we just have any-spaghetti.
@yortus it's one type to rule them all, not one type per yield. All yields should be one type (be that a programmer specified union or not).
More importantly; what is to the right of a yield should never influence the type of what comes to the left of a yield. That's assuming a relationship that doesn't exist and is easily proved wrong.
When you say "if no YieldMapper is given" what constitutes given? Do users have to supply extra explicit typing?
All yields should be one type (be that a programmer specified union or not).
Right, they are, that's the TYield
type.
what is to the right of a yield should never influence the type of what comes to the left of a yield. That's assuming a relationship that doesn't exist and is easily proved wrong.
The compiler will assume no relationship unless you give it one via TYieldMapper.
But the relationship in fact does exist in important and common scenarios, such as when using co
and the like for async control flow. For example, co
will always map Promise<T>
to T
, Promise<T>[]
to T[]
, and several other things that can be statically described. TYieldMapper
gives us the useful (and optional) ability to tell the compiler how to enforce/infer these rules automatically.
It's analogous to giving the compiler a bunch of function overload declarations so it can statically check if a function call is valid and what its return type will be when called with various combinations of parameter types and arities.
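A sketch of that analogy at the value level: an ordinary overloaded function whose result type is chosen from its argument type by overload resolution. The helper settle is hypothetical and not part of co or of the proposal. Because a plain function cannot suspend, the mapped type shows up wrapped in a promise rather than as the type of a yield expression.

```typescript
// Hypothetical helper: `settle` is not part of co or of the proposal.
// Its overloads describe the same mapping a YieldMapper would:
// Promise<T> maps to T, and Promise<T>[] maps to T[] (delivered here via a promise).
function settle<T>(expr: Promise<T>): Promise<T>;
function settle<T>(expr: Array<Promise<T>>): Promise<T[]>;
function settle(expr: Promise<unknown> | Array<Promise<unknown>>): Promise<unknown> {
    return Array.isArray(expr) ? Promise.all(expr) : expr;
}

// Overload resolution picks the result type from the argument type.
const single: Promise<number> = settle(Promise.resolve(42));
const many: Promise<string[]> = settle([Promise.resolve("a"), Promise.resolve("b")]);

single.then(value => console.log(value));
many.then(values => console.log(values));
```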
When you say "if no YieldMapper is given" what constitutes given? Do users have to supply extra explicit typing?
"if no YieldMapper is given" just means that the generator function has no type annotation. Giving a YieldMapper just means giving the generator function a type annotation.
Note that annotating the generator function is usually necessary for the async use case under the current proposal anyway. But under this altered proposal that annotation can be provided by the async library (like co
), so the library user won't need to annotate anything.
Gave some thought about this from a different angle, instead of trying to 'type' the whole thing, it could be expressed as:
function* genfunc3() {
yield Promise.delay(1000);
var result: number = yield Promise.resolve(42); // NB: annotated result
return result;
}
// compiler infers a type
type genfunc3IteratorResult = Promise<number> where <end(T): number>;
function* genfunc4() {
yield Promise.delay(1000);
}
// compiler infers a type
type genfunc4IteratorResult = Promise<number> where <end(T): undefined>;
// alternative syntax though unclear that 'return' means part of an iteration
type genfunc4IteratorResult = Promise<number> where <return(T): undefined>;
The thinking is that:
type genfunc3IteratorResult = Promise<number> where <end(T): number>;
is more expressive than:
type genfunc3IteratorResult = Promise<number>|number;
Going back to the first example:
function *g() {
yield 0;
return "";
}
type NumIteratorEndString = Iterator<number> where <end(T): string>;
var g2:NumIteratorEndString = g();
var x3 = g2.next(); // type number, correct
var x4 = g2.next(); // type number, could know it's string since Iterator is not infinite.
Unclear how compiler could track iteration 'loops', but seems like this is doable:
for(let a of g2) {
if(a.done) {
// a is of type IteratorResult<string>
} else {
// a is of type IteratorResult<number>
}
}
Another syntax which looks interesting or similar to this:
type NumIteratorEndString = Iterator<number> where <if(T.done): string>;
It seems I haven't made a clear case for why a YieldMapper is necessary or how it would work. I'll try another way.
Rationale
Going way back to the rationale of this proposal, it surely involves:
- accurately modeling the way ES6 generators work
- providing as much type safety as possible.
- providing as much type inference as possible.
Two major use cases of generators have been identified (in this thread and elsewhere - see for example this post):
- iterating over a set of values
- managing asynchronous control flow
Here is an example of each use case.
Example 1: Generator used for iteration
var evenNumbers = function* (min: number, max: number) {
var i = min;
while (i < max) {
if (i % 2 === 0) {
yield i;
}
++i;
}
}
Example 2: Generator used for asynchronous control flow
var co = require('co');
var Promise = require('bluebird');
var fs = Promise.promisifyAll(require('fs'));
var path = require('path');
var getDirectorySizeInBytes = co.wrap(function* (dirpath) {
var filenames = yield fs.readdirAsync(dirpath);
var filepaths = filenames.map(filename => path.join(dirpath, filename));
var stats = yield filepaths.map(filepath => fs.statAsync(filepath));
var totalSize = stats.reduce((sum, stat) => sum += stat.size, 0);
return totalSize;
});
getDirectorySizeInBytes('.')
.then(bytes => console.log(`Directory size in bytes: ${bytes}`));
// sample output:
// Directory size in bytes: 22504
Analysis of next() and yield in the two examples
Before looking at solutions, let's analyse how the generators actually behave in the two representative examples. In particular, the next method of the generator object, and the yield operator in the generator body both exhibit complex behaviour. Here is a breakdown that attempts to find patterns in the various cases.
Behaviour of yield in generator used for iteration
- yield operands are important outside the generator body, they all have the same type (TYield), and that type is used by for...of, genobj.next(), etc.
- TYield = number in the example
- the values of yield expressions are generally not important
- the values of return expressions (TReturn) are generally not important
Behaviour of yield in generator used for async control flow
- yield operands (TYield) are important inside the generator body, but not outside
- yield operands are generally of unrelated types within a single generator
- TYield = Promise<string[]>|Array<Promise<fs.Stats>> in the example
- the values of yield expressions are important and are subsequently used in the generator body (filenames and stats in the example)
- the type of a yield expression is related to the type of that yield's operand (in the example, yield always maps Promise<T> to T, and Promise<T>[] to T[])
- the values of return expressions are important outside the generator body (the example generator returns a number)
Behaviour of next() in generator used for iteration
- usually called without an argument, ie var next = genobj.next()
- before the generator returns, next is { done: boolean; value: TYield; }
- once the generator returns, next is { done: true; }, and next.value is not used
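To make the iteration case concrete, here is a minimal sketch that consumes a generator like the one in Example 1, both with for...of and with explicit next() calls. The variable names are illustrative only, and the inferred types shown in comments are those of current TypeScript, not of the proposal under discussion.

```typescript
// Same shape as Example 1: yields the even numbers in [min, max).
function* evenNumbers(min: number, max: number) {
  for (let i = min; i < max; i++) {
    if (i % 2 === 0) {
      yield i;
    }
  }
}

// for...of only ever sees yielded values.
const collected: number[] = [];
for (const n of evenNumbers(1, 9)) {
  collected.push(n);
}

// Explicit next() calls, with no argument passed in.
const genobj = evenNumbers(0, 3);
const first = genobj.next();  // { value: 0, done: false }
const second = genobj.next(); // { value: 2, done: false }
const last = genobj.next();   // { value: undefined, done: true }
```

Nothing here uses the value of a yield expression or the final next().value, which is why a simple TYield is enough for this use case.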
Behaviour of next() in generator used for async control flow
- usually called with an argument, ie var next = genobj.next(value), except the first time
- before the generator function returns, next is { done: false; value: TYield; }
- once the generator returns, next is { done: true; value: TReturn; }
- next.value is important in all cases
- no relationship between the type of next's argument and that of its result
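The bi-directional flow described above can be sketched by driving a generator by hand. This example is an illustration only: it uses the three-parameter Generator<T, TReturn, TNext> type that ships with current TypeScript, which postdates this discussion, and the function and variable names are made up for the example.

```typescript
// Yields strings, receives numbers through next(value), returns a number.
function* sumOfReplies(): Generator<string, number, number> {
  const a = yield "first question";  // a is whatever the caller passes to next()
  const b = yield "second question";
  return a + b;
}

const gen = sumOfReplies();
const q1 = gen.next();    // first call takes no argument: { value: "first question", done: false }
const q2 = gen.next(40);  // 40 becomes the value of the first yield expression
const end = gen.next(2);  // 2 becomes the value of the second yield; generator returns 42
```

Note that the value passed to each next() call answers the yield from the previous call, which is why the two types cannot be related within a single call signature.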
Accounting for all these behaviours in a single model
Based on the patterns observed above, here is a possible solution for modeling generators of all forms.
A generator's TYield type is the union type of all its yield operand types
- this models both yield and next() properly for both use cases
A generator's TReturn type is the best common type (or union type?) of all its return expression types
- this is harmlessly ignored in the iteration case.
- this models both yield and next() properly for the async control flow case.
The next method returns {done: false; value?: TYield} | {done: true; value?: TReturn; }
- this models both cases well
- the iteration case can ignore the done: true part
- without boolean literal types, this can be approximated as {done: boolean; value?: TYield|TReturn} with some minor inconvenience
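As a rough sketch of why the two-armed result type is convenient, the following shows a consumer narrowing on done. It relies on boolean literal types, which the thread treats as a future feature, and it makes value required rather than optional to keep the example short. StepResult and summarize are hypothetical names used only here.

```typescript
// Hypothetical result type: the done flag discriminates yield from return.
type StepResult<TYield, TReturn> =
  | { done: false; value: TYield }
  | { done: true; value: TReturn };

function summarize(step: StepResult<number, string>): string {
  if (step.done) {
    // Narrowed to the return arm: value is a string here.
    return "returned " + step.value.toUpperCase();
  }
  // Narrowed to the yield arm: value is a number here.
  return "yielded " + step.value.toFixed(1);
}
```

With the approximation {done: boolean; value?: TYield|TReturn}, the same function would need a cast or a typeof check in each branch, which is the "minor inconvenience" mentioned above.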
The behaviour of yield in all cases can be modeled as a polymorphic function with arity <= 1.
- In the iteration case, the yield expression type is unimportant, so the polymorphic model of the yield operator can simply be (expr?: any) => any. This models iteration well.
- But (expr?: any) => any models yield very poorly in the async case. It would require the generator author to annotate every yield expression, otherwise most of the generator body will be rendered untyped due to all the yield expressions being inferred as any.
- In the async case, the yield expression type is typically a function of the yield operand's type. This can be modeled with a polymorphic function, using TypeScript's function overloading feature.
- For example, co has rules mapping yield operands to yield expressions. Promise<T> always maps to T, Promise<T>[] always maps to T[], and several other rules are sufficient to describe the behaviour of yield in this case. It might look like this:
interface Yieldable {
<T>(expr: Promise<T>): T;
<T>(expr: Promise<T>[]): T[];
// ... several more ...
}
- the Yieldable interface is not part of TypeScript, it is supplied by user code to describe the behaviour of yield for the case at hand. Typically once per async library.
A possible solution
A generator interface that captures all this behaviour might look like this:
interface Generator<TYield, TReturn, TYieldMapping extends (expr?: any) => any> {
next(value?: any): IteratorResult<TYield, TReturn>;
throw(error: any): IteratorResult<TYield, TReturn>;
return(result?: any): IteratorResult<TYield, TReturn>;
[Symbol.iterator](): Generator<TYield, TReturn, TYieldMapping>;
[Symbol.toStringTag]: string;
}
TYield and TReturn are used to accurately model the consumption of a generator (ie, doing for...of, calling next(), etc.).
TYieldMapper (named TYieldMapping in the interface above) is used to accurately model the behaviour of yield within a generator body. Since it appears nowhere in the Generator interface, it must either be provided as a type annotation to the generator function, or else it will be inferred as (expr?: any) => any.
- async libraries like co can neatly provide this TYieldMapper annotation, like the Yieldable example above.
- generators used for iteration don't need any annotation because (expr?: any) => any models their yield behaviour just fine.
Final illustration
I hope I have explained the essentiality of TYieldMapper for accurately modeling generators. As a final example, consider example 2 above without a YieldMapper. TypeScript is really not helping at all with the types:
import co = require('co');
import Promise = require('bluebird');
var fs: FSPromised = Promise.promisifyAll(require('fs'));
import path = require('path');
var getDirectorySizeInBytes = co.wrap(function* (dirpath) {
var filenames = yield fs.readdirAsync(dirpath); // filenames is any
var filepaths = filenames.map(filename => path.join(dirpath, filename)); // everything here is any
var stats = yield filepaths.map(filepath => fs.statAsync(filepath)); // everything here is any
var totalSize = stats.reduce((sum, stat) => sum += stat.size, 0); // everything here is any
return totalSize; // totalSize is any
});
getDirectorySizeInBytes('.')
.then(bytes => console.log(`Directory size in bytes: ${bytes}`)); // bytes is any
// sample output:
// Directory size in bytes: 22504
Now consider the same example with a YieldMapper like the Yieldable interface above. It's completely accurately typed, with no annotations:
import co = require('co');
import Promise = require('bluebird');
var fs: FSPromised = Promise.promisifyAll(require('fs'));
import path = require('path');
var getDirectorySizeInBytes = co.wrap(function* (dirpath) {
var filenames = yield fs.readdirAsync(dirpath); // filenames inferred as string[]
var filepaths = filenames.map(filename => path.join(dirpath, filename)); // filepaths inferred as string[]
var stats = yield filepaths.map(filepath => fs.statAsync(filepath)); // stats inferred as fs.Stats[]
var totalSize = stats.reduce((sum, stat) => sum += stat.size, 0); // totalSize inferred as number
return totalSize;
});
getDirectorySizeInBytes('.')
.then(bytes => console.log(`Directory size in bytes: ${bytes}`)); // bytes inferred as number
// sample output:
// Directory size in bytes: 22504
I suppose one thing I find awkward here is that you are trying to express a function on types as a function type. A function type in our type system has thus far represented only types that are inhabited by functions. They are never applied by the type system the way that generic types are. Your proposal also entails that we would perform type argument inference and overload resolution with all the call signatures in the type argument you pass as a mapper.
@JsonFreeman I've tried to code the behaviour of TYieldMapper using roughly equivalent function overloading. The compiler is not quite as smart as I thought. Is this the problem you were referring to in the quote above? See the comments on the last two lines. The compiler infers all the types in the first case, but not in the second (generic) case, even though all the same information is statically available for it to do so.
interface Overloads {
<T>(expr: Promise<T>): T;
<T>(expr: Array<Promise<T>>): Array<T>;
(expr: number): boolean;
(expr: {}): {};
}
function f1(poly: Overloads) {
var a = poly(1);
var b = poly(Promise.resolve({ foo: 'bar' }));
var c = poly([1, 2, 3]);
var d = poly([Promise.resolve('foo'), Promise.resolve('bar')]);
return { a, b, c, d };
}
function f2<TPoly extends (expr?: any) => any>(poly: TPoly) {
var a = poly(1);
var b = poly(Promise.resolve({ foo: 'bar' }));
var c = poly([1, 2, 3]);
var d = poly([Promise.resolve('foo'), Promise.resolve('bar')]);
return { a, b, c, d };
}
declare var poly: Overloads;
var x = f1(poly); // inferred: { a: boolean; b: {foo:string}, c: any, d: string[] }
var y = f2(poly); // inferred: { a: any; b: any; c: any; d: any; }
Nevertheless, I don't see a reason why the compiler couldn't internally use its function overloading logic and a TYieldMapper type to accurately infer each yield expression in a generator.
It theoretically could use the overloading logic (with some refactoring to not assume that it's resolving a call expression). But it's just odd that a function type is being used to model a function on types.
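For comparison, a "function on types" can be written directly in current TypeScript with conditional types, a feature that did not exist when this thread was written. The sketch below is not part of either proposal; Unyield is a hypothetical name, and it covers only the two co rules quoted earlier.

```typescript
// Hypothetical type-level mapping from a yield operand type to a yield expression type.
type Unyield<Y> =
  Y extends Promise<infer T> ? T :
  Y extends Array<Promise<infer T>> ? T[] :
  Y;

// The annotations below only compile if the mapping produces the expected types.
const names: Unyield<Promise<string[]>> = ["a.txt", "b.txt"]; // string[]
const sizes: Unyield<Array<Promise<number>>> = [10, 20];      // number[]
const plain: Unyield<number> = 7;                             // number (unchanged)
```

This expresses the same mapping as the overload list, but as a type applied by the type system rather than a function type whose call signatures would need to be resolved.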
It sounds like you do not care about typing the consumption point (the calls to next) for your async scenarios. So the internals of co.wrap would not get any type checking under your proposal. Why is it okay to discard type checking inside co.wrap?
Regarding your question about the inference in the most recent example: First, I would ask that this be continued on a different thread. But the issue is that the return type of f2 is computed only once based on the definition of f2. And when it is computed, the only thing known about TPoly is its generic constraint. Then it instantiates this computed return type when it processes calls to f2, but it does not recompute it. In other words, the language does not recompute the return type from the body of f2 given information about how f2 is called. This is how it works for generic functions, but for generic types, there is a lot more connection between what happens inside the type and how the type is instantiated from the outside.
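A small sketch of the point about generic functions: the return type is derived once from the constraint, so a caller's more specific argument does not refine it. Exposing the result as its own type parameter is one common way to get per-call inference. The function names here are made up for the illustration.

```typescript
// Return type is computed from the constraint alone: calling F yields unknown.
function viaConstraint<F extends (x: number) => unknown>(f: F) {
  return f(1);
}

// Here the result type R is a type parameter, so it is inferred at each call site.
function viaResultParam<R>(f: (x: number) => R): R {
  return f(1);
}

const loose = viaConstraint((x: number) => x + 1); // typed as unknown
const precise = viaResultParam((x) => x + 1);      // typed as number
```

Both calls produce the same runtime value; only the static type differs.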
So the internals of co.wrap would not get any type checking under your proposal.
@JsonFreeman Not so; next() is typed exactly the same way that you are proposing. It returns { done: false; value?: TYield; } | { done: true; value?: TReturn; } (or simply { done: boolean; value?: TYield | TReturn; } as per option 5). Together with type guards this provides complete and accurate type checking inside co.wrap. I gave a cut-down implementation of co above that illustrates this.
On the contrary, what I'm trying to show is that the current proposal doesn't care about typing within the generator body. Look how the any type bleeds through everything in the final illustration with getDirectorySizeInBytes above. Doesn't that bother you?
Your proposal does not give checking of the next parameter inside co.wrap. So there is nothing enforcing that the call to next is passing something of the right type. I worry that with the YieldMapper, the author of the generator will get the feeling that all the types are being meticulously related (namely the parameter type of next, and the return type of next), when in fact on the call side (the inside of co.wrap), this relationship is not enforced.
I suppose it's that I find the asymmetry of type checking strength (inside vs outside the generator) to be unsettling.
I suppose it's that I find the asymmetry of type checking strength (inside vs outside the generator) to be unsettling.
It actually reduces the asymmetry already there in the current proposal by adding accurate modeling of the yield operator. Everything else is exactly the same as the current proposal.
The current proposal means that you get all the type checking on the right hand side of the yield, but none on the left hand side. Your suggestion means that you get all the checking on the right hand side, but half of the checking on the left. Namely, you get checking of the left hand side inside the generator, but not outside. Also, your presumed relationship between the right hand side and the left hand side is only correct insofar as the caller of next honors it. And there is no checking to make sure that the caller of next honors that relationship.
Your proposal does not give checking of the next parameter inside co.wrap. So there is nothing enforcing that the call to next is passing something of the right type.
OK, well how could that be done? We know all the possible types that could be yielded (that's the TYield union). We can statically describe how these are mapped to yield expression types using something like TYieldMapping. What's missing, as you say, is a constraint on what gets passed back to next().
Take a simple rule from co. If a Promise<T> is yielded to it, it will await the resolved value, T, and pass that back through next(). But T could be absolutely any type, so it can't be statically constrained for anything more specific than any. ie no type checker could ever statically catch a type error here.
Also, the type passed into next() does have a relationship to the type returned from the previous call to next() inside the generator consumer. It's the same relationship described by TYieldMapping. But because the relationship is spread across two separate calls to next(), it's a runtime relationship that can't possibly be statically checked.
So you are right, I'm not offering any way to check what gets passed in to next(). But that's not out of carelessness. It's just not statically checkable.
That in no way takes away from the usefulness of TYieldMapping for checking the yield expressions statically.
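The consumer-side gap being debated can be sketched with a tiny driver. This is a deliberately simplified, synchronous stand-in for something like co: instead of promises it accepts zero-argument functions (called Thunk here, a hypothetical name), and it uses the Generator<T, TReturn, TNext> type from current TypeScript rather than the interface proposed in this thread.

```typescript
// Hypothetical stand-in for a promise: a function that produces a value when called.
type Thunk = () => unknown;

// Drives the generator, feeding each thunk's result back in through next().
function runSync<TReturn>(gen: Generator<Thunk, TReturn, any>): TReturn {
  let step = gen.next();
  while (!step.done) {
    const resolved = step.value(); // static type: unknown
    step = gen.next(resolved);     // accepted only because next() takes any
  }
  return step.value;
}

function* addTwo(): Generator<Thunk, number, any> {
  // The annotations are trusted, not verified: the driver could pass back anything.
  const a: number = yield () => 40;
  const b: number = yield () => 2;
  return a + b;
}

const total = runSync(addTwo());
```

Inside runSync the link between the thunk yielded by one next() call and the value passed to the following one exists only at runtime, which is the part both sides agree cannot be checked statically.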
your presumed relationship between the right hand side and the left hand side is only correct insofar as the caller of next honors it. And there is no checking to make sure that the caller of next honors that relationship.
The caller of next() is presumably a widely used library, which is also the provider of the TYieldMapping used to contextually type generators passed to its own API. In most cases, as with co, it's probably not even written in TypeScript. We are just talking about having an accurate .d.ts for its API.
You're basically saying that the possibility of something not honouring its .d.ts file at runtime is unacceptable. Totally right, but if TypeScript was based on this logic, it would suddenly disappear like ES4! TypeScript can't enforce the promises library authors make, it just assumes they honour them. I think the point is to help library users get accurate type inference.
So you are right, I'm not offering any way to check what gets passed in to next(). But that's not out of carelessness. It's just not statically checkable.
I agree with this. I realize that this relates types that are spread across two consecutive calls to next. And that it's not possible.
That in no way takes away from the usefulness of TYieldMapping for checking the yield expressions statically.
This I am not sure about. Personally I feel that without the checking on the consumer side, it is very hard to justify a fancy mechanic to relate the two sides of the yield expression. I'm not saying it has no value whatsoever. I'm just saying that without consumer-side checking, I don't think it's valuable enough to create a new type system mechanic that doesn't really resemble anything else we have.
The caller of next() is presumably a widely used library, which is also the provider of the TYieldMapping used to contextually type its own API.
Not so. In TypeScript, a caller can provide contextual typing for the return expressions inside a callback. But the contextual type does not actually serve to provide a type annotation on the callback. And in order for the caller to provide the mapper, it would have to be inherited from the contextual type in that fashion. To do that, we'd have to make a separate rule that contextual typing works differently for generators than it does for regular functions. I do not think this is a natural rule.
I see your point that in this pattern, the caller is the library, and the consumer provides the generator as a callback. I think your argument makes sense if the caller of the generator is implemented in Javascript, and then documented in .d.ts. But it does not scale to a generator consumer written in TypeScript. That may not be important for your scenario because you've likely already written co.wrap in Javascript, and you just have to document it in .d.ts. I understand that in your case, it is a perfect fit. But it just seems like the utility of the mapper is so specific to this particular use case. I think that in order to seriously consider it, we'd need to feel that the mapper would benefit more use cases in general.
@yortus the other thing is that what you're proposing only supports some async libraries that use generators and not all libraries that use generators, for example the library I am creating does not behave like that.
What you want to do is add a type/way of type checking into Typescript that supports one style of using generators that gimps library authors that have designed another way of using generators for async, even though neither of those are behaving the same way Typescript does!
That's like Typescript having built-in support for type checking jQuery, in my opinion that prevents other good libraries from gaining popularity.
As @Griffork said earlier, 'when I was first looking up generators, the amount of threads/blogs/posts I found that wanted to use it in a promise fashion vs an iterable was about 10:1.'.
And most of the libraries involved are not written in TypeScript, so as you say it's a case of documenting the API in a .d.ts file.
So I think 'we'd need to feel that the mapper would benefit more use cases in general.' is missing the forest for the trees. If @Griffork's informal survey has any merit (and I must say I have to agree in the observation that async is the biggest practical use case out there), then we are talking about the mapper being relevant in the majority of use cases. Is that not enough?
Let's say the current proposal goes ahead without the mapper. Here's what I predict. People start using generators in TypeScript. Then one by one, issues come in all about 'Why is nothing typed inside my generator? Why do I have to annotate everything? What is the point of TypeScript if it can't infer any types? Why can't I provide accurate typing for this library? All these inferences can be described statically so why can't TypeScript do it! yadda yadda.'
Just sayin'
@Griffork can you describe what your library does with generators and how it would be gimped?
Also, nothing is proposed to be 'built-in', and no 'one style' of generators is favoured. If anything, the current proposal favours iteration over async.
I model C# style async functions, where the returned value (to yield) is typically the TReturn of another generator (regardless of how many yield calls that generator has in the middle). The call to that generator does not have to be on the right side of yield (since the decision to wait is done by the calling function, not by the yield's return value).
It also has (pain-free) support for yielding for traditional callback-driven async code.
It's just a mock up I've done of a library I'm going to make, it's not the final version of that library.
As in:
await(async(generator)) ;
yield;
(Typically async is called at the generator's constructor time like co is)
The library is designed to support concurrent requests (e. g. multiple XMLHTTPRequest or multiple await(async(generator))).
As you can see, my library would get no typing, while your library would. Making yours better and mine not competitive.
I think it is not unreasonable to let the user give a type annotation for their yield expressions. But that is the most we can do. I don't think it makes sense to build a part of the type system that presumes such a fancy relationship, even though this may make certain select cases type better. I'm not convinced it's of general utility.
I think it is not unreasonable to provide the user the option to give a type annotation for their yield expressions. But that is the most we can do. I don't think it makes sense to build a part of the type system that presumes such a fancy relationship
@JsonFreeman that's a strawman. The mapper is exactly what you just described as reasonable (an 'option to give a type annotation for their yield expressions'), and nowhere is any fancy relationship presumed. The default case is (expr?: any) => any which covers all possible yield cases without presumption.
@Griffork I understand. I'm working on something similar myself ;) So how do you think your way of using generators would be gimped if an optional yield mapper was available?
As you can see, my library would get no typing, while your library would. Making yours better and mine not competitive.
Can you elaborate? Why would it get no typing exactly? Example maybe?
my library would get no typing, while your library would. Making yours better and mine not competitive.
Typescript would be not modelling that language, but modelling the "preferred use". And no one would be able to compete against the "preferred use". The draw to Typescript for me (over coffee script and others) is that they made no assumptions about how you're going to use the language, everything is 'fair game' (which is why I can modify Array.prototype).
Sure:
var (val1,val2) = yield await(asyncgen1), await(asyncgen2);
val1 and val2 would have no typing, so your library is by default better than mine.
*square brackets, I couldn't remember the destructuring syntax
@Griffork when you say 'your library', what library are you talking about? And what does 'preferred use' mean?
val1 and val2 would get no typing under the current proposal anyway. And your example could never be statically typed due to its API design. That's not a reason to gimp all libraries equally.
It could (with some very library/use-specific code, like the code you're suggesting that is library/use specific). 'My library' is a library that I'm working on (a library in the same fashion co is a library).
"preferred use" is the use case that the language specifically supports either better than the alternatives or absent of alternatives.
@Griffork I haven't proposed any preferred use, so I still don't follow your meaning here. Can you elaborate?
The only code I mentioned that is library/use specific, like CoYieldable, would reside in that library or its .d.ts file. TypeScript would merely provide a mechanism for optionally typing the way yield behaves in particular library-defined scenarios. That would be useful for typing many libraries that work with generators. Even yours if you have an API that can be statically modelled.
TypeScript lets you optionally type all kinds of other things, if you can provide a static description of them. That doesn't make all these typings somehow 'preferred'. They reside in the libraries where they belong.
@yortus Typescript lets you type Javascript. It only (so far) supplies semantics for describing how raw javascript works, not for describing how libraries work.
What you're asking for is a) not required for the first implementation of generators (thus @JsonFreeman telling you to create a new thread for this) and b) only describes one use of generators, without supporting any other uses of generators (e.g. what I described).
All of these libraries that do async that are common now (e.g. angular, co, etc.) currently all use promises, but that doesn't mean in the next year promises are going to be the most common way of doing async with generators, we don't know that yet.
What you would do is practically prevent other ways of using generators from being able to emerge, due to Typescript having built in support for promises (only), and no other ways.
Why, if you're going to support using generators with promises, can't I also insist that the Typescript team support how I'm going to use generators (because yes, it is possible, just not very feasible and not useful for more than that one way of using generators).
My initial points for async (which you quoted out of context) were in the context that async would be just as common (not more so) than iterators, and they should be supported equally. If you're going to argue support for a specific way of doing async, then I argue for support for any other way of doing async.
Which will (imo) end up with an over-the-top bloated unmaintainable typing system.
After some discussion, here is the current plan:
- For now, yield expressions will be required to have a common supertype. For additional discussion, let's use issue #921
- For now, return expressions will be allowed, but ignored. When we have boolean literal types, we will track the return expression's type correctly, instead of hacking it temporarily. This is outlined in #2983
- Yield expressions themselves will have type any, but we will also revisit this when we do the return expressions after boolean literal types. My plan is that we will allow the user to provide a parameter type for next, and all the yield expressions have to be that type.
- I will temporarily remove the Generator type from es6.d.ts, as we will add a better one when we have boolean literal types.
As a result, I will add good support for generators as iterables for now. This is because it is possible to support that use case well now, whereas supporting the async use case should be built on top of boolean literals. Async use cases will still be possible, just not strongly typed. After boolean literals, we can better support async use cases.
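The runtime behaviour that makes "allowed, but ignored" a workable interim rule for the iterable use case can be sketched as follows. The names are illustrative only, and the code shows plain ES6 semantics rather than anything specific to the plan above.

```typescript
function* countThenReport() {
  yield 1;
  yield 2;
  return "finished"; // never observed by for...of, spread or Array.from
}

// Iteration-style consumption sees only the yielded values.
const yielded = Array.from(countThenReport()); // [1, 2]

// The return value is only visible through an explicit next() call.
const manual = countThenReport();
manual.next();
manual.next();
const final = manual.next(); // { value: "finished", done: true }
```

Consumers that iterate never observe "finished", so leaving the return expression untyped costs nothing for them; only consumers that call next() directly, as async libraries do, are affected.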
@JsonFreeman I'm curious, what was the reasoning behind ignoring return types vs using option 5 (special hack)?
@JsonFreeman sounds like a good start. 'For now, yield expressions will be required to have a common supertype'. This means that for any of the async examples I've given in this thread to compile, they will have to be explicitly annotated with TYield=any, since their yield operands don't have a common supertype. Is that right? And all the yield expressions will have to be explicitly typed too for now. And TReturn too. Basically everything.
@Griffork TypeScript would know nothing about promises under my proposal. Not sure why you think that. I wholeheartedly agree with your point, but it just doesn't apply to the technique I proposed.
The reasoning behind not doing the hack (option 5) is that we have a better long term solution that is not a hack. Doing the hack would give us some value in the short term and none in the long term. And while I think tracking the return type is important, I do not think it is urgent enough to warrant the hack that we will later remove.
For the common type issue, yes you must provide any or the union type that you're interested in. Again, issue #921 is relevant here. Btw, if your generator is contextually typed, then we do infer the union type, so if you pass it directly to co.wrap, you probably should be fine not supplying the type.
Yield expressions themselves will have to be explicitly typed inline for now, yes.