microsoft/TypeScript

Proposal for generators design

JsonFreeman opened this issue · 107 comments

A generator is a syntactic way to declare a function that can yield. Yielding will give a value to the caller of the next() method of the generator, and will suspend execution at the yield point. A generator also supports yield * which means that it will delegate to another generator and yield the results that the inner generator yields. yield and yield * are also bi-directional. A value can flow in as well as out.

Like an iterator, the thing returned by the next method has a done property and a value property. Yielding sets done to false, and returning sets done to true.

A generator is also iterable. You can iterate over the yielded values of the generator, using for-of, spread or array destructuring. However, only yielded values come out when you use a generator in this way. Returned values are never exposed. As a result, this proposal only considers the value type of next() when the done property is false, since those are the ones that will normally be observed.
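A minimal sketch of the behaviour described above, assuming an ES6-capable target. The generator name `countThenFinish` is hypothetical and chosen only for illustration; it shows that calling next() by hand exposes the returned value, while spread and for-of only see the yielded values.

```typescript
// Hypothetical generator: yields two numbers, then returns a string.
function* countThenFinish() {
    yield 1;
    yield 2;
    return "finished";
}

// Calling next() by hand exposes both yielded values and the returned value.
const manual = countThenFinish();
const step1 = manual.next(); // done is false, value is the first yielded number
const step2 = manual.next(); // done is false, value is the second yielded number
const step3 = manual.next(); // done is true, value is the returned string
const step4 = manual.next(); // done stays true, value is undefined

// Spread and for-of stop at done: true and discard the returned value.
const viaSpread = [...countThenFinish()];
const seen: number[] = [];
for (const n of countThenFinish()) {
    seen.push(n);
}
```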

Basic support for generators

Type annotation on a generator

A generator function can have a return type annotation, just like a function. The annotation represents the type of the generator returned by the function. Here is an example:

function *g(): Iterable<string> {
    for (var i = 0; i < 100; i++) {
        yield ""; // string is assignable to string
    }
    yield * otherStringGenerator(); // otherStringGenerator must be iterable and element type assignable to string
}

Here are the rules:

  • The type annotation must be assignable to Iterable<any>.
    • This has been revised: IterableIterator<any> must be assignable to the type annotation instead.
  • The operand of every yield expression (if present) must be assignable to the element type of the generator (string in this case)
  • The operand of every yield * expression must be assignable to Iterable<any>
  • The element type of the operand of every yield * expression must be assignable to the element type of the generator. (string is assignable to string)
  • The operand of a yield (if present) expression is contextually typed by the element type of the generator (string)
  • The operand of a yield * expression is contextually typed by the type of the generator (Iterable<string>)
  • A yield expression has type any.
  • A yield * expression has type any.
  • Originally: the generator is allowed to have return expressions as well, but they are ignored for the purposes of type checking the generator type. Revised: the generator cannot have return expressions.
  • Open question: Do we want to give an error for a return expression that is not assignable to the element type? If so, we would also contextually type it by the element type.
    • Answer: we will give an error on all return expressions in a generator. Consider relaxing this later.
  • Open question: Should we allow void generators?
    • Answer: no

Inferring the type of a generator

A generator function with no type annotation can have the type annotation inferred. So in the following case, the type will be inferred from the yield statements:

function *g() {
    for (var i = 0; i < 100; i++) {
        yield ""; // infer string
    }
    yield * otherStringGenerator(); // infer element type of otherStringGenerator
}
  • Rather than inferring Iterable, we will infer IterableIterator, with some element type. The reason is that someone can call next directly on the generator without first getting its iterator. A generator is in fact an iterator as well as an iterable.
  • The element type is the common supertype of all the yield operands and the element types of all the yield * operands.
  • It is an error if there is no common supertype.
  • As before, the operand of every yield * expression must be assignable to Iterable<any>
  • yield and yield * expressions again have type any
  • If the generator is contextually typed, the operands of yield expressions are contextually typed by the element type of the contextual type
  • If the generator is contextually typed, the operands of yield * expressions are contextually typed by the contextual type.
  • Originally: return expressions are allowed, but not used for inferring the element type. Revised: return expressions are not allowed. Consider relaxing this later, particularly if there is no type annotation.
  • Open question: Should we give an error for return expressions not assignable to element type (same as the question above)
    • Answer: no return expressions.
  • If there are no yield operands and no yield * expressions, what should the element type be?
    • Answer: implicit any

The * type constructor

Since the Iterable type will be used a lot, it is a good opportunity to add a syntactic form for iterable types. We will use T* to mean Iterable<T>, much the same as T[] is Array<T>. It does not do anything special, it's just a shorthand. It will have the same grammatical precedence as [].

Question: Should it be an error to use * type if you are compiling below ES6.

The good thing about this design is that it is super easy to create an iterable by declaring a generator function. And it is super easy to consume it like you would any other type of iterable.

function *g(limit) {
    for (var i = 0; i < limit; i++) {
        yield i;
    }
}

for (let i of g(100)) {
    console.log(i);
}
var array = [...g(50)];
var [first, second, ...rest] = g(100);

Drawbacks of this basic design

  1. The type returned by a call to next is not always correct if the generator has a return expression.
function *g() {
    yield 0;
    return "";
}
var instance = g();
var x = instance.next().value; // x is number, correct
var x2 = instance.next().value; // x2 is given type number, but it's actually a string!

This implies that maybe we should give an error when return expressions are not assignable to the element type. Though if we do, there is no way out.
  2. The types of yield and yield * expressions are just any. Many users will not care about these, but the type of the yield expression is useful if for example, you are implementing await on top of yield.
  3. If you type your generator with the * type, it does not allow someone to call next directly on the generator. Instead they must cast the generator or get the iterator from the generator.

function *g(): number* {
    yield 0;
}
var gen = g();
gen.next(); // Error, but allowed in ES6 (preferred in fact)
(<IterableIterator<number>>gen).next(); // works, but really ugly
gen[Symbol.iterator]().next(); // works, but pretty ugly as well

To clarify, issue 3 is not an issue for for-of, spread, and destructuring. It is only an issue for direct calls to next. The good thing is that you can get around this by either leaving off the type annotation from the generator, or by typing it as an IterableIterator.

Advanced additions to proposal

To help alleviate issue 2, we can introduce a nominal Generator type (already in es6.d.ts today). It is an interface, but the compiler would have a special understanding of its type arguments. It would look something like this:

interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield /*| TReturn*/> {
    next(n: TNext): IteratorResult<TYield /*|TReturn*/>;
    // throw and return methods elided
}

Notice that TReturn is not used in the type, but it will have special meaning if you are using something that is nominally a Generator. Use of the Generator type annotation is purely optional. The reason that we need to omit TReturn in the next method is so that Generator can be assignable to IterableIterator<TYield>. Note that this means issue 1 still remains.

  • The type of a yield expression will be the type of TNext
function *g(): Generator<number, any, string> {
   var x = yield 0; // x has type string
}
  • If the user does not specify the Generator type annotation, then consuming a yield expression as an expression will be an implicit any. Yield expression statements will be unaffected.
  • For a return expression not assignable to the yield type of the generator, we can give an error (require a type annotation) or we can infer Generator<TYield, TReturn, any>?
function *g() {
    yield 0;
    return ""; // Error or infer TReturn as string
}
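To make the TNext rule concrete, here is a sketch that drives a generator by hand. It assumes a compiler whose standard library declares Generator with three type parameters in the order yield, return, next, which matches the shape proposed here but is an assumption about the toolchain. The names `lengths` and `runner` are hypothetical.

```typescript
// TYield = number, TReturn = void, TNext = string (assumed parameter order).
function* lengths(): Generator<number, void, string> {
    const x = yield 0;        // x has type string
    const y = yield x.length; // y has type string
    yield y.length;
}

const runner = lengths();
const r0 = runner.next();          // runs to the first yield
const r1 = runner.next("hello");   // "hello" becomes x, its length is yielded
const r2 = runner.next("hi");      // "hi" becomes y, its length is yielded
const r3 = runner.next("ignored"); // nothing left to yield, so done is true
```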

Once we have TReturn in place, the following rules are added:

  • If the operand of yield * is a Generator, then the yield * expression has the type TReturn (the second type argument of that generator)
  • If the operand of a yield * is a Generator, and the yield * expression is inside a Generator, TNext of the outer generator must be assignable to TNext of the inner one.
function *g1(): Generator<any, any, string> {
    var t = yield * g2(); // Error that string is not assignable to number
}
function *g2(): Generator<any, any, number> {
    var s = yield 0;
}
  • If the operand of yield * is not a Generator, and the yield * is used as an expression, it will be an implicit any.
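The first rule above (a yield * expression evaluates to the inner generator's return value) matches the runtime behaviour, which can be sketched as follows. This assumes a three-parameter Generator type in the order yield, return, next; the names `inner` and `outer` are hypothetical.

```typescript
function* inner(): Generator<number, string, undefined> {
    yield 1;
    yield 2;
    return "inner result";
}

function* outer(): Generator<number, string, undefined> {
    // The yielded values of inner pass straight through to outer's consumer;
    // the value inner returns becomes the value of the yield* expression.
    const t = yield* inner(); // t has type string
    yield t.length;
    return t;
}

// Iteration sees inner's yields followed by outer's own yield.
const yielded = [...outer()];
```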

Ok, now for issue 1, the incorrectness of next. There is no great way to do this. But one idea, courtesy of @CyrusNajmabadi, is to use TReturn in the body of the Generator interface, so that it looks like this:

interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield> {
    next(n: TNext): IteratorResult<TYield | TReturn>;
    // throw and return methods elided
}

As it is, Generator will not be assignable to IterableIterator<TYield>. To make it assignable, we would change assignability so that every time we assign Generator<TYield, TReturn, TNext> to something, assignability changes this to Generator<TYield, any, TNext> for the purposes of the assignment. This is very easy to do in the compiler.

When we do this, we get the following result:

function *g() {
    yield 0;
    return "";
}
var g1 = g();
var x1 = g1.next().value; // number | string (was number with old typing)
var x2 = g1.next().value; // number | string (was number with old typing, and should be string)

var g2: Iterator<number> = g(); // Assignment is allowed by special rule!
var x3 = g2.next().value; // number, correct
var x4 = g2.next().value; // number, should be string

So you lose the correctness of next when you subsume the generator into an iterable/iterator. But you at least get general correctness when you are using it raw, as a generator.

Additionally, operators like for-of, spread, and destructuring would just get TYield, and would be unaffected by this addition, including if they are done on a Generator.

Thank you to everyone who helped come up with these ideas.

I've updated the proposal with the results of further discussion. There have only been a few minor changes:

  • Answers to open questions in the basic proposal.
  • Change the rule about what return type annotations are allowed on a generator function. IterableIterator<any> must be assignable to the return type annotation.
  • Return expressions are not allowed in generators. We can consider relaxing this later if the need arises.

For the sake of completeness, I think it would be extremely helpful to actually state the current declarations of the types named here:

interface IteratorResult<T> {
    done: boolean;
    value?: T;
}

interface Iterator<T> {
    next(value?: any): IteratorResult<T>;
    return?(value?: any): IteratorResult<T>;
    throw?(e?: any): IteratorResult<T>;
}

interface Iterable<T> {
    [Symbol.iterator](): Iterator<T>;
}

interface IterableIterator<T> extends Iterator<T> {
    [Symbol.iterator](): IterableIterator<T>;
}

interface GeneratorFunction extends Function {
}

interface GeneratorFunctionConstructor {
    /**
      * Creates a new Generator function.
      * @param args A list of arguments the function accepts.
      */
    new (...args: string[]): GeneratorFunction;
    (...args: string[]): GeneratorFunction;
    prototype: GeneratorFunction;
}
declare var GeneratorFunction: GeneratorFunctionConstructor;

interface Generator<T> extends IterableIterator<T> {
    next(value?: any): IteratorResult<T>;
    throw(exception: any): IteratorResult<T>;
    return(value: T): IteratorResult<T>;
    [Symbol.iterator](): Generator<T>;
    [Symbol.toStringTag]: string;
}

Looks good!

Got a little lost in the first post, but I'm going to write what I understood, and you guys can correct me if I'm wrong:

function *g () {
    var result: TNext = yield <TYield>mything()
}
  • g cannot contain the statement return.
  • All yield keywords must be treated as the same type (called TNext) that can be a union.
  • All calls to a ginst (an instance of g) of the form ginst.next(...) must pass a parameter of type TNext (assuming that's only if TNext is not null, I don't know if TNext can be null).
  • Any value on the right of the yield keyword must be of type TYield, and if omitted is treated as the value undefined.
  • An instance of g can be typed as follows: var ginst: TYield* but then you must cast to an IterableIterator (or something similar) before calling ginst.next (just a note here - yuck?)

Is there anything important that I missed here?


Request:
A nicer way of defining generator types e.g. for a generator,

function* g(value: number) {
    while (true) {
        value+= yield value;
    }
}

something like:

var ginst: GeneratorInstance<number, number>

and

var gtype: *g(start: number)=>GeneratorInstance<number, number>;

For the following code:

ginst = g(0);
ginst.next(2);
gtype = g;

👍 for generators

edit: fixed putting *'s in all the wrong places.

... Also the lack of a return statement annoys me, I think it should be forced to have the same type as yield, and if it's a different type (and yield is being implicitly typed) the return type should force a change to the implicitly derived type for yield.
To summarise: in a generator function, return is treated identically to yield.

This way I can have my generators actually end on a value that's not forced to be undefined (by Typescript).

@Griffork from what I understand, you can have return statements, just not return expressions - specifically, you can't return a value, but you can bail out from within the generator at any point.

This probably doesn't help your frustration in the return type being ignored; however, it would certainly help to get some realistic use cases for what exactly you'd like to return when a generator has terminated.

@DanielRosenwasser not sure I understand.
I guess what you're calling a return expression is: return true;?
If that is the case, then how is a return statement different to a return expression?


Here's an example of the type of generator I was thinking of when I voiced my discomfort:

function* g (command) {
    // "case" is a reserved word, so the parameter is named "command" here
    while(true){
        switch(command) {
            case "dowork1":
                //do stuff
                command = yield "OPERATIONAL - OK";
                break;
            case "dowork2":
                //do stuff
                command = yield "OPERATIONAL - OK";
                break;
            case "shutdown":
                //do stuff
                return "COMPLETE";
        }
    }
}

Where it may execute an arbitrary number of times, but at some point it's "completed" and it notifies its caller that it's done.

My concern (which I have not yet researched) is that without the return statement, there might be garbage-collection problems on some systems (particularly since the whole function-state has to be suspended and resumed on a yield), which is bad if you're spawning a lot of similarly-structured generators/iterators.

It also makes the function read a lot more clearly in my opinion.

I guess what you're calling a return expression is: return true;?

That is a return statement, for which the return expression is true.

In other words, a return expression is the expression being returned in a return statement.

Where it may execute an arbitrary number of times, but at some point it's "completed" and it notifies its caller that it's done.

From what I understand of your example, you return "COMPLETE" to indicate that the generator is done, which I don't see as any more useful than the done property on the iterator result. We need some more compelling examples.

Though, now that I think about it, if there are multiple ways to terminate (i.e. shutdown or failure), that's when the returned value in a state-machine-style generator would be useful.
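A state-machine-style generator with more than one way to terminate might look like the sketch below. It assumes a three-parameter Generator type in the order yield, return, next; the type names `Command`, `Status`, and `Outcome`, and the function `machine`, are hypothetical. The returned value is what tells the caller which way the machine ended, which the done flag alone cannot do.

```typescript
type Command = "dowork" | "shutdown" | "fail";
type Status = "OPERATIONAL - OK";
type Outcome = "COMPLETE" | "ERROR";

// Yields a status after each unit of work; returns how it terminated.
function* machine(first: Command): Generator<Status, Outcome, Command> {
    let command = first;
    while (true) {
        switch (command) {
            case "dowork":
                // do stuff, then wait for the next command
                command = yield "OPERATIONAL - OK";
                break;
            case "shutdown":
                return "COMPLETE";
            case "fail":
                return "ERROR";
        }
    }
}

const failing = machine("dowork");
const f1 = failing.next();         // first unit of work reports its status
const f2 = failing.next("dowork"); // second unit of work reports its status
const f3 = failing.next("fail");   // terminates with the failure outcome

const clean = machine("shutdown");
const c1 = clean.next();           // terminates immediately with the normal outcome
```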

@DanielRosenwasser got it, thanks for the clarification :).

I'd argue that a correct implementation would allow return expressions, and type them distinctly from yield expressions.

Generators are commonly used in asynchronous task runners, such as co. Here is an example:

var co = require('co');
var Promise = require('bluebird');

// Return a promise that resolves to `result` after `delay` milliseconds
function asyncOp(delay, result) {
    return new Promise(function (resolve) {
        setTimeout(function () { resolve(result); }, delay);
    });
}

// Run a task asynchronously
co(function* () {
    var a = yield asyncOp(500, 'A');
    var ab = yield asyncOp(500, a + 'B');
    var abc = yield asyncOp(500, ab + 'C');
    return abc;
})
.then (console.log)
.catch (console.log);

The above program prints 'ABC' after a 1.5 second pause.

The yield expressions are all promises. The task runner awaits the result of each yielded promise and resumes the generator with the resolved value.

The return expression is used by the task runner to resolve the promise associated with the task itself.

In this use case, yield and return expressions are (a) equally essential, and (b) have unrelated types that ideally would be kept separate. In the example, TYield is Promise<string> and TReturn is string. There is no reason why they would be conflated into one type in a task runner.
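To illustrate how the three types play different roles, here is a deliberately tiny task runner. It is a hypothetical sketch and not co's implementation: it only accepts generators that yield Promise<string>, it does not forward rejections into the generator, and the names `runTask`, `asyncOp`, and `abcTask` are invented for this example. It assumes a three-parameter Generator type in the order yield, return, next.

```typescript
// Hypothetical minimal runner (NOT co's implementation).
function runTask<TReturn>(
    task: Generator<Promise<string>, TReturn, string>
): Promise<TReturn> {
    return new Promise<TReturn>((resolve, reject) => {
        function step(input?: string): void {
            try {
                // The first call sends nothing in; later calls send the resolved value.
                const result = input === undefined ? task.next() : task.next(input);
                if (result.done) {
                    resolve(result.value);           // TReturn settles the task's promise
                } else {
                    result.value.then(step, reject); // TYield is awaited, then sent back as TNext
                }
            } catch (err) {
                reject(err);
            }
        }
        step();
    });
}

// Stand-in for an asynchronous operation; resolves immediately for simplicity.
function asyncOp(result: string): Promise<string> {
    return Promise.resolve(result);
}

// TYield = Promise<string>, TReturn = string, TNext = string
function* abcTask(): Generator<Promise<string>, string, string> {
    const a = yield asyncOp("A");
    const ab = yield asyncOp(a + "B");
    const abc = yield asyncOp(ab + "C");
    return abc;
}

const taskResult: Promise<string> = runTask(abcTask());
```

Here the consumer of the task only ever sees TReturn, while TYield and TNext are a private contract between the generator and the runner.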

@yortus I'm not sure what you're asking for is at all possible, or if it makes any sense, I'll try to explain where I'm confused.

The only way to start or resume a generator is the generator's .next function. This function takes a single argument (which is supplied in place of the yield expression) and returns a single value (which is the value to the right of the yield expression).

The following Javascript:

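function* g() {
    var a = yield "a";
    var b = yield a + "b";
    var c = yield b + "bc";
    return 0;
}
var ginst = g();

// next() returns { value, done }, so read .value from each result
console.log(ginst.next().value + ginst.next("a").value + ginst.next("a").value);
return ginst.next("").value;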

Is the equivalent to

console.log(("a") + ("a" + "b") + ("a" + "bc"));
return 0;

But what happens if I try:

var done = false;
var value;
while (!done) {
    var result = ginst.next(value); // feed the previous value back in
    done = result.done;
    value = result.value;
    console.log(value);
}

I get:

"a"
"ab"
"abbc"
0

The last one is a number, meaning if ginst.next is to be called in a loop, the return type must be string|number or it may be incorrect.


It's important to note here that the proposal that yield and return are treated identically will work for co's consumption, and for Promises. If it will help I can write some example implementations.

Like that last suggestion:

interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield> {
    next(n: TNext): IteratorResult<TYield | TReturn>;
    // throw and return methods elided
}

Seems ok to lose the correctness of next when you subsume the generator into an iterable/iterator.

Does that solve drawback #3?
Not sure if I like T*, this looks clearer:

function *g(): Generator<number, string, any>  {
    yield 0;
    return "";
}
var a: Iterable<number|string> = g();

// lose correctness
var b: Iterable<number> = g();

// consider using *T for better symmetry instead of T*
var c: *number = g();

@jbondc *T has better symmetry, but it can be confusing: *T doesn't denote a generator here, it denotes an iterable, and the two are not always the same thing.

If you read * as 'many values' from thing, it works well for generators and iterators. Likely T* bothers me because it looks like a pointer if you write string*

Another example of using generators to support asynchronous control flow. This is working code, runnable in current io.js. There are some comments showing the runtime types of TYield and TReturn. When generators are used in this way, these types tend to be unrelated to each other. The most useful type to have inferred in this example is probably the TReturn type.

var co = require('co');
var Promise = require('bluebird');
var fs = Promise.promisifyAll(require('fs'));
var path = require('path');

// bulkStat: (dirpath: string) => Promise<{ [filepath: string]: fs.Stats; }>
var bulkStat = co.wrap(function* (dirpath) {

    // filenames: string[], TYield = Promise<string[]>
    var filenames = yield fs.readdirAsync(dirpath);
    var filepaths = filenames.map(function (filename) {
        return path.join(dirpath, filename);
    });

    // stats: Array<fs.Stats>, TYield = Array<Promise<fs.Stats>>
    var stats = yield filepaths.map(function (filepath) {
        return fs.statAsync(filepath);
    });

    // result: { [filepath: string]: fs.Stats; }
    var result = filepaths.reduce(function (result, filepath, i) {
        result[filepath] = stats[i];
        return result;
    }, {});

    // TReturn = { [filepath: string]: fs.Stats; }
    return result;
});

bulkStat(__dirname)
    .then(function (stats) {
        console.log(`This file is ${stats[__filename].size} bytes long.`);
    })
    .catch(console.log);

// console output:
// This file is 1097 bytes long.

The function bulkStat stats all the files in the specified directory and returns a promise of an object that maps file paths to their stats.

Note that the TReturn type is unrelated to either of the TYield types, and the two TYield types are unrelated to each other.

@Griffork

  • g can contain return statements, but they cannot return a value, as explained by @DanielRosenwasser.
  • Yes, if you type the generator with a *, you'll have to cast to IterableIterator to call next directly. If you leave off the type, it will be inferred as IterableIterator. And you can certainly type it as IterableIterator to begin with.
  • Sounds like what you are asking for is a type that is Nextable. Namely, a type that specifies the in-type as well as the out-type. That seems reasonable. The one caveat is that most consumers (for-of, spread, destructuring assignments) will never be passing a value to next. Would you recommend that if a generator's next requires a value to be sent in, then it is an error to use for-of on the generator? In other words, we would only allow these two-way generators to be consumed by calling next directly, or by yield*.
  • Regarding return values: The problem with basing the element type on return expressions is that most iterations (for-of, spread, etc) will never observe the value returned by the return statement. So if the generator yields one type, but returns another type, we don't want to pollute the element type with the return, when 90% of users will never even see the return. What use cases do you have in mind for the return value? Keep in mind that the return value can only be observed by calling next directly, or by using yield*.

@yortus
I agree that the primary case for passing in a value to next is async frameworks, since you want to pass the value the awaited Promise was resolved with. And I see your point about the return value being used to signify the fulfilled value of the Promise being created. I suppose the limitation of the basic proposal is that while it is great at typing generators as an implementation of an iterable, it does not give strong treatment to using generators as async state machines. Suppose we relaxed the restriction on return values, and the type system just ignored them. Would that be acceptable? We would allow everything that is required to write your async state machines, but there would be a lot of any types floating around. Presumably this is a pretty advanced use case.

Without dependent types, it becomes very hard to hold onto TReturn without having it pollute TYield. Ideally, we would have one type associated with done: false and another with done: true. But without that facility, there is really no good place to represent TYield and TReturn separately in the type structure.

@jbondc, I understand your syntactic concern with * looking like a pointer. But I have to agree with @Griffork that *T will be more confusing, because it seems to be intimately tied to generators. And in fact, this type needn't be used with generators. It is just sugar for an Iterable.

Replying on my phone, bear with me...

@JsonFreeman oh, good point. I stopped monitoring the straw man before for... of was finalised. The use case that I currently have for return is the state machine example above when you consider that you can also return "ERROR".
On another note, does done = true on error?

Yes, I plan to do some funcy promise-like stuff with a next-able state based generator.
And I like your suggestion that generators that take a value should error in a for-of.


Being able to detect type depending on the value of done sounds good, but I'm not sure how possible that is, as it would be easy to break.
The only way I can see @yortus's example working is if he explicitly passed typing information to co and co used that to type the return function. Either way I don't think it's possible for Typescript to provide what you're asking for, unless someone can give me a working example of how it would be implemented.

@JsonFreeman would it be possible to opt in/out of returning a value?

I don't know where your facts about the typical usage of generators come from. An article like that would be useful to read; would I be able to get a link?
From what you're saying it sounds like most users are liable to use both yield and return to return values from their generator but they don't want to know about the value returned by return.
Or are you trying to say that most users don't use return (I imagine if you're not using return in the generator, it's not going to pollute the yielded value).

@Griffork

  • If I provide a way to declare a generator that requires something to be sent in, I could make it an error to consume it with for-of, spread, etc. The problem that remains is the first call to next, which should not take a value. Most consumers will call next() instead of next(undefined) on the first call, so it seems silly to require them to pass a dummy argument. So I could not do this by giving next a required parameter. Given that constraint, I'm not sure how we could distinguish between a generator that can be consumed with for-of and a generator that cannot.
  • When you ask if done = true on error, not sure what you mean by "on error".
  • Regarding opting in/out of returning a value: I think what you're asking for is to make it legal to return a value, so just remove the error, correct?
  • I would not call our assumptions about typical usage "facts". They are more just conjectures at this point because the feature is so new.
  • I imagine that most users who are returning a value are doing it by mistake, because they do not realize that the last value cannot be observed by most iteration constructs. Either that, or they do not care to discard it. However, I realize there are some users who are implementing advanced mechanics like async, and who control both the generator and its consumption (similar to @yortus's scenario). And those users have a legitimate reason to return a value. I don't think there are many such users though. It highly depends on whether the generator is supposed to be an iterator, or something more advanced than that.
  • The fact remains that if we give a way of enforcing the type of the return values, it becomes very difficult to separate it from the yield type.

It would essentially involve hacking the assignability rules to make sure a generator that returns something is assignable to an iterable when you ignore that return value. Doable, but kind of a hack.
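The difficulty with the first call to next can be seen in a short sketch: whatever is passed to the first next() has no yield expression to receive it, so it is discarded. The example assumes a three-parameter Generator type in the order yield, return, next; the name `accumulate` is hypothetical and mirrors the two-way generator requested earlier in the thread.

```typescript
// Yields a running total and adds whatever the caller sends in.
function* accumulate(start: number): Generator<number, void, number> {
    let total = start;
    while (true) {
        total += yield total;
    }
}

const acc = accumulate(0);
const a = acc.next(100); // the argument is discarded: no yield is waiting for it yet
const b = acc.next(2);   // 2 is added to the total
const c = acc.next(3);   // 3 is added to the total
```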

Oh, ok.
@JsonFreeman when I was first looking up generators, the amount of threads/blogs/posts I found that wanted to use it in a promise fashion vs an iterable was about 10:1. That's why I was asking you for your source. I don't think that the idea that most users will want to use it as an iterable is valid, although it will still be very prevalent, using the generator for promises looks like it will be about equally prevalent if not more.

I see what you mean about the problems with making generators sometimes not iterable. If it's going to be a hack, either don't do it or don't do it yet; leave it to the user, and if it's a big problem later you can re-evaluate the decision.

As for opting in/out of returning a value, yes. When I first wrote that I was thinking of something else, but that idea was bad and this one is better.

Again, I don't think you can separate the return type from the yield type due to the way generators are used (although I agree it would be useful, JavaScript's implementation does not make this doable).

@Griffork here is an in-depth article describing many uses and details of generators. TL;DR: the two main use cases so far are (1) implementing iterables and (2) blocking on asynchronous function calls.

@JsonFreeman having TReturn = any always would be a good start. Not allowing return expressions at all would rule out many valid uses of generators. You describe the async framework scenario as 'advanced'. Perhaps so, but in nodeland with its many async APIs, it's already a widespread idiom that works today and is growing in popularity. co has a lot of GitHub stars, a lot of dependents, and a lot of variants.

Interestingly, when crafting generators to pass to co, one cares more about the TReturn type and the types returned by yield expressions, whilst the TYield type is not so important.

Side note: the proposal for async functions (#1664) mentions using generators in the transform for representing async functions in ES6 targets. Return expressions are needed there, in fact the proposal shows one in its example code. It would be funny if tsc emitted generators with return expressions as its 'idiomatic ES6' for async functions, but rejected them as invalid on the input side.

@JsonFreeman #2936 mentions singleton types are getting the green light. At least for string literal types. If there was also a boolean literal type, then the next function could return something like { done: false; value?: TYield; } | { done: true; value?: TReturn; }. Then type guards could distinguish the two cases.

I'm just thinking out loud here, so not sure if that would make anything easier, even if it did exist.
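A sketch of what that could look like, assuming boolean literal types are available to the compiler. The names `StepResult` and `describeStep` are hypothetical, and the value properties are made required here to keep the example short.

```typescript
// Hypothetical result type: one member per value of done.
type StepResult<TYield, TReturn> =
    | { done: false; value: TYield }
    | { done: true; value: TReturn };

function describeStep(step: StepResult<number, string>): string {
    if (step.done) {
        // Narrowed to the done: true member, so value is a string here.
        return "returned " + step.value.toUpperCase();
    }
    // Narrowed to the done: false member, so value is a number here.
    return "yielded " + step.value.toFixed(1);
}

const yieldedText = describeStep({ done: false, value: 1.5 });
const returnedText = describeStep({ done: true, value: "ok" });
```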

@Griffork and @yortus, thank you for your points. It sounds like we are leaning towards the solution of the "next" parameter and the return values having type any, but allowing generator authors to return a value. The return type of next will take into account TYield but not TReturn. Would you agree that that solution is a good way to start?

@yortus, as for singleton types, let's see how it goes for strings, and then we can evaluate it for booleans. At that point it would be clearer whether it would help split up TYield and TReturn, but I imagine that it could be just what we need here.

@JsonFreeman sure, I'd be happy with that.
At least then there will be the opportunity to gather feedback from Typescript users instead of relying on speculation (particularly my own).

Thank you for listening, this has been one of the most enjoyable discussions I've had on a Typescript issue :-).

@JsonFreeman sounds good.

Another minor point:

interface Generator<T> extends IterableIterator<T> {
    next(value?: any): IteratorResult<T>;
    throw(exception: any): IteratorResult<T>;
    return(value: T): IteratorResult<T>;   // <--- value should not be constrained to T
    [Symbol.iterator](): Generator<T>;
    [Symbol.toStringTag]: string;
}

That's copied from above. Shouldn't the return method be return(value?: any): IteratorResult<T>;? Calling this method causes the generator to resume and immediately execute return value;. There is no link between the type of value and the T type which is the type of the yield expressions in the generator.

Great, thanks guys!

@yortus, I am actually not sure there is much value in defining the Generator type yet. I'd sooner remove it now, and add it back later if we want to leverage it to support the return value and next value.

But to your point about the return method, yeah I think you're right. I guess you could also define it as

return<U>(value: U): IteratorResult<T | U>;

Meaning it would return something of the yield type, or the thing you passed in. The yield type would only be returned in pathological cases like this:

function* g() {
    try {
        yield 0; // suspended here, and user calls return("hello");
    }
    finally {
        yield 1; // return gets intercepted by this yield expression
    }
}

But I realize that this is a ridiculous reason to include T in there.
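For anyone who wants to see the pathological case run, here is a small sketch. It assumes a compiler whose standard library declares a three-parameter Generator<T, TReturn, TNext> (TypeScript 3.6 or later). The name interruptible and the trailing return statement are illustrative additions; the explicit return is only there so the string return type checks.

```typescript
function* interruptible(): Generator<number, string, undefined> {
    try {
        yield 0; // suspended here when the caller invokes return("hello")
    } finally {
        yield 1; // the pending return is paused by this yield
    }
    return "finished normally"; // not reached in the run below
}

const genObj = interruptible();
const first = genObj.next();           // { value: 0, done: false }
const second = genObj.return("hello"); // { value: 1, done: false }: the finally block yields
const third = genObj.next();           // { value: "hello", done: true }: the return resumes

console.log(first, second, third);
```

So return can hand back a yielded value, but only when a finally block yields.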

@Griffork At least then there will be the opportunity to gather feedback from Typescript users instead of relying on speculation (particularly my own).

This (feedback driven changes) is definitely our preferred methodology but keep in mind the problem is we can relax a restriction later without breaking people but cannot do the reverse. So defaulting to typing something as any while permissive also means if we realize it's wrong later (whether based on our own exploration or feedback from others) then the change is much more painful than had we taken the more conservative approach. This is not to say we're just always defaulting to the most conservative option in the face of any uncertainty but it is definitely a large factor when considering which side to come down on when we want to give ourselves room to change/adapt in the future.

@danquirk I understand your concern.

My standing comes from the fact that there are already libraries that require that the generator's return function works. And I am planning to design a system that requires that the generator's return function is available, if it is not available my planned library cannot work (not even with a yield replacement).

So yes - I understand that you guys don't want to commit to something that in the future you won't be able to work with, but you must understand that users of TypeScript will require this functionality, and that it may not be as small of a percentage as you may think.

It is almost tempting to return to vanilla Javascript just for the generator support, however the large project that I'm embarking on will suffer from it in the long run.

defaulting to typing something as any while permissive also means if we realize it's wrong later (whether based on our own exploration or feedback from others) then the change is much more painful than had we taken the more conservative approach.

@danquirk that's a valid point and should perhaps rule out the TResult = any approach.

However if the current proposal to disallow return expressions stands, that will also be pretty painful for people thinking TypeScript supports ES6 generators and reaching for their favourite async control flow library. Perhaps in this case, the proposal/feature should be renamed on the 1.6 roadmap to better qualify it - something like 'Iterable Generators' or 'partial generator support'. As @Griffork points out, async control flow is a fairly major use-case of generators in current ES6 code out there.

As an alternative to TReturn = any, what would happen if the first stage proposal was to accurately infer both the TYield and TReturn types, and accept for now the inconveniences associated with next() returning { done: boolean; value?: TYield | TResult }. (NB: I'm assuming this is feasible in the compiler, but don't know enough about it to judge).

This has inconveniences for iterators, but at least it would be correct type-wise and therefore avoid the future-proofing problem @danquirk mentions. The inconveniences could be addressed with syntax or compiler sugar at a later point. But at least generators would be full ES6 generators.

To clarify one thing. I didn't actually mean that we would infer any from the return expressions. I just meant that we would ignore them instead of giving an error.

Inferring TYield | TReturn for the value means that neither of our two use cases (iteration and async) is pleasant or ideal. Until we can model the done property correctly as a literal type, I'd rather make one use case pleasant, and the other possible. This may sound short-sighted, but I think we can largely avoid breaks later, if we make the stronger type change opt-in. Does that make sense?

I think we can largely avoid breaks later, if we make the stronger type change opt-in.

@JsonFreeman would you mind clarifying what this means in practice?

Sure. My statement presupposes that we will at some point have boolean literal types. The idea is that right now, for a generator, we will infer the return type to be IterableIterator<TYield>. The return type of next will be

{
    done: boolean;
    value: TYield; // not TReturn
}

Then later, let's say we have the opportunity to switch to boolean literal types. At this point, we would not change our inference. We would continue to infer IterableIterator<TYield>. But the user would have the opportunity to change their return type annotation to a stronger type that does use boolean literals. So we might provide a type Generator<TYield, TReturn>, whose next method returns

{ done: false; value: TYield } | { done: true; value: TReturn }

The one caveat is that I'm not sure this would be assignable to IterableIterator<TYield>. But we don't even need it to be. Because if you are typing your generator this way, you are willing to give up its iterability. How does that sound?

If you can make it work it sounds good.
I did not think

{ done: false; value: TYield } | { done: true; value: TReturn }

Was stricter/assignable to

{
    done: boolean;
    value: TYield; 
}

Or will the former be only with return and the latter be only without?

It is not assignable. The idea was that by default, we would infer the latter as the type (and ignore return values), but we'd allow you to specify the former.

However, there is a way to make it assignable if we alter the type of an Iterator so that its next method returns

{ done: false; value: TYield } | { done: true; value: any }

Now the former type from your comment is assignable to this one. The consequence is that calling next directly on an Iterator<string> will give you any, but if you can establish that done is false, you will get string. And all the syntactic forms that consume iterators will assume done is false.
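The assignability claim can be checked with a short sketch, again assuming boolean literal types exist. StrictResult, LooseResult, widen and valueOr are hypothetical names for the two shapes discussed here, not proposed library types.

```typescript
// The precise shape a Generator<TYield, TReturn> would use.
type StrictResult<TYield, TReturn> =
    | { done: false; value: TYield }
    | { done: true; value: TReturn };

// The relaxed shape suggested for Iterator<TYield>: the final value is `any`.
type LooseResult<TYield> =
    | { done: false; value: TYield }
    | { done: true; value: any };

// Compiles without a cast: each member of the strict union is assignable
// to the corresponding member of the loose union.
function widen<TYield, TReturn>(result: StrictResult<TYield, TReturn>): LooseResult<TYield> {
    return result;
}

// A consumer that establishes `done` is false gets the yield type back.
function valueOr(result: LooseResult<string>, fallback: string): string {
    return result.done ? fallback : result.value;
}

console.log(valueOr(widen<string, number>({ done: false, value: "x" }), "none"));
```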

Oh, ok. I was more thinking that people who get used to TYield being the only return type from next() may be in for a nasty surprise when it changes.

Right, I would not want to change it on them unless they change their type annotation.

So people who want to use return on a generator would need to type the generator to do so and wouldn't be able to get it implicitly?

If they want the type of their return expressions to be tracked, yes. Otherwise, the type system will just ignore it.

Well, I suspect those who want to use promises aren't going to want to cast every generator they write (and neither am I for my library).

If that is the case I guess I'll be looking into other languages that support generators better.

I probably won't ever use the generator as an iterator.

You wouldn't have to cast it, you would just have to supply a return type annotation on your generator when you defined it.

Would it be better to accept a breaking change later, and change the inference behavior with respect to return expressions? Namely, have them be part of the inference, once we have boolean literal types?

That's your call, not mine, but since I seem to be repeatedly misunderstanding you, can you provide an example of a generator with that return type, and example usage of it using next?

Ok, let's suppose that we did not change the inference behavior upon the introduction of boolean literal types. Then you could only reasonably use the following generator as an iterator:

function* g() {
    yield 0;
    return "completed";
}
for (let x of g()) {
    // here x has type number, as it should
}
var inst = g();
while (true) {
    let next = inst.next();
    if (next.done) {
        let vReturn = next.value; // number, but should be string
        break; // stop once the generator has completed
    }
    else {
        let vYield = next.value; // number, as expected
    }
}

Now with the type annotation

function* g(): Generator<number, string, any> { // Note the new type annotation
    yield 0;
    return "completed";
}
for (let x of g()) {
    // x still has type number
}
var inst = g();
while (true) {
    let next = inst.next();
    if (next.done) {
        let vReturn = next.value; // string, now correct
        break; // stop once the generator has completed
    }
    else {
        let vYield = next.value; // number, as expected
    }
}
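The same loop can be packaged as a generic consumer that keeps the yield and return types apart. This is only a sketch. It assumes a standard library with a three-parameter Generator<T, TReturn, TNext> whose next() result is discriminated on done (TypeScript 3.6 or later). The names drain and countThenReport are made up for the example.

```typescript
// Collects every yielded value plus the final return value, with no casts.
function drain<TYield, TReturn>(
    gen: Generator<TYield, TReturn, undefined>
): { yielded: TYield[]; returned: TReturn } {
    const yielded: TYield[] = [];
    while (true) {
        const step = gen.next();
        if (step.done) {
            // Narrowed to the completed case: step.value is TReturn.
            return { yielded, returned: step.value };
        }
        // Narrowed to the suspended case: step.value is TYield.
        yielded.push(step.value);
    }
}

function* countThenReport(): Generator<number, string, undefined> {
    yield 0;
    yield 1;
    return "completed";
}

const outcome = drain(countThenReport());
console.log(outcome.yielded, outcome.returned);
```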

So here are the options:

  1. Infer IterableIterator now, and don't change that inference later (when we have boolean literals). But at that time, allow the user to supply a type annotation that makes the typing more precise. Now, everything you'd want to do is possible, but may require a type annotation (not a cast) to work correctly.
  2. Infer IterableIterator now, and change inference to infer the more precise type later. This means that at that point, there will be no effort on the user to make generators work in the way that you are asking for. But it would break consumers who are using your generator as an iterator, and now suddenly can't.
  3. Do nothing now, and wait until we have boolean literal types to implement generators, so that we can have the correct typing right off the bat. This means we have no breaks, but we delay generators until literal types are done.
  4. Make generators that have return expressions not iterable by the yield type. Essentially this means that instead of using TYield, we would just union TYield and TReturn so that it would not be pleasant to iterate over a generator that has a return expression.
  5. Do what I suggested in my advanced additions to the proposal. This means that we do some hackery in the type system to make Generator<TYield, TReturn> assignable to Iterable<TYield> (they would be pleasant to iterate over), but the Generator type would be somewhat of an oddball in the type system, and the compiler would pay special attention to the type arguments of the Generator type. This is a hack in the type system, but it essentially produces all the semantics that we want up front without requiring boolean literal types. We could later replace this with boolean literal types if/when they come online.

Option 5 is something I'm certainly willing to try out if you are interested in seeing what this would look like.

How will option 5 help the async use case? Even if there is a Generator<TYield, TReturn> type, the TReturn type won't appear in any of its members (until boolean literal types come along). That is, next() will still return { done: boolean; value: TYield; } for the time being.

Has option 3 been given serious consideration? It seems the only way to expose the TReturn type to consumers of the Generator<TYield, TReturn> interface. Does anyone on the team know how hard it would be to get boolean literal types into the compiler, so generators could be implemented fully with no hackery and no picking winners (ie out of iteration and async)?

Sorry if I wasn't clear on option 5. I meant that next would actually return { done: boolean; value: TYield | TReturn }. So calling next on something of type Generator would give you the right thing, but we'd still have separate access to the two types if you are using the nominal type Generator.

Option 3 has not been seriously considered yet, but maybe worth more discussion. It is also possible to go with option 5 temporarily until boolean literal types come along, at which point we'd be able to remove the hack introduced by option 5.

One more option: We could introduce a minimal form of boolean literal types early, without exposing many of the features of literal types, but use them as a way to track done-ness of the generator. This would allow us to keep the types separate. But I hesitate to suggest this because I'm not sure what the implications of adding the full literal type feature will be, given certain assumptions we might make about this initial implementation.

From the proposal above:

The element type is the common supertype of all the yield operands and the element types of all the yield * operands.
It is an error if there is no common supertype.

Isn't this also going to break the async use case? I gave a working example above where two yield expressions have the types Promise<string[]> and Array<Promise<fs.Stats>>.

The current proposal would make this an error if I'm not mistaken. But it's perfectly valid and normal in the async use case for yield expressions to have no common supertype.

What if TYield was the union type of the yield expression types? Wouldn't that work out of the box for both use cases (iteration and async)? I suppose in the iteration case, it just wouldn't catch some programmer errors (ie, if they yield two different types in the same generator).

Yes, you are correct. It would break. I can change that so it uses the union type. This was more just for parity with the return expressions in a normal function, but as you point out, there is a meaningful difference here.
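As an illustration of a union element type, the sketch below yields two unrelated shapes from one generator. The union is written as an explicit annotation so the example does not depend on any particular inference behavior. Yieldable, mixedYields and the file names are hypothetical, and Promise<number> stands in for Promise<fs.Stats> to keep the block self-contained.

```typescript
// Union of the two operand types; neither is a supertype of the other.
type Yieldable = Promise<string[]> | Array<Promise<number>>;

function* mixedYields(): IterableIterator<Yieldable> {
    yield Promise.resolve(["a.txt", "b.txt"]);      // Promise<string[]>
    yield [Promise.resolve(1), Promise.resolve(2)]; // Array<Promise<number>>
}

const kinds: string[] = [];
for (const item of mixedYields()) {
    // A consumer has to discriminate the union members itself.
    kinds.push(Array.isArray(item) ? "array of promises" : "promise of array");
}
console.log(kinds);
```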

@yortus for 5 next() would return { done: boolean; value: TYield|TReturn; } initially and would be updated when boolean literals become a thing, but for-of would only return TYield.

3,4 or 5 sound good. Honestly. As much as I want to use generators now, I'd rather wait for proper support than to rush them and seriously gimp them.

4 seems like something that would be good to do some research for, as it seems like it could be useful even if 3 or 5 are chosen.
In fact, 3,4 and 5 are not mutually exclusive and should all be carefully considered (since if you wait for 3, 5 would still be good for having a more correct type for for-of).

Yes, as I mentioned, it is possible to do 5 now, and then when boolean literal types come online, remove the hack and use those.

Oh derp!
For some reason my brain decided that when we had the boolean literal support for-of would get TYield|TReturn. Sorry about that.

Maybe even eventually support 4 behind a flag for those people who like those things (like noImplicitAny).

Anyway, 3 and/or 5 sound the best with 4 being really good for some use-cases.

One more note about this option 5, just in case it was not clear from before. You would only have individual access to TYield and TReturn if you are using the named type Generator. If you assign it to something with a different nominal type, you will just have TYield | TReturn.

Sounds like option 5 followed asap by option 3 is the closest idea so far to a practical implementation that isn't too biased against any use case.

So taking this snippet of an async example:

function* genfunc() {                       // (1)
    yield Promise.delay(1000);
    var result = yield Promise.resolve(42); // (2)
    return result;
}
co(genfunc)
.then(result => {                           // (3)
    console.log(result);
});

How would we go about typing this? From my understanding of the proposal:

  • result at (2) would be inferred as any because proposal states that 'A yield expression has type any.'
  • genfunc would therefore have its type inferred as IterableIterator<any>, since TYield|TReturn = Promise<number>|any = any, which is not desirable.

Suppose we explicitly type the generator function like so: genfunc: () => Generator<Promise<number>, number> and suppose co looks like this (cut down):

function co<TReturn>(genfunc: () => Generator<Promise<any>, TReturn>) {
    return new Promise(resolve => {         // (4)
        var genobj = genfunc();             // (5)

        function resume(value) {
            var next = genobj.next(value);
            if (next.done) {
                resolve(next.value);        // (6)
            } else {
                next.value.then(resume);    // (7)
            }
        }

        resume();
    });
}

Then inside co under option 5:

  • genobj at (5) is inferred as Generator<Promise<any>, number>
  • next.value at (6) and (7) is inferred as Promise<any> | number
  • (7) won't compile and must be changed to (<Promise<any>> next.value).then(resume);
  • (6) will cause resolve at (4) to be inferred as (result: Promise<any> | number) => void, which will cause the co(...) expression to be inferred as Promise<Promise<any> | number> which is not desired.
  • if (6) is changed to resolve(<TReturn> next.value);, then resolve at (4) is inferred as (result: number) => void, and the co(...) expression is inferred as Promise<number>, which is correct.

Alternatively inside co under option 3:

  • genobj at (5) is inferred as Generator<Promise<any>, number>
  • next.value at (6) is inferred as number assuming if (next.done) {...} is a type guard for the next.done boolean literal that narrows the result of next() to be {done: true; value: TReturn; }.
  • next.value at (7) is inferred as Promise<any> assuming } else { is a type guard for the next.done boolean literal that narrows the result of next() to be {done: false; value: TYield; }.
  • (6) will cause resolve at (4) to be inferred as (result: number) => void, which will cause the co(...) expression to be inferred as Promise<number>, which is correct.

In summary for the async use case:

  • under either option 3 or 5, the generator function must be explicitly typed.
  • under option 3, everything else just works.
  • under option 5, inside the async runner, casts are needed on the result of calling genobj.next() in both the done: true and the done: false case, then everything works.

Have I understood the proposal and the options correctly? And is there any way we could avoid having to explicitly type the generator function?
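For comparison, here is a rough sketch of the runner as it might look once the done property is a discriminant, as in the option 3 analysis. It assumes a standard library with a three-parameter Generator<T, TReturn, TNext> and a discriminated IteratorResult (TypeScript 3.6 or later). runAsync is a hypothetical cut-down runner, not co's actual implementation, and Promise.resolve stands in for Promise.delay.

```typescript
function runAsync<TReturn>(
    genfunc: () => Generator<Promise<unknown>, TReturn, unknown>
): Promise<TReturn> {
    return new Promise<TReturn>((resolve, reject) => {
        const genobj = genfunc();

        function resume(value?: unknown): void {
            try {
                const next = genobj.next(value);
                if (next.done) {
                    resolve(next.value);             // narrowed to TReturn, no cast
                } else {
                    next.value.then(resume, reject); // narrowed to Promise<unknown>, no cast
                }
            } catch (err) {
                reject(err); // the generator body threw
            }
        }

        resume();
    });
}

const answer = runAsync(function* () {
    yield Promise.resolve("warm-up");
    // The yield expression itself is still untyped here, hence the assertion.
    const result = (yield Promise.resolve(42)) as number;
    return result;
});

answer.then(value => console.log(value));
```

Inside the runner both casts disappear. The assertion on the yield expression in the generator body remains, which is the gap the yield-typing discussion is about.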

Yes, you are correct. It would break. I can change that so it uses the union type. This was more just for parity with the return expressions in a normal function, but as you point out, there is a meaningful difference here.

@JsonFreeman Offtopic but I don't really get why return expressions need a common supertype either, rather than being a union type. Is that just backward-compatibility baggage or am I missing something?

@yortus I have so far understood 3 and 5 to behave the way you described.
Actually, re-reading @JsonFreeman's last comment again, I think he was planning on hacking in the type-guard based on boolean literal until 3 is available. Meaning 5 behaves identically to 3.

I've been playing around with some alterations to the proposal and would like to submit some ideas for discussion.

EDIT: Removed TIteratorResult and used TYield and TResult in line with current proposal. Sorry for any confusion!

Alternative Generator and IteratorResult definitions

Suppose the Generator interface is defined like this:

interface Generator<TYield, TReturn, TYieldMapping extends (expr?: any) => any> {
    next(value?: any): IteratorResult<TYield, TReturn>;
    throw(error: any): IteratorResult<TYield, TReturn>;
    return(result?: any): IteratorResult<TYield, TReturn>;
    [Symbol.iterator](): Generator<TYield, TReturn, TYieldMapping>;
    [Symbol.toStringTag]: string;
}

TYield and TReturn are tracked in the same way as the current proposal. IteratorResult may be defined initially as:

interface IteratorResult<TYield, TReturn> {
    done: boolean;
    value?: TYield | TReturn;
}

When option 3 is possible, and if generic type aliases also make it into the compiler (greenlighted according to #2936), this could be tightened up to:

// NB: uses generic type aliases and boolean singleton types (hopefully both coming to tsc vNext)
type IteratorResult<TYield, TReturn>
    = { done: false; value?: TYield; }
    | { done: true; value?: TReturn; };

As for TYieldMapping, this tracks the type mapping from yield operands to yield expressions in the generator, which generally makes sense for both the iteration and async use cases. This allows yield expressions to have their types accurately inferred, rather than always being any.

Inferring the type of a generator

When no type annotation is given for a generator function, the compiler infers:

  • TYield is the union type of all the yield operand types present in the generator function
  • TResult is the best common type (or union type?) of all the return expression types present in the generator function.
  • TYieldMapping = (expr?: any) => any

Suppose we have this example:

function* genfunc1() {
    yield Promise.delay(1000);
    var result = yield Promise.resolve(42);
    return result;
}

Then the compiler infers:

  • TYield = Promise<number>
  • TResult = any
  • TYieldMapping = (expr?: any) => any

Two more examples:

function* genfunc2() {
    yield 1;
    yield 2;
    yield 3;
}

function* genfunc3() {
    yield Promise.delay(1000);
    var result: number = yield Promise.resolve(42); // NB: annotated result
    return result;
}

For genfunc2 the compiler infers:

  • TYield = number
  • TResult = any
  • TYieldMapping = (expr?: any) => any

For genfunc3 the compiler infers:

  • TYield = Promise<number>
  • TResult = number
  • TYieldMapping = (expr?: any) => any

These all match the inference ability of the current proposal and have the same characteristics under option 3 and 5 as discussed in preceding comments.

Contextually typing a generator

When a type annotation is provided for a generator function, the compiler uses the following rules:

  • the operand of a yield expression (if present) is contextually typed by TYield
  • the operand of a return expression (if present) is contextually typed by TReturn
  • the type of each yield expression is inferred using TYieldMapping.

For example, suppose genfunc1 above is contextually typed like this:

var genfunc1: () => Generator<Promise<any>, any, <U>(expr: Promise<U>) => U>;
genfunc1 = function* () {
    yield Promise.delay(1000);
    var result = yield Promise.resolve(42);
    return result;   // (3)
}

Then the compiler infers result has type number because TYieldMapping maps Promise<number> to number. However this inference is lost at (3) because TResult is contextually typed as any.

More realistically, an async runner like co would be defined something like:

interface CoYieldable {
    <T>(expr: Promise<T>): T;
    <T>(expr: Array<Promise<T>>): Array<T>;
    // ... other yieldables ...
}

function co<TReturn>(genfunc: () => Generator<any, TReturn, CoYieldable>) {
    ...
}

Then the async runner can be used like this:

var promise = co(function* () {             // promise inferred as Promise<number>
    yield Promise.delay(1000);
    var result = yield Promise.resolve(42); // result inferred as number
    return result;
});

Due to the co function's contextual typing of the generator function, all the yield and return expressions have their types accurately inferred with no annotations needed. This seems to be an improvement on the current proposal.
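A related effect can be approximated without a mapping parameter by delegating each step through yield*, because a yield* expression takes the return type of the inner generator. To be clear, this is a different technique from TYieldMapping, shown only as a sketch. It assumes a standard library with a three-parameter Generator<T, TReturn, TNext> (TypeScript 3.6 or later). Thunk, force and runSync are hypothetical names, and synchronous thunks stand in for promises so the example runs without an event loop.

```typescript
type Thunk<T> = () => T;

// Wraps one yieldable in a tiny generator so that `yield*` evaluates to T.
function* force<T>(thunk: Thunk<T>): Generator<Thunk<unknown>, T, unknown> {
    const sent = yield thunk;
    return sent as T; // unchecked: trusts the runner to send back thunk()
}

// Minimal synchronous runner: evaluates each yielded thunk and sends the result back in.
function runSync<TReturn>(
    genfunc: () => Generator<Thunk<unknown>, TReturn, unknown>
): TReturn {
    const genobj = genfunc();
    let input: unknown = undefined;
    while (true) {
        const next = genobj.next(input);
        if (next.done) {
            return next.value;
        }
        input = next.value();
    }
}

const sum = runSync(function* () {
    const count = yield* force(() => 40);          // count: number
    const labels = yield* force(() => ["a", "b"]); // labels: string[]
    return count + labels.length;
});
console.log(sum);
```

The cast inside force is the one unchecked step. It relies on the runner's contract, much as a mapper type would rely on the async library honoring its declared mapping.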

Iterables

I've focused more on the async use case above, but I believe the iterable use case is just as well catered for with this altered proposal as it is with the current proposal. That is, providing assignability rules between Generator and IterableIterator are worked out, iteration should be straightforward.

@yortus your post refers to TYieldOp in one of your examples, but you don't elaborate on what this is, is it meant to read TYieldMapping?

Thanks @Griffork, I'm editing it now, will fix that typo too.

@Griffork I am not planning to add the type guard. I meant that I was thinking of adding boolean literal types internally in the compiler to help track the yield and return types separately. But thinking about it more, there would be too little benefit to doing that. I'd sooner implement option 5 by just holding on to the two types as type arguments to the nominal Generator type.

@yortus Regarding common supertype for return expressions, I don't actually know the reason for that. I know that we've discussed changing it, and if the function is contextually typed, it is actually allowed to be the union type. Maybe @RyanCavanaugh would know more, but I would ask that this topic be on a separate thread.

Your analysis in #2873 (comment) is pretty much correct. One question is, in the absence of a type annotation, would we infer IterableIterator<any>, or Generator<Promise<number>, any>? We could say that if you have at least one return expression, we switch to the more powerful Generator type.

But your main points in that post are correct. The type guards on the done property would only work if we have boolean literal types. And to type the yield expressions, the user needs to specify the type explicitly, like Generator<Promise<number>, number, number>.

Your following post is interesting. You want the yield expression type to depend on the yield operand type. Do you think it is necessary to infer a potentially different type for every yield expression, instead of specifying one type for all the yield expressions in the generator body? I suppose one thing I find awkward here is that you are trying to express a function on types as a function type. A function type in our type system has thus far represented only types that are inhabited by functions. They are never applied by the type system the way that generic types are. Your proposal also entails that we would perform type argument inference and overload resolution with all the call signatures in the type argument you pass as a mapper.

@yortus

Then the compiler infers result has type number because TYieldMapping maps Promise<number> to number

This concerns me (still crunching my way through your post 😛), since a yield statement may be resumed with any value, e.g.

function *generator_example() {
    var test_var = yield "Hello!"; //once called with next(3), test_var becomes a number.
    return test_var;
}

var giter = generator_example();
var hello = giter.next();

console.log("Hello", giter.next(3));

Here's the other thing about the YieldMapper. If you are inferring all the yield expressions to be different types, then when the consumer calls next, you would presumably want to check each call to next with the appropriate type given the yield expression the generator is currently suspended on. But of course, you would not statically know which yield expression a particular call to next corresponds to. And without that checking, I would argue there is really not much value to inferring the yield expression.

Correct me if I'm wrong, but I think typing each yield differently (within the function) and having that affect next()'s signature is not going to work (although it would be interesting, it'd be impossible to resolve if the generator or next are aliased). Also since you could call the same function twice and pass in two different types to it.
I think yield should be type (unless the programmer specifies somehow in the generator's signature otherwise) and would have to be cast for typing.

@JsonFreeman

Here's the other thing about the YieldMapper. [....] you would not statically know which yield expression a particular call to next corresponds to. And without that checking, I would argue there is really not much value to inferring the yield expression.

Right, the YieldMapper is not useful to the caller of next(). TYield and TReturn are the useful constraints there, same as with the current proposal. The YieldMapper is useful for typing yield expressions inside the generator function body, which is a godsend for the async use case.

Do you think it is necessary to infer a potentially different type for every yield expression, instead of specifying one type for all the yield expressions in the generator body?

This is absolutely what is wanted in the async use case. See my examples earlier in the thread. In the async use case, each yield expression's type is not related to the others, but it is related to the type of that yield's operand. And it doesn't hurt the iterable use case at all, which is just a sort of degenerate case with only one expression type (any).

I suppose one thing I find awkward here is that you are trying to express a function on types as a function type. A function type in our type system has thus far represented only types that are inhabited by functions. They are never applied by the type system the way that generic types are.

Can't the compiler's function overload resolution already do this? I had it in mind that it would use that exact same functionality already in the compiler for function overload resolution.

Your proposal also entails that we would perform type argument inference and overload resolution with all the call signatures in the type argument you pass as a mapper.

Right. Is that a problem? I see it as covered by 4.12.1 Overload Resolution in the spec.

@Griffork

This concerns me (still crunching my way through your post 😛), since a yield statement may be resumed with any value

If no YieldMapper is given, this will still be the case. If a YieldMapper is provided for contextual typing, then you are effectively instructing the compiler to enforce a constraint on what yield can receive and return, which is useful in the async use case, and is not needed in the iterable use case.

I think typing each yield differently (within the function) and having that affect next()'s signature is not going to work

It wouldn't have any effect on next()'s signature.

I think yield should be type

If I understand you, it is typed - see the Generator interface.

@JsonFreeman

Here's the other thing about the YieldMapper. If you are inferring all the yield expressions to be different types, then when the consumer calls next, you would presumably want to check each call to next with the appropriate type given the yield expression the generator is currently suspended on.

Again, the YieldMapper is not useful for the consumer of the Generator interface. That side of things is catered for by TYield and TReturn in the same way as the current proposal.

The YieldMapper, which defaults to (expr?: any) => any, effectively brings us back to the current proposal if it is not provided.

But if it is provided for contextual typing, then each yield expression can have its type accurately inferred using the YieldMapper. This is designed to fill a hole in the current proposal when it comes to the async use case. It means everything in the generator is correctly typed with no annotations needed. Without it, every single yield expression whose value is subsequently used will need to be manually annotated, or we just have any-spaghetti.

@yortus it's one type to rule them all, not one type per yield. All yields should be one type (be that a programmer specified union or not).
More importantly; what is to the right of a yield should never influence the type of what comes to the left of a yield. That's assuming a relationship that doesn't exist and is easily proved wrong.

When you say "if no YieldMapper is given" what constitutes given? Do users have to supply extra explicit typing?

@Griffork

All yields should be one type (be that a programmer specified union or not).

Right, they are, that's the TYield type.

what is to the right of a yield should never influence the type of what comes to the left of a yield. That's assuming a relationship that doesn't exist and is easily proved wrong.

The compiler will assume no relationship unless you give it one via TYieldMapper.

But the relationship in fact does exist in important and common scenarios, such as when using co and the like for async control flow. For example, co will always map Promise<T> to T, Promise<T>[] to T[], and several other things that can be statically described. TYieldMapper gives us the useful (and optional) ability to tell the compiler how to enforce/infer these rules automatically.

It's analogous to giving the compiler a bunch of function overload declarations so it can statically check if a function call is valid and what its return type will be when called with various combinations of parameter types and arities.

When you say "if no YieldMapper is given" what constitutes given? Do users have to supply extra explicit typing?

"if no YieldMapper is given" just means that the generator function has no type annotation. Giving a YieldMapper just means giving the generator function a type annotation.

Note that annotating the generator function is usually necessary for the async use case under the current proposal anyway. But under this altered proposal that annotation can be provided by the async library (like co), so the library user won't need to annotate anything.

I gave this some thought from a different angle: instead of trying to 'type' the whole thing, it could be expressed as:

function* genfunc3() {
    yield Promise.delay(1000);
    var result: number = yield Promise.resolve(42); // NB: annotated result
    return result;
}

// compiler infers a type 
type genfunc3IteratorResult = Promise<number> where <end(T): number>;

function* genfunc4() {
    yield Promise.delay(1000);
}

// compiler infers a type 
type genfunc4IteratorResult = Promise<number> where <end(T): undefined>;

// alternative syntax though unclear that 'return' means part of an iteration
type genfunc4IteratorResult = Promise<number> where <return(T): undefined>;

The thinking is that:

type genfunc3IteratorResult = Promise<number> where <end(T): number>;

is more expressive than:

type genfunc3IteratorResult = Promise<number>|number;

Going back to the first example:

function *g() {
    yield 0;
    return "";
}

type NumIteratorEndString = Iterator<number> where <end(T): string>;

var g2:NumIteratorEndString = g();
var x3 = g2.next(); // type number, correct
var x4 = g2.next(); // type number, could know it's string since Iterator is not infinite. 

Unclear how compiler could track iteration 'loops', but seems like this is doable:

for (let a of g2) {
    if (a.done) {
        // a is of type IteratorResult<string>
    } else {
        // a is of type IteratorResult<number>
    }
}

Another syntax which looks interesting or similar to this:

type NumIteratorEndString = Iterator<number> where <if(T.done): string>;
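For comparison, later TypeScript releases ship a three-parameter Generator type and an IteratorResult union that is discriminated on done, which covers much of what the where <end(T): ...> syntax is reaching for. The sketch below assumes a compiler and lib recent enough to provide those types; they are not part of the proposal being discussed in this thread, and describe is a hypothetical helper added for illustration.

```typescript
// Assumes the built-in Generator<T, TReturn, TNext> and
// IteratorResult<T, TReturn> types from later TypeScript versions.
function* g(): Generator<number, string, undefined> {
    yield 0;
    return "";
}

const g2 = g();
const x3 = g2.next(); // IteratorResult<number, string>
const x4 = g2.next(); // same static type; done tells the two cases apart

// Hypothetical helper: checking done narrows value to the matching type.
function describe(r: IteratorResult<number, string>): string {
    if (r.done) {
        return "returned " + JSON.stringify(r.value); // r.value is string here
    }
    return "yielded " + r.value.toFixed(0);           // r.value is number here
}

// for...of hands out the yielded values themselves, never the result
// objects, so there is no done property to test inside the loop.
const seen: number[] = [];
for (const a of g()) {
    seen.push(a);
}
```

Note that the compiler still cannot tell that the second call to next() is the one that returns; the narrowing only happens once done has been checked at runtime.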

It seems I haven't made a clear case for why a YieldMapper is necessary or how it would work. I'll try another way.

Rationale

Going way back to the rationale of this proposal, it surely involves:

  • accurately modeling the way ES6 generators work
  • providing as much type safety as possible.
  • providing as much type inference as possible.

Two major use cases of generators have been identified (in this thread and elsewhere - see for example this post):

  • iterating over a set of values
  • managing asynchronous control flow

Here is an example of each use case.

Example 1: Generator used for iteration

var evenNumbers = function* (min: number, max: number) {
    var i = min;
    while (i < max) {
        if (i % 2 === 0) {
            yield i;
        }
        ++i;
    }
}

Example 2: Generator used for asynchronous control flow

var co = require('co');
var Promise = require('bluebird');
var fs = Promise.promisifyAll(require('fs'));
var path = require('path');

var getDirectorySizeInBytes = co.wrap(function* (dirpath) {
    var filenames = yield fs.readdirAsync(dirpath);
    var filepaths = filenames.map(filename => path.join(dirpath, filename));
    var stats = yield filepaths.map(filepath => fs.statAsync(filepath));
    var totalSize = stats.reduce((sum, stat) => sum += stat.size, 0);
    return totalSize;
});

getDirectorySizeInBytes('.')
.then(bytes => console.log(`Directory size in bytes: ${bytes}`));

// sample output:
// Directory size in bytes: 22504

Analysis of next() and yield in the two examples

Before looking at solutions, let's analyse how the generators actually behave in the two representative examples. In particular, the next method of the generator object, and the yield operator in the generator body both exhibit complex behaviour. Here is a breakdown that attempts to find patterns in the various cases.

Behaviour of yield in generator used for iteration

  • yield operands are important outside the generator body, they all have the same type (TYield), and that type is used by for...of, genobj.next(), etc.
  • TYield = number in the example
  • the values of yield expressions are generally not important
  • the values of return expressions (TReturn) are generally not important
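As an illustration of the iteration case, here is a minimal sketch of how the yielded values from Example 1 are typically consumed. The variable names are made up for this example, and it assumes a compilation target that supports for...of and spread over generators.

```typescript
// Same logic as Example 1, written as a declaration so the block is self-contained.
function* evenNumbers(min: number, max: number) {
    for (let i = min; i < max; i++) {
        if (i % 2 === 0) {
            yield i;
        }
    }
}

// Only the yield operands (TYield = number) are observed by the consumer.
const viaForOf: number[] = [];
for (const n of evenNumbers(0, 7)) {
    viaForOf.push(n); // 0, 2, 4, 6
}

const viaSpread = [...evenNumbers(3, 9)]; // [4, 6, 8]

// Manual stepping: next() is called without an argument.
const it = evenNumbers(0, 3);
const first = it.next();  // { value: 0, done: false }
const second = it.next(); // { value: 2, done: false }
const last = it.next();   // { value: undefined, done: true }
```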

Behaviour of yield in generator used for async control flow

  • yield operands (TYield) are important inside the generator body, but not outside
  • yield operands are generally of unrelated types within a single generator
  • TYield = Promise<string[]>|Array<Promise<fs.Stats>> in the example
  • the values of yield expressions are important and are subsequently used in the generator body (filenames and stats in the example)
  • the type of a yield expression is related to the type of that yield's operand (in the example, yield always maps Promise<T> to T, and Promise<T>[] to T[])
  • the values of return expressions are important outside the generator body (the example generator returns a number)

Behaviour of next() in generator used for iteration

  • usually called without an argument, ie var next = genobj.next()
  • before the generator returns, next is { done: boolean; value: TYield; }
  • once the generator returns, next is { done: true; }, and next.value is not used

Behaviour of next() in generator used for async control flow

  • usually called with an argument, ie var next = genobj.next(value), except the first time
  • before the generator function returns, next is { done: false; value: TYield; }
  • once the generator returns, next is { done: true; value: TReturn; }
  • next.value is important in all cases
  • no relationship between the type of next's argument and that of its result
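The two-way flow described in this list can be shown without any promises. The sketch below is a hypothetical generator (sumTwo is not from the thread) stepped by hand; it assumes the three-parameter Generator type from later TypeScript versions so that the value passed to next() can be given a type.

```typescript
// Yields strings out, receives numbers in, and returns a number.
function* sumTwo(): Generator<string, number, number> {
    const a = yield "first";  // a is whatever the consumer passes to the 2nd next()
    const b = yield "second"; // b is whatever the consumer passes to the 3rd next()
    return a + b;
}

const it = sumTwo();
const s1 = it.next();   // first call takes no argument: { value: "first", done: false }
const s2 = it.next(10); // 10 becomes a:                 { value: "second", done: false }
const s3 = it.next(32); // 32 becomes b:                 { value: 42, done: true }
```

The value handed to each next() call answers the previous yield, which is why the type of the argument is not tied to the result of that same call.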

Accounting for all these behaviours in a single model

Based on the patterns observed above, here is a possible solution for modeling generators of all forms.

A generator's TYield type is the union type of all its yield operand types

  • this models both yield and next() properly for both use cases

A generator's TReturn type is the best common type (or union type?) of all its return expression types

  • this is harmlessly ignored in the iteration case.
  • this models both yield and next() properly for the async control flow case.

The next method returns {done: false; value?: TYield} | {done: true; value?: TReturn; }

  • this models both cases well
  • the iteration case can ignore the done: true part
  • without boolean literal types, this can be approximated as {done: boolean; value?: TYield|TReturn} with some minor inconvenience
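The "minor inconvenience" of the approximation can be sketched as follows. All type and function names here are hypothetical, value is made required rather than optional to keep the example short, and the precise form relies on boolean literal types, which the thread notes were not yet available.

```typescript
// Precise form: a union discriminated on the literal type of done.
type YieldStep<TYield> = { done: false; value: TYield };
type ReturnStep<TReturn> = { done: true; value: TReturn };
type StepResult<TYield, TReturn> = YieldStep<TYield> | ReturnStep<TReturn>;

// Approximation without boolean literal types.
type LooseStepResult<TYield, TReturn> = { done: boolean; value?: TYield | TReturn };

function readPrecise(step: StepResult<number, string>): string {
    // Checking done narrows value to string or number respectively.
    return step.done
        ? "returned " + step.value.toUpperCase()
        : "yielded " + step.value.toFixed(1);
}

function readLoose(step: LooseStepResult<number, string>): string {
    // done does not narrow value here, so the consumer has to assert the type.
    return step.done
        ? "returned " + (step.value as string).toUpperCase()
        : "yielded " + (step.value as number).toFixed(1);
}
```

Both functions behave the same at runtime; the difference is only in how much the compiler can verify without assertions.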

The behaviour of yield in all cases can be modeled as a polymorphic function with arity <= 1.

  • In the iteration case, the yield expression type is unimportant, so the polymorphic model of the yield operator can simply be (expr?: any) => any. This models iteration well.
  • But (expr?: any) => any models yield very poorly in the async case. It would require the generator author to annotate every yield expression, otherwise most of the generator body will be rendered untyped due to all the yield expressions being inferred as any.
  • In the async case, the yield expression type is typically a function of the yield operand's type. This can be modeled with a polymorphic function, using TypeScript's function overloading feature.
  • For example, co has rules mapping yield operands to yield expressions. Promise<T> always maps to T, Promise<T>[] always maps to T[], and several other rules are sufficient to describe the behaviour of yield in this case. It might look like this:
interface Yieldable {
    <T>(expr: Promise<T>): T;
    <T>(expr: Promise<T>[]): T[];
    // ... several more ...
}
  • the Yieldable interface is not part of TypeScript, it is supplied by user code to describe the behaviour of yield for the case at hand. Typically once per async library.
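For readers on later TypeScript versions: the same two rules can also be written as a function on types using conditional types. This is only a sketch for comparison and is not the mechanism proposed here; CoYieldResult and the alias names are hypothetical, and only the two rules spelled out above are covered.

```typescript
// Hypothetical type-level mapping from a yield operand type to the
// type of the yield expression, covering only the two rules above.
type CoYieldResult<Y> =
    Y extends Promise<infer T> ? T :
    Y extends Array<Promise<infer T>> ? T[] :
    any; // fallback, mirroring the (expr?: any) => any default

type FromPromise = CoYieldResult<Promise<string[]>>;     // string[]
type FromArray = CoYieldResult<Array<Promise<number>>>;  // number[]

// These assignments only compile if the aliases resolve as commented.
const names: FromPromise = ["a.txt", "b.txt"];
const sizes: FromArray = [1, 2];
```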

A possible solution

A generator interface that captures all this behaviour might look like this:

interface Generator<TYield, TReturn, TYieldMapping extends (expr?: any) => any> {
    next(value?: any): IteratorResult<TYield, TReturn>;
    throw(error: any): IteratorResult<TYield, TReturn>;
    return(result?: any): IteratorResult<TYield, TReturn>;
    [Symbol.iterator](): Generator<TYield, TReturn, TYieldMapping>;
    [Symbol.toStringTag]: string;
}

TYield and TReturn are used to accurately model the consumption of a generator (ie, doing for...of, calling next(), etc.).

TYieldMapper (the TYieldMapping type parameter above) is used to accurately model the behaviour of yield within a generator body. Since none of the interface's methods use it (it is only passed through by [Symbol.iterator]), it must be provided as a type annotation on the generator function; otherwise it will be inferred as (expr?: any) => any.

  • async libraries like co can neatly provide this TYieldMapper annotation - like the Yieldable example above.
  • generators used for iteration don't need any annotation because (expr?: any) => any models their yield behaviour just fine.

Final illustration

I hope I have explained the essentiality of TYieldMapper for accurately modeling generators. As a final example, consider example 2 above without a YieldMapper. TypeScript is really not helping at all with the types:

import co = require('co');
import Promise = require('bluebird');
var fs: FSPromised = Promise.promisifyAll(require('fs'));
import path = require('path');

var getDirectorySizeInBytes = co.wrap(function* (dirpath) {
    var filenames = yield fs.readdirAsync(dirpath);                          // filenames is any
    var filepaths = filenames.map(filename => path.join(dirpath, filename)); // everything here is any
    var stats = yield filepaths.map(filepath => fs.statAsync(filepath));     // everything here is any
    var totalSize = stats.reduce((sum, stat) => sum += stat.size, 0);        // everything here is any
    return totalSize;                                                        // totalSize is any
});

getDirectorySizeInBytes('.')
.then(bytes => console.log(`Directory size in bytes: ${bytes}`));            // bytes is any

// sample output:
// Directory size in bytes: 22504

Now consider the same example with a YieldMapper like the Yieldable interface above. It's completely accurately typed, with no annotations:

import co = require('co');
import Promise = require('bluebird');
var fs: FSPromised = Promise.promisifyAll(require('fs'));
import path = require('path');

var getDirectorySizeInBytes = co.wrap(function* (dirpath) {
    var filenames = yield fs.readdirAsync(dirpath);                          // filenames inferred as string[]
    var filepaths = filenames.map(filename => path.join(dirpath, filename)); // filepaths inferred as string[]
    var stats = yield filepaths.map(filepath => fs.statAsync(filepath));     // stats inferred as fs.Stats[]
    var totalSize = stats.reduce((sum, stat) => sum += stat.size, 0);        // totalSize inferred as number
    return totalSize;
});

getDirectorySizeInBytes('.')
.then(bytes => console.log(`Directory size in bytes: ${bytes}`));            // bytes inferred as number

// sample output:
// Directory size in bytes: 22504

I suppose one thing I find awkward here is that you are trying to express a function on types as a function type. A function type in our type system has thus far represented only types that are inhabited by functions. They are never applied by the type system the way that generic types are. Your proposal also entails that we would perform type argument inference and overload resolution with all the call signatures in the type argument you pass as a mapper.

@JsonFreeman I've tried to code the behaviour of TYieldMapper using roughly equivalent function overloading. The compiler is not quite as smart as I thought. Is this the problem you were referring to in the quote above? See the comments on the last two lines. The compiler infers all the types in the first case, but not in the second (generic) case, even though all the same information is statically available for it to do so.

interface Overloads {
    <T>(expr: Promise<T>): T;
    <T>(expr: Array<Promise<T>>): Array<T>;
    (expr: number): boolean;
    (expr: {}): {};
}

function f1(poly: Overloads) {                                      
    var a = poly(1);                                                
    var b = poly(Promise.resolve({ foo: 'bar' }));                  
    var c = poly([1, 2, 3]);                                        
    var d = poly([Promise.resolve('foo'), Promise.resolve('bar')]); 
    return { a, b, c, d };
}

function f2<TPoly extends (expr?: any) => any>(poly: TPoly) {       
    var a = poly(1);                                                
    var b = poly(Promise.resolve({ foo: 'bar' }));                  
    var c = poly([1, 2, 3]);                                        
    var d = poly([Promise.resolve('foo'), Promise.resolve('bar')]); 
    return { a, b, c, d };
}

declare var poly: Overloads;
var x = f1(poly); // inferred: { a: boolean; b: {foo:string}, c: any, d: string[] }
var y = f2(poly); // inferred: { a: any; b: any; c: any; d: any; }

Nevertheless, I don't see a reason why the compiler couldn't internally use its function overloading logic and a TYieldMapper type to accurately infer each yield expression in a generator.

It theoretically could use the overloading logic (with some refactoring to not assume that it's resolving a call expression). But it's just odd that a function type is being used to model a function on types.

It sounds like you do not care about typing the consumption point (the calls to next) for your async scenarios. So the internals of co.wrap would not get any type checking under your proposal. Why is it okay to discard type checking inside co.wrap?

Regarding your question about the inference in the most recent example: First, I would ask that this be continued on a different thread. But the issue is that the return type of f2 is computed only once based on the definition of f2. And when it is computed, the only thing known about TPoly is its generic constraint. Then it instantiates this computed return type when it processes calls to f2, but it does not recompute it. In other words, the language does not recompute the return type from the body of f2 given information about how f2 is called. This is how it works for generic functions, but for generic types, there is a lot more connection between what happens inside the type and how the type is instantiated from the outside.
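The point about the return type being computed once from the constraint can be seen in a smaller example. The function names below are hypothetical; the sketch contrasts a type parameter constrained to a function type with a type parameter that stands for the function's result.

```typescript
// The body is checked once, knowing only the constraint on F,
// so f(1) has the constraint's return type: unknown.
function viaConstraint<F extends (x: number) => unknown>(f: F) {
    return f(1);
}

// Here the result type is itself a type parameter, so each call
// site instantiates R from the argument it passes.
function viaReturnParam<R>(f: (x: number) => R): R {
    return f(1);
}

const double = (x: number) => x * 2;

const loose = viaConstraint(double);    // unknown, regardless of the argument
const precise = viaReturnParam(double); // number
const next: number = precise + 1;       // compiles only because precise is number
```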

So the internals of co.wrap would not get any type checking under your proposal.

@JsonFreeman Not so; next() is typed exactly the same way that you are proposing. It returns { done: false; value?: TYield; } | { done: true; value?: TReturn; } (or simply { done: boolean; value?: TYield | TReturn; } as per option 5). Together with type guards this provides complete and accurate type checking inside co.wrap. I gave a cut-down implementation of co above that illustrates this.

On the contrary, what I'm trying to show is that the current proposal doesn't care about typing within the generator body. Look how the any type bleeds through everything in the final illustration with getDirectorySizeInBytes above. Doesn't that bother you?

Your proposal does not give checking of the next parameter inside co.wrap. So there is nothing enforcing that the call to next is passing something of the right type. I worry that with the YieldMapper, the author of the generator will get the feeling that all the types are being meticulously related (namely the parameter type of next, and the return type of next), when in fact on the call side (the inside of co.wrap), this relationship is not enforced.

I suppose it's that I find the asymmetry of type checking strength (inside vs outside the generator) to be unsettling.

I suppose it's that I find the asymmetry of type checking strength (inside vs outside the generator) to be unsettling.

It actually reduces the asymmetry already there in the current proposal by adding accurate modeling of the yield operator; everything else is exactly the same as the current proposal.

The current proposal means that you get all the type checking on the right hand side of the yield, but none on the left hand side. Your suggestion means that you get all the checking on the right hand side, but half of the checking on the left. Namely, you get checking of the left hand side inside the generator, but not outside. Also, your presumed relationship between the right hand side and the left hand side is only correct insofar as the caller of next honors it. And there is no checking to make sure that the caller of next honors that relationship.

Your proposal does not give checking of the next parameter inside co.wrap. So there is nothing enforcing that the call to next is passing something of the right type.

OK, well how could that be done? We know all the possible types that could be yielded (that's the TYield union). We can statically describe how these are mapped to yield expression types using something like TYieldMapping. What's missing, as you say, is a constraint on what gets passed back to next().

Take a simple rule from co. If a Promise<T> is yielded to it, it will await the resolved value, T, and pass that back through next(). But T could be absolutely any type, so it can't be statically constrained for anything more specific than any. ie no type checker could ever statically catch a type error here.

Also, the type passed into next() does have a relationship to the type returned from the previous call to next() inside the generator consumer. It's the same relationship described by TYieldMapping. But because the relationship is spread across two separate calls to next(), it's a runtime relationship that can't possibly be statically checked.

So you are right, I'm not offering any way to check what gets passed in to next(). But that's not out of carelessness. It's just not statically checkable.
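A minimal sketch of such a consumer makes the gap visible. This is a hypothetical cut-down runner in the spirit of co, not co's actual implementation, and it assumes the Generator and IteratorResult types from later TypeScript versions. The resolved value of each yielded promise is only known at runtime, so the runner has to accept any for what it passes back through next().

```typescript
// Hypothetical runner: drives a generator that yields promises and
// resolves with the generator's return value.
function run<TReturn>(
    genFn: () => Generator<Promise<unknown>, TReturn, any>
): Promise<TReturn> {
    return new Promise<TReturn>((resolve, reject) => {
        const gen = genFn();

        function step(input?: unknown): void {
            let result: IteratorResult<Promise<unknown>, TReturn>;
            try {
                // Nothing here ties input to the promise yielded by the
                // previous step; the compiler accepts any value.
                result = gen.next(input);
            } catch (err) {
                reject(err);
                return;
            }
            if (result.done) {
                resolve(result.value); // TReturn
                return;
            }
            // Wait for the yielded promise, then feed its value back in.
            result.value.then(step, reject);
        }

        step(); // the first next() call carries no value
    });
}

// Usage: the generator is annotated explicitly in this sketch.
const total = run(function* (): Generator<Promise<number>, number, number> {
    const a = yield Promise.resolve(40);
    const b = yield Promise.resolve(2);
    return a + b;
});
```

If step passed back something other than the resolved value, the generator body would still type-check, which is exactly the unchecked relationship under discussion.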

That in no way takes away from the usefulness of TYieldMapping for checking the yield expressions statically.

your presumed relationship between the right hand side and the left hand side is only correct insofar as the caller of next honors it. And there is no checking to make sure that the caller of next honors that relationship.

The caller of next() is presumably a widely used library, which is also the provider of the TYieldMapping used to contextually type generators passed to its own API. In most cases, as with co, it's probably not even written in TypeScript. We are just talking about having an accurate .d.ts for its API.

You're basically saying that the possibility of something not honouring its .d.ts file at runtime is unacceptable. Totally right, but if TypeScript was based on this logic, it would suddenly disappear like ES4! TypeScript can't enforce the promises library authors make, it just assumes they honour them. I think the point is to help library users get accurate type inference.

So you are right, I'm not offering any way to check what gets passed in to next(). But that's not out of carelessness. It's just not statically checkable.

I agree with this. I realize that this relates types that are spread across two consecutive calls to next. And that it's not possible.

That in no way takes away from the usefulness of TYieldMapping for checking the yield expressions statically.

This I am not sure about. Personally I feel that without the checking on the consumer side, it is very hard to justify a fancy mechanic to relate the two sides of the yield expression. I'm not saying it has no value whatsoever. I'm just saying that without consumer-side checking, I don't think it's valuable enough to create a new type system mechanic that doesn't really resemble anything else we have.

The caller of next() is presumably a widely used library, which is also the provider of the TYieldMapping used to contextually type its own API.

Not so. In TypeScript, a caller can provide contextual typing for the return expressions inside a callback. But the contextual type does not actually serve to provide a type annotation on the callback. And in order for the caller to provide the mapper, it would have to be inherited from the contextual type in that fashion. To do that, we'd have to make a separate rule that contextual typing works differently for generators than it does for regular functions. I do not think this is a natural rule.

I see your point that in this pattern, the caller is the library, and the consumer provides the generator as a callback. I think your argument makes sense if the caller of the generator is implemented in Javascript, and then documented in .d.ts. But it does not scale to a generator consumer written in TypeScript. That may not be important for your scenario because you've likely already written co.wrap in Javascript, and you just have to document it in .d.ts. I understand that in your case, it is a perfect fit. But it just seems like the utility of the mapper is so specific to this particular use case. I think that in order to seriously consider it, we'd need to feel that the mapper would benefit more use cases in general.

@yortus the other thing is that what you're proposing only supports some async libraries that use generators and not all libraries that use generators, for example the library I am creating does not behave like that.
What you want to do is add a type/way of type checking into Typescript that supports one style of using generators and gimps library authors that have designed another way of using generators for async, even though neither of those are behaving the same way Typescript does!

That's like Typescript having built-in support for type checking jQuery; in my opinion that prevents other good libraries from gaining popularity.

As @Griffork said earlier, 'when I was first looking up generators, the amount of threads/blogs/posts I found that wanted to use it in a promise fashion vs an iterable was about 10:1.'.

And most of the libraries involved are not written in TypeScript, so as you say it's a case of documenting the API in a .d.ts file.

So I think 'we'd need to feel that the mapper would benefit more use cases in general.' is missing the forest for the trees. If @Griffork's informal survey has any merit (and I must say I have to agree in the observation that async is the biggest practical use case out there), then we are talking about the mapper being relevant in the majority of use cases. Is that not enough?

Let's say the current proposal goes ahead without the mapper. Here's what I predict. People start using generators in TypeScript. Then one by one, issues come in all about 'Why is nothing typed inside my generator? Why do I have to annotate everything? What is the point of TypeScript if it can't infer any types? Why can't I provide accurate typing for this library? All these inferences can be described statically so why can't TypeScript do it! yadda yadda.'

Just sayin'

@Griffork can you describe what your library does with generators and how it would be gimped?

Also, nothing is proposed to be 'built-in', and no 'one style' of generators is favoured. If anything, the current proposal favours iteration over async.

I model C# style async functions, where the returned value (to yield) is typically the TReturn of another generator (regardless of how many yield calls that generator has in the middle). The call to that generator does not have to be on the right side of yield (since the decision to wait is done by the calling function, not by the yield's return value).
It also has (pain-free) support for yielding for traditional callback-driven async code.

It's just a mock up I've done of a library I'm going to make; it's not the final version of that library.

As in:

await(async(generator)) ;
yield;

(Typically async is called at the generator's constructor time like co is)

The library is designed to support concurrent requests (e.g. multiple XMLHTTPRequest or multiple await(async(generator))).

As you can see, my library would get no typing, while your library would. Making yours better and mine not competitive.

I think it is not unreasonable to let the user give a type annotation for their yield expressions. But that is the most we can do. I don't think it makes sense to build a part of the type system that presumes such a fancy relationship, even though this may make certain select cases type better. I'm not convinced it's of general utility.

I think it is not unreasonable to provide the user the option to give a type annotation for their yield expressions. But that is the most we can do. I don't think it makes sense to build a part of the type system that presumes such a fancy relationship

@JsonFreeman that's a strawman. The mapper is exactly what you just described as reasonable (an 'option to give a type annotation for their yield expressions'), and nowhere is any fancy relationship presumed. The default case is (expr?: any) => any which covers all possible yield cases without presumption.

@Griffork I understand. I'm working on something similar myself ;) So how do you think your way of using generators would be gimped if an optional yield mapper was available?

@Griffork

As you can see, my library would get no typing, while your library would. Making yours better and mine not competitive.

Can you elaborate? Why would it get no typing exactly? Example maybe?

my library would get no typing, while your library would. Making yours better and mine not competitive.

Typescript would not be modelling the language, but modelling the "preferred use". And no one would be able to compete against the "preferred use". The draw to Typescript for me (over coffee script and others) is that they made no assumptions about how you're going to use the language, everything is 'fair game' (which is why I can modify Array.prototype).

Sure:

var (val1,val2) = yield await(asyncgen1), await(asyncgen2);

val1 and val2 would have no typing, so your library is by default better than mine.

*square brackets, I couldn't remember the destructuring syntax

@Griffork when you say 'your library', what library are you talking about? And what does 'preferred use' mean?

val1 and val2 would get no typing under the current proposal anyway. And your example could never be statically typed due to its API design. That's not a reason to gimp all libraries equally.

It could (with some very library/use-specific code, like the code you're suggesting, which is also library/use specific). 'My library' is a library that I'm working on (a library in the same fashion co is a library).
"preferred use" is the use case that the language specifically supports either better than the alternatives or absent of alternatives.

@Griffork I haven't proposed any preferred use, so I still don't follow your meaning here. Can you elaborate?

The only code I mentioned that is library/use specific, like CoYieldable, would reside in that library or its .d.ts file. TypeScript would merely provide a mechanism for optionally typing the way yield behaves in particular library-defined scenarios. That would be useful for typing many libraries that work with generators. Even yours if you have an API that can be statically modelled.

TypeScript lets you optionally type all kinds of other things, if you can provide a static description of them. That doesn't make all these typings somehow 'preferred'. They reside in the libraries where they belong.

@yortus Typescript lets you type Javascript. It only (so far) supplies semantics for describing how raw javascript works, not for describing how libraries work.
What you're asking for is a) not required for the first implementation of generators (thus @JsonFreeman telling you to create a new thread for this) and b) only describes one use of generators, without supporting any other uses of generators (e.g. what I described).

All of these libraries that do async that are common now (e.g. angular, co, etc.) currently all use promises, but that doesn't mean in the next year promises are going to be the most common way of doing async with generators, we don't know that yet.
What you would do is practically prevent other ways of using generators from being able to emerge, due to Typescript having built in support for promises (only), and no other ways.

Why, if you're going to support using generators with promises, can't I also insist that the Typescript team support how I'm going to use generators (because yes, it is possible, just not very feasible and not useful for more than that one way of using generators).

My initial points for async (which you quoted out of context) were in the context that async would be just as common (not more so) than iterators, and they should be supported equally. If you're going to argue support for a specific way of doing async, then I argue for support for any other way of doing async.
Which will (imo) end up with an over-the-top, bloated, unmaintainable typing system.

After some discussion, here is the current plan:

  • For now, yield expressions will be required to have a common supertype. For additional discussion, let's use issue #921
  • For now, return expressions will be allowed, but ignored. When we have boolean literal types, we will track the return expression's type correctly, instead of hacking it temporarily. This is outlined in #2983
  • Yield expressions themselves will have type any, but we will also revisit this when we do the return expressions after boolean literal types. My plan is that we will allow the user to provide a parameter type for next, and all the yield expressions have to be that type.
  • I will temporarily remove the Generator type from es6.d.ts, as we will add a better one when we have boolean literal types.

As a result, I will add good support for generators as iterables for now. This is because it is possible to support that use case well now, whereas supporting the async use case should be built on top of boolean literals. Async use cases will still be possible, just not strongly typed. After boolean literals, we can better support async use cases.

@JsonFreeman I'm curious, what was the reasoning behind ignoring return types vs using option 5 (special hack)?

@JsonFreeman sounds like a good start. 'For now, yield expressions will be required to have a common supertype'. This means that for any of the async examples I've given in this thread to compile, they will have to be explicitly annotated with TYield=any, since their yield operands don't have a common supertype. Is that right? And all the yield expressions will have to be explicitly typed too for now. And TReturn too. Basically everything.

@Griffork TypeScript would know nothing about promises under my proposal. Not sure why you think that. I wholeheartedly agree with your point, but it just doesn't apply to the technique I proposed.

The reasoning behind not doing the hack (option 5) is that we have a better long term solution that is not a hack. Doing the hack would give us some value in the short term and none in the long term. And while I think tracking the return type is important, I do not think it is urgent enough to warrant the hack that we will later remove.

For the common type issue, yes you must provide any or the union type that you're interested in. Again, issue #921 is relevant here. Btw, if your generator is contextually typed, then we do infer the union type, so if you pass it directly to co.wrap, you probably should be fine not supplying the type.

Yield expressions themselves will have to be explicitly typed inline for now, yes.