jashkenas/coffeescript

No way to do ES6 'for .. of' loops for generators?

Closed this issue · 67 comments

Given a generator

gen = ->
  j = 0
  while j < 9
    yield j++
  return

how do we loop through it via the javascript for .. of loops? In javascript, we'd do

for (i of gen()) {
  console.log(i);
}

The coffeescript for i in gen() gives the C-style for (_i = 0; _i < gen().length; _i++) loops, and for i of gen() gives for (i in gen()). Is there some way to get for (i of gen()) that I didn't see in the documentation?

I came to ask the same question. Having to manually run next and check for done is quite counter to the syntax-goodness that CS normally prides itself on.
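For reference, this is roughly what "manually run next and check for done" amounts to in plain JavaScript, next to the for...of loop it stands in for. It is only a sketch: collectManually and collectWithForOf are names made up for illustration, and gen is a JavaScript transcription of the generator from the original post.

```javascript
// Generator equivalent to the CoffeeScript `gen` from the original post.
function* gen() {
  let j = 0;
  while (j < 9) {
    yield j++;
  }
}

// Manual iteration: call next() until the result reports done.
function collectManually(iterator) {
  const values = [];
  let step = iterator.next();
  while (!step.done) {
    values.push(step.value);
    step = iterator.next();
  }
  return values;
}

// The ES6 loop the thread is asking for.
function collectWithForOf(iterable) {
  const values = [];
  for (const value of iterable) {
    values.push(value);
  }
  return values;
}

console.log(collectManually(gen()));  // [0, 1, 2, 3, 4, 5, 6, 7, 8]
console.log(collectWithForOf(gen())); // same values
```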

Shoot. That's a major bummer.

@alubbe et al., any ideas here? Do all of the runtimes that support generators also support ES2015's for of?

@jashkenas to my knowledge only chrome, firefox and opera support generators and all three support for .. of - so the answer would seem to be yes.

Syntax wise, this will be a bit of a challenge. for all .. of/in? each ... of/in?

We might have to add a flag for this to compile CoffeeScript for-in to ES6 for-of. I don't see any better way.

@michaelficarra That sounds problematic -- a given script could either loop through generators, or arrays.

Node 0.11 and iojs also support for of.

I'd much rather have a special syntax, like for gen x of myGen().

@jagill: ES6 for-of works with arrays because they are iterables.

I think @michaelficarra's suggestion is the most sensible approach.

I personally would prefer releasing a backward incompatible version of CS (2.0, 3.0, what have you) that aligned for in and for of semantics with JS and have a conversion tool that can be run reliably and transform CS code from one version to the other. After all, it's just a deterministic syntax transformation.

I'm not in favour of adding yet another looping construct to CS.

if they're supposed to be used by generators, can't we use something like the currently-invalid from x from y()?

@vendethiel did you mean for x from y()? I like that.

A current workaround would be something like:

forOf = (gen, fn) ->
  `for (value of gen()) fn(value)`

forOf gen, (i) -> console.log i

Yes, sorry.

I dislike creating type-specific loop declarations.
The very power of for ... of lies in looping through all iterables: arrays, (invoked) generators, sets, maps and, interestingly, arguments.
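To illustrate the point about iterables, here is a small JavaScript sketch that feeds several different kinds of value through the same for...of loop. The helpers toList, countTo and argsToList are hypothetical names used only for this example.

```javascript
// Collect whatever for...of produces from any iterable.
function toList(iterable) {
  const out = [];
  for (const item of iterable) {
    out.push(item);
  }
  return out;
}

function* countTo(n) {
  for (let i = 1; i <= n; i++) yield i;
}

function argsToList() {
  // `arguments` is array-like and, in ES2015, also iterable.
  return toList(arguments);
}

console.log(toList([1, 2, 3]));            // [1, 2, 3]
console.log(toList(countTo(3)));           // [1, 2, 3]
console.log(toList(new Set([1, 1, 2])));   // [1, 2]
console.log(toList(new Map([["a", 1]])));  // [["a", 1]]
console.log(argsToList("x", "y"));         // ["x", "y"]
```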

So fundamentally, there are two routes and we'll need @jashkenas to give us a pointer:

  1. Backwards-compatible: will need a new syntax like each ... of
  2. Breaking compatibility: fundamentally rethinking how for ... in and for ... of work in coffee-script. @michaelficarra and @epidemian have offered two ways this could be done

Couldn’t the same syntax be used? If the context of the for is a generator, it compiles to for of; otherwise we compile using the current strategy. We can assume the for of feature is available if users use yield, and the required information about whether the context is a generator should also be available?

@ssboisen, the problem is that we can't determine the context of the for at compile time (imagine it comes from a function parameter), so we can't choose the right way to compile. A runtime check would carry a significant performance hit.
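A sketch of why the compiler cannot pick the loop form by itself: inside the function it only sees a parameter name, and the two possible translations behave differently depending on what the caller passes. The function names are invented for this example, and the indexed loop is only an approximation of what CoffeeScript emits for for x in items.

```javascript
// Roughly what CoffeeScript's `for x in items` compiles to today.
function sumIndexed(items) {
  let total = 0;
  for (let i = 0, len = items.length; i < len; i++) {
    total += items[i];
  }
  return total;
}

// What an ES6 for...of translation would do instead.
function sumIterated(items) {
  let total = 0;
  for (const x of items) {
    total += x;
  }
  return total;
}

function* oneTwoThree() {
  yield 1;
  yield 2;
  yield 3;
}

console.log(sumIndexed([1, 2, 3]));      // 6
console.log(sumIndexed(oneTwoThree()));  // 0: a generator object has no length
console.log(sumIterated([1, 2, 3]));     // 6
console.log(sumIterated(oneTwoThree())); // 6
```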

@jashkenas, I'd vote for backward incompatible CS 2.0 (according to semver!)
If we introduce a new looping construct (to maintain backward compatibility), it will eventually replace the original in loop in everyday usage, and we will have an obsolete syntax: why use in if we can use ??? for looping over iterables polymorphically and cheaply?

Breaking changes would also allow introducing new keywords for existing problems (I mean an await prefix operator for #3813 and #3757).

why use in if we can use ??? for looping over iterables polymorphically and cheaply?

Ha, ha, suckers! 😉

Never assume that a new feature in JS is going to be "cheap" or smart to use, not when it's freshly implemented, and probably not even after many years of use.

Check this out. Run it yourself.

http://jsperf.com/for-in-vs-for-of

nice! that is some next level performance if I've ever seen one.

@jashkenas can you help me interpret whether your response means that you are leaning towards a new third syntax or changing CS's loops? :)

I don't know of a great solution just yet — but I haven't really had a chance to sit down and give it a think. There are, however, ground rules:

  • We're not adding flags to CoffeeScript.
  • It would be better to revert yield and remove generators for now, than to have to add a flag.
  • We're not going to hurt regular array iteration performance to support this new feature.
  • It's undesirable (but perhaps not unavoidable) to introduce a new looping form.
  • Since for of performance is so shitty — maybe we can compile down to something else and still beat it handily?
  • I wish we had considered this wrinkle before merging the yield PR ;(

Regarding 2: please please please no! for ... of is a feature independent of generators and should have no impact on whether we keep this new awesome feature around or not. It is a loop over iterables, and an instantiated generator just happens to be iterable.
Regarding 6: (see 2) Also, it's awesome because finally more people are joining the discussion, reporting bugs and actually writing better code than before.

Here is more information on what for ... of does: https://developer.mozilla.org/en/docs/Web/JavaScript/Guide/The_Iterator_protocol

Basically, it converts strings, arrays, maps, etc. into a generator, instantiates it and continually calls .next() on it. I assume that is where all of the performance is lost.

Looking at it, I think CS could go without for ... of for the time being. It's quite slow and, if you really ever need for x of y, you can use

_ref = y[Symbol.iterator]()
while _next = _ref.next(), val = _next.value, !_next.done
  ...

That's sort of the kind of thing I'm talking about ... Is a direct:

_ref = y[Symbol.iterator]()
while x = _ref.next()
  ...

... a lot faster than a for of call? If it is — should we have a construct or a looping hint that compiles into it?
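One caveat on the shorter loop: under the iterator protocol, next() always returns a result object, which is truthy, so a loop conditioned on the result alone would not stop by itself. The JavaScript sketch below shows this and a variant that checks done; twoValues and drain are hypothetical names.

```javascript
function* twoValues() {
  yield "a";
  yield "b";
}

const it = twoValues();
const results = [it.next(), it.next(), it.next(), it.next()];

// Every result is an object, so `while (x = it.next())` never sees a falsy value.
console.log(results.map(r => Boolean(r)));  // [true, true, true, true]
console.log(results.map(r => r.done));      // [false, false, true, true]

// A terminating loop has to look at `done`.
function drain(iterator) {
  const values = [];
  let step;
  while (!(step = iterator.next()).done) {
    values.push(step.value);
  }
  return values;
}

console.log(drain(twoValues()));  // ["a", "b"]
```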

@jashkenas you don't want "for x from y" as a third syntax then?

Apropos of nothing — what the hell is up with the:

generator = array[Symbol.iterator]()

... bullshit? Are we trying to pretend like we're not in a dynamic language that embraces duck typing any more? On what planet would that be preferable to:

generator = array.iterator()

Maybe for ES2019, we'll need a new third type of strings, after strings and symbols, to avoid name clashes again, so that we can have:

newFancyIterator = array[Symbol2019.iterator]()

</grump>

One of the things I've liked about CoffeeScript is that it allows me to do anything I could do in Javascript, modulo a couple bad constructs. Assuming we don't think the for of loop for generators is a bad construct, IMO CS should support it (or an equivalent), at least someday. It also looks like there's no possibility any time soon to use the same syntax for normal array loops and for of. So I don't see an alternative to a third syntax.

That being said, I have less absolute feelings about when it happens. For my own CS use (since iojs and "soon" Node 0.12 support generators), I'd prefer it to be sooner rather than later.

Apropos of nothing — what the hell is up with the:

generator = array[Symbol.iterator]()

... bullshit? Are we trying to pretend like we're not in a dynamic language that embraces duck typing any more

Yeah, it's lamentable. But the philosophy of never breaking backwards compatibility, nor breaking existing applications, plus the common practice of extending native objects, makes it impossible to extend the language in a sane way like adding a normal iterator method to arrays.

Hell, not even something as simple as Array#contains can be added to the standard library because some library monkey-patched it and there are tons of applications relying on it being something else by now.

I think for x from gen (or another syntax) is the only way that really makes sense here, unfortunately. Otherwise we basically have to either know the type in the compiler (not currently possible), or do a runtime check to see if it's an object or iterator, which will be slow.

ES5 has two for loops. So does CS. CS for-in is ES5 for with shorter syntax. CS for-of is ES5 for-in. ES6 added a third for loop. Why can't CS too? CS for-from would be ES6 for-of.

But wouldn't it just be super tragic for:

CoffeeScript    ES6
for-in          for
for-of          for-in
for-from        for-of

... especially when, IIRC, ES6's for-of was partially a syntax "cowpaved path" borrowed from CoffeeScript?

It's even more tragic when you consider that in a perfect language, you wouldn't have three different syntaxes for these loops — you would only have one that handled arrays, objects and iterables.

Yeah, it sucks and there should definitely be one loop type ideally. But unless you want to implement type-inference in the compiler (hard), or type checks at runtime (slow), we're kinda stuck right?

My two cents: What about adding a keyword to for...of to let the interpreter know it's a generator?

for i of yielded gen()
  code...
- or -
for i in yielded gen()
  code...

One of those would compile to for (i of gen()) { code... }

Cheers!

my idea:

gen = () ->
  index = 0

  while index++ < 10
    yield index

a = gen()

for val next a
  console.log val

because an object is an iterator when it implements a next() method

@lydell

forOf = (gen, fn) ->
  `for(value of gen()) fn(value)`
  undefined

otherwise

    return for(value of gen()) fn(value);
           ^^^
SyntaxError: Unexpected token for

Haha yep for-of performance sucks. It's purposely bailed out in V8, so its performance is bad.
But we have to bear in mind it's still a draft spec and things could change. Hopefully it will be much better when the first spec is finalised.

I think if there is a new loop syntax introduced it should be pronounceable in English.

+1 @alubbe re: yield being a separate and awesome feature, which I've been using like crazy since it became available, without ever needing iterator support. (Thank you.)

Key point: people keep introducing generator-specific syntax, when the issue is iterator support. generators just happen to be iterators. that's the only connection. We want iterator support here.

Since current for ... in performance is far better than using iterators, we're not going to slow it down to check for an iterable, and we don't want a third syntax, that means this isn't a coffeescript feature for the foreseeable future, right?

FWIW, you could generate a cheap runtime check to see if you're dealing with an iterator by checking for length or next, since these must be defined anyway. This does not appear to be appreciably slower than the current compiled code, at least on modern browsers:

http://jsperf.com/for-in-vs-for-of/6

You could always do something like:

// ES6 source:
for (var i of nums) {}

// could be compiled to:
for (var _iterator = nums, _isArray = Array.isArray(_iterator), _i = 0, _iterator = _isArray ? _iterator : _iterator[Symbol.iterator]();;) {
  var _ref;

  if (_isArray) {
    if (_i >= _iterator.length) break;
    _ref = _iterator[_i++];
  } else {
    _i = _iterator.next();
    if (_i.done) break;
    _ref = _i.value;
  }

  var i = _ref;
}

Produces much more code but it puts arrays in the fast-path while also supporting all iterables.

@sebmck , your solution must run an if branch statement every step of the loop.

how about this CoffeeScript:

for e of iterator
  <code>

compiles into this ES6 javascript:

if (Array.isArray(iterator)) {
  for (var i = 0, len = iterator.length; i < len; i++) {
    var e = iterator[i];
    <code>
  }
} else {
  for (var e of iterator) {
    <code>
  }
}

Pros:

  • easy to implement
  • easy to transition once browsers have VM optimization on Array iterator loops.
  • until the optimization is made:
    • the extra js being generated is not that expensive when it is gzipped

Cons:

  • until the optimization is made:
    • code in for loops will be generated twice
    • 1 isArray method call is run at runtime at the beginning of each loop.

CoffeeScript for loops with 2 variables:

for key, value of object
  <code>

can remain the same.

@MetaMemoryT Yes and the overhead of that if is extremely minimal. Copying the entire loop body is not at all practical.

So what's the current best way to step through a generator in CS?

Actually, I think a function called foreach will be enough. since es6 for..of is already slow enough :)
ie. you can enumerate a range like this:

foreach range(5), (x) ->
    console.log x

this foreach is implemented here and you can try it here
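The linked implementation is not reproduced in this thread, so the following is only a guess at the general shape of such a helper, written in JavaScript. Both foreach and range here are hypothetical and may differ from the linked code.

```javascript
// Hypothetical helper: yields 0, 1, ..., n - 1.
function* range(n) {
  for (let i = 0; i < n; i++) yield i;
}

// Hypothetical helper: calls fn for each value produced by an iterable.
function foreach(iterable, fn) {
  const iterator = iterable[Symbol.iterator]();
  let step;
  while (!(step = iterator.next()).done) {
    fn(step.value);
  }
}

const seen = [];
foreach(range(5), x => seen.push(x));
console.log(seen);  // [0, 1, 2, 3, 4]
```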

@MetaMemoryT +1. I like this, especially since we don't need the check when the key, value of obj syntax is used.

Only in the case of item of mysteryThing do we need to check if we are dealing with an obj, array or generator. My 2 cents is that the syntax of key of obj be deprecated but supported for a little while before the behavior is made consistent with ES6. Besides, the current key of obj reverting to key in obj is kind of strange anyway, in my opinion.

Some further thoughts:

In ES2015 I understand there are at least three ways of iterating over a generator (see https://leanpub.com/exploring-es6/read, 21.3.1):

  • for ... of (broken in coffeescript)
  • let [x,y] = gen() (destructuring, doesn't seem to work in coffeescript right now)
  • let arr = [...genFunc()]; (spread operator, doesn't seem to work in coffeescript right now)
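The three ES2015 forms listed above can be put side by side in plain JavaScript. This is a minimal sketch; the generator body is made up for illustration.

```javascript
function* genFunc() {
  yield "a";
  yield "b";
  yield "c";
}

// 1. for...of
const viaLoop = [];
for (const x of genFunc()) {
  viaLoop.push(x);
}

// 2. Destructuring: pulls only as many values as there are targets.
const [first, second] = genFunc();

// 3. Spread: drains the iterator into an array.
const viaSpread = [...genFunc()];

console.log(viaLoop);        // ["a", "b", "c"]
console.log(first, second);  // a b
console.log(viaSpread);      // ["a", "b", "c"]
```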

What is the idea to support in coffeescript?

Since I anyhow have to call coffee --nodejs --harmony_generators test.coffee because ES5 doesn't support function*, why not all of them since the underlying transpiled javascript is anyhow not ES5?

I guess the problem is to distinguish between a generator object and a "normal" object for for...of and the assignment operator = to determine whether an ES5 or an ES6 construct is necessary.

Maybe typeof obj[Symbol.iterator] == "function" (again taken from https://leanpub.com/exploring-es6/read) could help? To take the example of for...of:

if (typeof obj[Symbol.iterator] == "function") {
    // use ES6 for...of
} else {
    // use ES5 for...in
}

let [x,y] = gen() (destructuring, doesn't seem to work in coffeescript right now)

Are you saying that isn’t equivalent to the following?

let tmp = gen()
let x = tmp[0]
let y = tmp[1]

no, I mean destructuring of generators. While it should work in ES6/ES2015, in coffeescript

squares = ->
    num = 0
    while num < 2
        num += 1
        yield num * num
    return

[x, y] = squares()
console.log x, y

returns undefined undefined

$ coffee -bpe '[x, y] = squares()'
var ref, x, y;

ref = squares(), x = ref[0], y = ref[1];

So you mean that the above is equivalent to var [x, y] = squares() but still does not work as intended?

maybe I miss something, so here is my terminal output

$ cat test2.coffee
squares = ->
  num = 0
  while num < 2
    num += 1
    yield num * num
  return

[x, y] = squares()
console.log x, y


$ coffee --nodejs --harmony_generators test2.coffee
undefined undefined


$ coffee -bp test2.coffee
var ref, squares, x, y;

squares = function*() {
  var num;
  num = 0;
  while (num < 2) {
    num += 1;
    (yield num * num);
  }
};

ref = squares(), x = ref[0], y = ref[1];

console.log(x, y);

I would not have expected undefined undefined

please see also #4018 for some surprising generator behaviour (at least for me)

Yes, @lydell, it is different. It does not use indexing internally, but instead calls next on the iterable repeatedly.
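The difference can be seen directly in JavaScript. In this sketch, squares is a transcription of the CoffeeScript generator above; index access on the generator object finds nothing, while ES2015 destructuring pulls values through next().

```javascript
function* squares() {
  let num = 0;
  while (num < 2) {
    num += 1;
    yield num * num;
  }
}

// Index-based access, as in the compiled `ref[0]`, `ref[1]`:
const ref = squares();
const byIndex = [ref[0], ref[1]];

// ES2015 destructuring, which goes through the iterator protocol:
const [x, y] = squares();

console.log(byIndex);  // [undefined, undefined]
console.log(x, y);     // 1 4
```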

@michaelficarra Thanks for explaining!

@michaelficarra So the only way to access elements of a generator in CoffeeScript right now is to iterate over it using .next(), e.g.

iterator = squares()
until ((it = iterator.next()).done)
  doSomething it.value

No for loops and no destructuring, right?

@bernhard-42 As of right now, yes.

igl commented

Bad performance is not specific to the for of feature. All es6 features are not optimized and v8 will just bail out to the interpreter. Afaik it does that on functions including a try-catch too. ES5 features also took a long time to be optimized by V8.

Therefore the performance issue is less relevant than generally thinking about how breaking changes will be introduced into coffee in the future. TC39 is just starting and es7 and es8 are not too far off... They will certainly not stop advancing the language now like they did 15 years ago and if coffee wants to stay relevant it has to move on too. Hack or Break(+1)?

I tend to think the optimal syntax for using iterators/generators would be for ... using, since it just sounds right and suggests the iterated object carries special meaning (e.g., for n using fibonacci()), but that would introduce a new keyword. The next best thing would have to be for ... with, since the with keyword is reserved in JavaScript and is not used in CoffeeScript at all.

Neither of these syntax suggestions would interfere with legacy CoffeeScript code either.

This is all assuming you want to keep iterator syntax separate from array (for ... in) and object (for ... of) iteration syntax. It would seem to me to be a good idea to do so, as long as CoffeeScript is maintaining ES5 compatibility, so as not to require weird conditionals at every for ... of block to figure out what the nature of the object being iterated is.

Plus, ES5 could take advantage of iterators too; so long as the object being used follows the iterator protocol, a structure not unlike what was suggested by @bernhard-42 could be used to run through it.
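As a sketch of that idea, the following uses only ES5 features (no function*, no Symbol): a hand-written object with a next() method, consumed by a while loop of the kind suggested earlier in the thread. The names fibonacci and collect are hypothetical.

```javascript
// An ES5-compatible object following the iterator protocol.
function fibonacci(limit) {
  var a = 0, b = 1;
  return {
    next: function () {
      if (a > limit) {
        return { done: true, value: undefined };
      }
      var value = a;
      var sum = a + b;
      a = b;
      b = sum;
      return { done: false, value: value };
    }
  };
}

// The kind of loop a dedicated iterator syntax could compile to for ES5 targets.
function collect(iterator) {
  var values = [];
  var step;
  while (!(step = iterator.next()).done) {
    values.push(step.value);
  }
  return values;
}

console.log(collect(fibonacci(20)));  // [0, 1, 1, 2, 3, 5, 8, 13]
```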

Unfortunately for standard arrays in the current nodejs, iojs, Chrome and Firefox typeof arr[Symbol.iterator] is also "function". The performance of for of for standard arrays is really different across all platforms - from poor to good:

                                  for(;;)  while()  for of  iterator via while
nodejs 0.12.5                         4       4      363         367
iojs v2.3.1                           5       4       93          64
Firefox 38.0.5                      160     154       59         255
Chrome 43.0.2357.130 (64-bit) Mac   306     325      125         694

(see https://gist.github.com/bernhard-42/27836f4ce719de6bee3e)

For the nodejs world I would really not like to see ES6 for...of used for standard arrays for the time being. The penalty is just too big. Any conditional around for...of needs to take care to omit arrays, e.g.

if (!Array.isArray(ref) && (typeof Symbol != "undefined") && (typeof ref[Symbol.iterator] == "function"))

From a performance perspective not critical, but ugly ...
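Wrapped in a function, the condition above behaves as follows. This is a sketch: shouldUseForOf is a hypothetical name, and the extra ref != null guard is an addition not present in the original condition, included so that null and undefined do not throw.

```javascript
// Hypothetical guard: true only for non-array values that expose Symbol.iterator.
function shouldUseForOf(ref) {
  return !Array.isArray(ref) &&
    typeof Symbol !== "undefined" &&
    ref != null && // added guard, not in the original condition
    typeof ref[Symbol.iterator] === "function";
}

function* gen() {
  yield 1;
}

console.log(shouldUseForOf([1, 2, 3]));     // false: arrays keep the indexed loop
console.log(shouldUseForOf(gen()));         // true
console.log(shouldUseForOf(new Set([1])));  // true
console.log(shouldUseForOf({ a: 1 }));      // false: plain objects are not iterable
console.log(shouldUseForOf("abc"));         // true: strings are iterable too
```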

fa7ad commented

I'm probably not qualified enough to be anpart of this discussion but here's my 2 cents

What if we had a special shebang-esque comment to put CS in ES2015 mode?

Something like
#! es2k15

What I'm proposing is everything by default compiles to ES5 (discount the es6 features already implemented) but when the shebang-thing is present, things like for..of are compiled assuming ES2015 (compiled to for..of rather than a es5 for..in).

This, I think, would not break backwards compatibility and would allow the es2015 users to use the parts they love about es2015 in CS without the need for new keywords, etc.

fa7ad commented

*anpart = a part
(Typo)

Bikeshedding here, but what about this (no new keyword):

for yield value in squares()
  alert value

@nilskp I don't think that would work, because that's already a valid expression for a function returning an array. You can't assume that it's returning an iterator.

@dyoder is it valid syntax? I get Error on line 1: unexpected yield.

Coffeescript 2.0 should break backward compatibility and compile to ES6 & ES7 or become itself "backwards" from an era progressively ever behind us.

@nilskp I stand corrected. :)

My brain parsed that as an expression and moved right along. But of course that's an assignment so an expression isn't valid. So what you're suggesting is basically a for yield construct?

My only concern with that is semantic. Iterators aren't related to yield, which is associated with Generators. Generators, of course, are Iterable, but so are arrays and so forth.

@dyoder My proposal was for generators specifically, which was the original subject. I haven't thought about Iterators in general.

Maybe:

for next x in generator()
  console.log x

I'm not sure anyone here wants a new keyword, but I'd prefer this over for yield x... since it makes it very obvious we're going by the generated next value and not yielding something.

Considering that generators in Coffeescript are defined in the same way regular functions are: It seems like there's an effort to make creating and using generators as transparent as possible. Iterating over a generated iterator is done in an each()-like way, so it might make sense to have the identical syntax:

for x in generator()
    console.log x

Coffeescript could reference a utility function to detect if it should iterate by .length or by .done. Afaik in ES6 arrays have a defined iterator anyway. ([Symbol.iterator])

eachIter = (a, f) -> (isArray(a) and walkArray or walkIter) a, f

walkArray and walkIter would do what you expect. It'd be important to separate it into 2 different code paths so you're not testing whether to treat the 'array-like' as an array or an iterator on each iteration. (performance concern?)
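One possible reading of that helper, in JavaScript. The bodies of walkArray and walkIter are assumptions about "what you expect"; the point of the sketch is that the array test runs once per loop rather than once per element.

```javascript
// Indexed walk: the fast path for arrays.
function walkArray(arr, fn) {
  for (let i = 0, len = arr.length; i < len; i++) {
    fn(arr[i]);
  }
}

// Protocol walk: for anything exposing Symbol.iterator.
function walkIter(iterable, fn) {
  const iterator = iterable[Symbol.iterator]();
  let step;
  while (!(step = iterator.next()).done) {
    fn(step.value);
  }
}

// Decide once, outside the loop, which walker to use.
function eachIter(a, fn) {
  (Array.isArray(a) ? walkArray : walkIter)(a, fn);
}

function* generator() {
  yield 1;
  yield 2;
}

eachIter([10, 20], x => console.log(x));     // logs 10, then 20
eachIter(generator(), x => console.log(x));  // logs 1, then 2
```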

I'd like to voice support for for...from. I thought of this independently the other day and saw it way up the thread. There is already precedent for CoffeeScript to deviate from standard loop names and improve them.

Javascript had for (var key in obj). "For key in object" doesn't even make semantic sense compared to CoffeeScript's replacement: for key, val of obj. It does exactly what one would expect and "for key/value pair of object" makes way more sense.

Then, using for val in arr, completely altering the native Javascript meaning of for...in, was also an improvement. "For value in array". Or optionally, for val, idx in arr, also read as "for value and index in array". Both these read way better semantically. Finally, CoffeeScript exposes for key of obj, which is actually full circle back to the only original native Javascript possibility: for (var key in obj).

So basically for...in and for...of were added to CoffeeScript to completely supersede the sad version of for...in that Javascript supported, even with completely disparate syntax.

Now Javascript finally has for...of and they chose to make it not even support proper enumeration of objects but just iterators. It's perfectly in line with previous decisions made by the CoffeeScript team to say "this syntax is garbage and we will make our own".

tl;dr
Just like CoffeeScript's for...in and for...of make semantic sense in use, for...from makes semantic sense for an iterator or generator. You are saying "for every value I take from this iterator". When looping over an array you are taking every value in it. When looping over a collection of key/value pairs you are taking every key and value of it. And when looping over an iterator that yields a new result each time you are taking values from it.

atg commented

I'm totally convinced by the rationale for for ... from. Having to state awareness that the loop may (or may not) empty the thing being looped over is definitely a feature.

I can't tell you the number of times I've accidentally tried to loop over the same generator twice in Python. Of course the second loop never executes because the generator is now empty! Having for...from to annotate which of my loops consciously support generators would be very useful.
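JavaScript generators behave the same way as the Python ones described here, which a short sketch can show: the second loop over the same generator object gets nothing. The generator letters is made up for the example.

```javascript
function* letters() {
  yield "a";
  yield "b";
}

const source = letters();

const firstPass = [];
for (const ch of source) firstPass.push(ch);

const secondPass = [];
for (const ch of source) secondPass.push(ch);

console.log(firstPass);   // ["a", "b"]
console.log(secondPass);  // []: the generator is already exhausted
```

An array, by contrast, hands out a fresh iterator each time it is looped over, so both passes would see every element.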

This has been merged into master per #4355. Anyone up for writing some documentation?

Great stuff! ✨

in a perfect language, you wouldn't have three different syntaxes for these loops — you would only have one

That is still an option. A universal for .. within .. loop that accepts the performance hit and determines the type at runtime?
