jashkenas/coffeescript

Remove Ranges, introduce "from x to y"

satyr opened this issue · 38 comments

tl;dr

Introduce infix to (with optional by) and replace current ranges with it.

intro

The pseudo ranges may look nifty in the first glance, but I think
they've become the bad parts in CoffeeScript.

proposed syntax

i for i in x to y by z
a = [x to y]  # can be by-ed as well
# no slice/splice

instead of

i for i in [x .. y] by z
a = [x .. y]
b[x .. y] = c[x .. y]

why current ranges are bad

Because they

  • behave wildly different in different contexts.
    • array / array comprehension / loop range / slice / splice
  • have awkward syntax.
    • (i for i in [.0 .. .2] by .1)
      • Parses as [(0)...(2)].
      • Looks ugly too.
  • generate inefficient code or are inferior syntax sugars.
    • x = 1; y = 5; [x..y]
      • This is because it has to determine the direction dynamically. to-by would have a fixed step and thus generate better code.
    • Slicing is broken (or less versatile than .slice)
      • '012345'[-3..-1]
    • Splicing is broken (and less useful in general)
      • a = [0..5]; a[1..-2] = [42]
      • Ruby
  • can be exclusive, but this is a confusing feature (especially with the look). We can live without it by simply subtracting from the right operand (since we only have numeric ranges).
  • are confusing in the presence of splatted arrays: [x...,y]

edit: removed cup links that no longer work.

By no means criticising, but here are my thoughts:

behave wildly different in different contexts.

List of 1 to 5 [1..5], expand list to values list..., loop values 1 to 5 for [1...5], slice list values list[1..5]. I've realised it's much more consistent than it first looks.

(i for i in [.0 .. .2] by .1)

So does (i for i in .0 to .2 by .1). Much better to just add those zeros back in.

generate inefficient code or are inferior syntax sugars.

Agreed, but to-by is a new syntax so it can easily be back-ported to be [x..y by z].

Slicing is broken (or less versatile than slice)
Splicing is broken (and less useful in general)

How does to address this?

can be exclusive

I hate remembering which one is which. More dots -- inclusive, right? Wrong.

List of 1 to 5 [1..5], expand list to values list..., loop values 1 to 5 for [1...5], slice list values list[1..5]. I've realised it's much more consistent than it first looks.

The loop range supports by, others don't (the brackets look out of place too). Negative values behave different in slice/splice. a... is not a range.

How does to address this?

By removing them.

I hate remembering which one is which. More dots -- inclusive, right? Wrong.

Exactly.

I kind of like to, apart from removing slicing/splicing. I would hate to see these go away just because negative indices don't play nice.

apart from removing slicing/splicing

Maybe it could stay in the form of a[x to y], but how would you support by with them?

The fact that no splicing is used in the compiler itself (albeit the extensive use of splice in rewriter) is kind of telling its awkwardness I suppose.

I implemented a branch once (long time ago, dark ages) that supported both negative indices and by in slices and splices. It didn't get merged it because it was too awkward to have a[-10:-1:-2] and not to allow for a[-10] alone. Anyhow, I did it using __helpers so it certainly is possible. While CoffeeScript's core is a good measure of what is useful in the language, it certainly isn't the only place we should be looking for. I kind of agree ranged splices are not very common, but ranged slices are very useful, especially when combined in a destructuring assignment like so [apiKey, success] = response[1..]. Don't take this as the only place where they are useful cause it certainly ain't.

Back to the point, let's take out splices then and have a syntax for list[top to bottom].

ranged slices are very useful, especially when combined in a destructuring assignment like so [apiKey, success] = response[1..]

Well of course it's a must-have feature. But since JS's slice is quite strong as is, it'd end up with just an awkward sugar.

[apiKey, success] = response[1 to response.length - 1]
[apiKey, success] = response.slice 1

If we do pursue this direction, I don't think that by should be mandatory... We've already gone back and forth on this issue:

puts i for i in 1 to 10

... should be legit.

(with optional by)

It'd be great to lose the .. vs ... ambiguity; I think inclusivity is a reasonable default. And while slices might look good the way they are, they do behave inconsistently. I don't like the fact that enumerable objects exposing a[b] don't necessarily support a[b..c], since it compiles to a slice call, which is only valid against normal arrays. So +1 for losing that too.

What about having a til operator for exclusive?

akva commented

It'd be great to lose the .. vs ... ambiguity;

This is a great idea.

I think inclusivity is a reasonable default.

I'd prefer it to be exclusive by default, to be consistent with JS's slice. Makes it easier to remember

To me, [3 to 7] looks like it means [3, 4, 5, 6, 7], with the 7 included. Of course, that's only my opinion. I see your point about slice, but since this proposal entails removing the range-as-subscript shortcut for slices, I think it's ok if ranges and slices have different inclusivity rules.

http://github.com/satyr/coffee-script/compare/master...to
1st stab without lexer/rewriter hacks (meaning to is a reserved keyword).

I'd like to merge this soon, and there's one more issue discussed in #coffeescript that should be mentioned here.

I'm not a big fan of our [1..5] literals (with this ticket [1 to 5]), which expand into loop comprehensions that generate the corresponding array at runtime. I use them occasionally, but hardly ever, and would rather get rid of them. If you want to generate an array from a range at runtime, you would write it like this:

array = (i for i in 0 to 100)

Sound reasonable? Anyone using the insta-arrays?

akva commented

+1 for removing [1..5] literals

I would still prefer for i in 0 to 100 to be exclusive though. Besides being consistent with slice it's also consistent with common idiom for (var i = 0; i < a.length; i++)

@akva: I would argue that the idiom of iterating from 0 to a.length - 1 is already well served by for element, i in a. How often do you want an exclusive range when you are not enumerating the elements of a list?

@jashkenas: +1 for losing the [1..5] literals, but why not have (0 to 100) as a shortcut to (i for i in 0 to 100)?

sethaurus: just to have one less piece of rarely-used special syntax.

In the absence of range literals, the for i in 0 to 100 syntax is potentially confusing.
Compare it to the deceptively similar for i in (0 to 100). This would be a syntax error, since 0 to 100 is not a real expression, and cannot be evaluated on its own. There is no underlying range object over which to iterate, but the in keyword seems to suggest that there is. Actually, it's a different kind of iteration, distinct from for-in and for-of. How about for-from? How does this look?
for i from 0 to 100
etc i

range = (start, end, step) ->
    i for i from start to end by step or 1
akva commented

@sethaurus I was actually going to propose this syntax too. I like it better. The only objection is that it introduces a new keyword from which may conflict with existing libraries. I'm not sure about CS's keywords policy.

The only objection is that it introduces a new keyword from which may conflict with existing libraries

This may actually solve the keyword problem. all of for all isn't a keyword because it can only appear after for. We can do similar with from/to.

That sounds like the best of both worlds.

akva commented

@satyr Sounds good. Then you can write
for i from from to to

Nice! Glad we won't have to reserve from and to.

http://github.com/satyr/coffee-script/compare/master...fromto

$ bin/coffee --no-wrap -pe 'i for i from x to y'
var _to, i;
for (i = x, _to = y; i <= _to; ++i) {
  i;
}
$ bin/coffee --no-wrap -pe 'i for i from x to y by -2'
var _to, i;
for (i = x, _to = y; i >= _to; i -= 2) {
  i;
}
$ bin/coffee --no-wrap -pe 'i for i from x to y by z'
var _step, _to, i;
for (i = x, _to = y, _step = z; _step < 0 ? i >= _to : i <= _to; i += _step) {
  i;
}
$ bin/coffee --no-wrap -pe 'i for i in a by -2'
var _i, _ref, i;
for (_i = (_ref = a).length - 1; _i >= 0; _i -= 2) {
  i = _ref[_i];
  i;
}
dsrw commented

I don't want to lose ranges or slicing, and I think satyr's branch above does a fair job of showing why. Reading the code, I prefer the previous version, even though 1 to 5 is much nicer than [1..5]. The .slice method is exclusive and reads poorly, and ranges are too handy a shorthand to throw away completely. I use them all the time in my tests.

I have no idea if the following is parsable, but what about something like this?

letters = ['a', 'b', 'c', 'd']
start = 0
end = -1

ok (1 to 5).join() is "1,2,3,4,5"
ok (l for l from 2 to end in letters).join() is "c,d"
ok (2 to end in letters).join() is "c,d"
ok (start to 1 in letters).join() is "a,b"
ok (0 in letters) is "a"
ok (-1 in letters) is "d"
ok (end to start in letters).join() is "d,c,b,a"
ok (start to end in letters by 2).join() is "a,c"

Well, the 0 in letters isn't valid, because it's already has a meaning. Besides, you'd just use letters[0]. Being able to write start to end in array seems cool, but is it really that bad to just write array.slice(start, end)?

@hiddenbek: glad I am not the only one not too fond of the new syntax. Kind of sold on it, but not there just yet. [..] stills reads better for me and without a syntax highlighter (not everyone is on Textmate you know) this is just bad:

for value from top to bottom
vs.
for value in [top...bottom]

Besides ranges are cool:

action() for [0...length]
dsrw commented

@Tesco: Right, in is already an indexOf type helper. I agree that 0 in letters isn't very useful, but -1 in letters would have been nice. Oh well.

slice is exclusive, so if end is a positive index we'd have array.slice(start, end + 1). It works, but I'm not a fan.

@StanAngeloff: It definitely feels like one step forward, two steps back. I prefer reading the new version, but I quickly scan my code more often than I actually read it, and the new version turns into a jumble of words while scanning. The old way makes it easy to identify the important bits at a glance. That said, if we could keep slices and ranges, I'd still call it a win.

Let's stop imitating Ruby. We can do better, right? ;)

I don't think the exclusivity of Array::slice is an argument in favor of range subscripts. It's pretty easy to define Array::grab = (start, end) -> @slice start, (end + 1). Do we really need to implement this method in the core, with its own special syntax?

I like the for i from x to y syntax, but the slicing syntax should stay Ruby-esque. I also like being able to declare array literals such as [x..y].

satyr -- mind rebasing your fix against the current master? There's a pretty large set of merge conflicts between the two branches at the moment, and I don't feel expert enough in the patch to try and resolve them.

rebasing your fix against the current master?

Trying. Kinda hard with the recent hacks went into For (#728, #781).

Done rebasing over master, without slice/splice/range and the scope magic.

Considering #781, should we change

for (i = x; i <= y; ++i)

to

for (i = x - 1; ++i <= y;)

or even

i = x - 1;
while (++i <= y)

?

The new for i from 0 to 10 syntax is now merged to master...

Two cents regarding iterating downwards...

Instead of: for i from 10 to 1 by -1
How about: for i from 10 downto 1 ?

Was this syntax reverted? I'm using 1.2.0, which should be after the merge, and I'm getting a parse error:

coffee> i for i in [1..10]
[ 1,
  2,
  3,
  4,
  5,
  6,
  7,
  8,
  9,
  10 ]
coffee> i for i from 1 to 10
Error: In repl, Parse error on line 1: Unexpected 'CALL_START'
    at Object.parseError (/usr/lib/node_modules/coffee-script/lib/coffee-script/parser.js:470:11)
    at Object.parse (/usr/lib/node_modules/coffee-script/lib/coffee-script/parser.js:546:22)
    at /usr/lib/node_modules/coffee-script/lib/coffee-script/coffee-script.js:40:22
    at Object.eval (/usr/lib/node_modules/coffee-script/lib/coffee-script/coffee-script.js:123:10)
    at Interface.<anonymous> (/usr/lib/node_modules/coffee-script/lib/coffee-script/repl.js:51:34)
    at Interface.emit (events.js:67:17)
    at Interface._onLine (readline.js:162:10)
    at Interface._line (readline.js:426:8)
    at Interface._ttyWrite (readline.js:603:14)
    at ReadStream.<anonymous> (readline.js:82:12)
    at ReadStream.emit (events.js:88:20)

Yep.