Slice notation

This repository contains a proposal for adding slice notation syntax to JavaScript. This is currently at stage 0 of the TC39 process.

Introduction

The slice notation provides an ergonomic alternative to the various slice methods present on Array.prototype, String.prototype, etc.

const arr = ['a', 'b', 'c', 'd'];

arr[1:3];
// → ['b', 'c']

arr.slice(1, 3);
// → ['b', 'c']

const str = 'hello world';

str[6:];
// → 'world'

str.slice(6);
// → 'world'

This notation can be used for slice operations on primitives like String and any object that provides indexed access using [[Get]] like Array and TypedArray.

The length used for these operations is the length property of the object.

const obj = { 0: 'a', 1: 'b', 2: 'c', 3: 'd', length: 4 };
obj[1:3];
// → ['b', 'c']

The slice notation extends the slice operations by accepting an optional step argument. The step argument is set to 1 if not provided.

const arr = ['a', 'b', 'c', 'd'];
arr[1:4:2];
// → ['b', 'd']
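The step behaviour can be approximated in today's JavaScript. The following is a minimal sketch, not part of the proposal: `sliceWithStep` is a hypothetical helper name, and it only handles positive integer steps.

```javascript
// Hypothetical helper (not part of the proposal): emulates obj[lower:upper:step]
// for a positive step, using the existing Array.prototype.slice for the bounds.
function sliceWithStep(obj, lower, upper, step = 1) {
  if (!Number.isInteger(step) || step < 1) {
    throw new RangeError('this sketch only supports positive integer steps');
  }
  // Array.prototype.slice works on any object with indexed keys and a length.
  const range = Array.prototype.slice.call(obj, lower, upper);
  // Keep every step-th element of the selected range.
  return range.filter((_, i) => i % step === 0);
}

const letters = ['a', 'b', 'c', 'd'];
console.log(sliceWithStep(letters, 1, 4, 2)); // → ['b', 'd']

const arrayLike = { 0: 'a', 1: 'b', 2: 'c', 3: 'd', length: 4 };
console.log(sliceWithStep(arrayLike, 1, 3)); // → ['b', 'c']
```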

Motivation

const arr = ['a', 'b', 'c', 'd'];
arr.slice(3);
// → ['a', 'b', 'c'] or ['d'] ?

In the above example, it's not immediately clear if the newly created array is a slice from the range 0 to 3 or from 3 to len(arr).

const arr = ['a', 'b', 'c', 'd'];
arr.slice(1, 3);
// → ['b', 'c'] or ['b', 'c', 'd'] ?

Adding a second argument is also ambiguous, since it's not clear whether the second argument specifies an upper bound or the length of the new slice.

Programming languages like Ruby and C++ take the length of the new slice as the second argument, but JavaScript's slice methods take the upper bound as the second argument.
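JavaScript's own standard library mixes both conventions, which illustrates the ambiguity. In the sketch below, Array.prototype.slice treats its second argument as an upper bound, while Array.prototype.splice treats it as a count. The variable names are chosen for illustration only.

```javascript
const letters = ['a', 'b', 'c', 'd'];

// slice(start, end): the second argument is an exclusive upper bound.
const bySlice = letters.slice(1, 3);
console.log(bySlice); // → ['b', 'c']

// splice(start, deleteCount): the second argument is a count.
// splice mutates its receiver, so it is applied to a copy here.
const bySplice = [...letters].splice(1, 3);
console.log(bySplice); // → ['b', 'c', 'd']
```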

const arr = ['a', 'b', 'c', 'd'];
arr[3:];
// → ['d']

arr[1:3];
// → ['b', 'c']

With the new slice syntax, it's immediately clear that the lower bound is 3 and the upper bound is len(arr). It makes the intent explicit.

The syntax is also much shorter and more ergonomic than a function call.

The step argument is useful for patterns like creating a slice with every other element in an array or for reversing an array.

const arr = ['a', 'b', 'c', 'd'];
arr[1:4:2];
// → ['b', 'd']

arr[::-1];
// → ['d', 'c', 'b', 'a']

The step argument also makes it really easy to work with matrices.

const matrix = [ 1, 2, 3,
                 4, 5, 6,
                 7, 8, 9 ];
const getColumn = col => matrix[col::3];
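For comparison, a rough equivalent in today's JavaScript needs an explicit index filter. This is only a sketch; `flatMatrix`, `width` and `getColumnToday` are names chosen for illustration.

```javascript
const flatMatrix = [ 1, 2, 3,
                     4, 5, 6,
                     7, 8, 9 ];
const width = 3; // number of columns in the flattened 3x3 matrix

// Hypothetical helper: picks every width-th element starting at index col,
// which is what matrix[col::3] is meant to express.
const getColumnToday = col =>
  flatMatrix.filter((_, i) => i >= col && (i - col) % width === 0);

console.log(getColumnToday(0)); // → [1, 4, 7]
console.log(getColumnToday(2)); // → [3, 6, 9]
```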

This is used a lot in scientific computing projects in other programming languages.

Examples

In the following text, 'length of the object' refers to the length property of the object.

Default values

The lower bound, upper bound and the step argument are all optional.

The default value for the lower bound is 0.

const arr = ['a', 'b', 'c', 'd'];

arr[:3:1];
// → ['a', 'b', 'c']

The default value for the upper bound is the length of the object.

const arr = ['a', 'b', 'c', 'd'];
arr[1::1];
// → ['b', 'c', 'd']

The default value for the step argument is 1.

const arr = ['a', 'b', 'c', 'd'];

arr[1:];
// → ['b', 'c', 'd']

arr[:3];
// → ['a', 'b', 'c']

arr[1::2];
// → ['b', 'd']

arr[:3:2];
// → ['a', 'c']

Omitting both the lower bound and the upper bound produces a new copy of the object.

const arr = ['a', 'b', 'c', 'd'];

arr[:];
// → ['a', 'b', 'c', 'd']

arr[::];
// → ['a', 'b', 'c', 'd']
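Since the notation is described as matching the existing slice operations, the copy is presumably shallow, as it is with `slice()` called without arguments. The sketch below shows that behaviour for the existing method only; that the notation behaves identically is an assumption.

```javascript
const original = [{ id: 1 }, { id: 2 }];

// slice() with no arguments returns a new array...
const copy = original.slice();
console.log(copy !== original); // → true

// ...but the elements themselves are shared, not cloned.
console.log(copy[0] === original[0]); // → true

// Changing the copy's structure does not affect the original.
copy.push({ id: 3 });
console.log(original.length); // → 2
```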

Negative indices

If the lower bound is negative, then the start index is computed as follows:

start = max(lowerBound + len, 0)

where len is the length of the object.

const arr = ['a', 'b', 'c', 'd'];

arr[-2:];
// → ['c', 'd']

In the above example, start = max((-2 + 4), 0) = max(2, 0) = 2.

const arr = ['a', 'b', 'c', 'd'];

arr[-10:];
// → ['a', 'b', 'c', 'd']

In the above example, start = max((-10 + 4), 0) = max(-6, 0) = 0.

Similarly, if the upper bound is negative, the end index is computed as follows:

end = max(upperBound + len, 0)

const arr = ['a', 'b', 'c', 'd'];

arr[:-2];
// → ['a', 'b']

arr[:-10];
// → []

These semantics exactly match the behavior of existing slice operations.
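The two formulas can be written as a single function. This is a minimal sketch; `normalizeBound` is a hypothetical name, and the calls to the existing slice method are included to show that it applies the same rule.

```javascript
// Hypothetical helper that mirrors the formulas above for a single bound.
function normalizeBound(bound, len) {
  // Negative bounds count from the end and are floored at 0.
  return bound < 0 ? Math.max(bound + len, 0) : bound;
}

const letters = ['a', 'b', 'c', 'd'];
const len = letters.length;

console.log(normalizeBound(-2, len));  // → 2
console.log(normalizeBound(-10, len)); // → 0

// The existing slice method applies the same rule to its arguments.
console.log(letters.slice(-2));     // → ['c', 'd']
console.log(letters.slice(0, -2));  // → ['a', 'b']
console.log(letters.slice(0, -10)); // → []
```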

If the step argument is negative, then the object is traversed in reverse.

const arr = ['a', 'b', 'c', 'd'];

arr[::-1];
// → ['d', 'c', 'b', 'a']
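The text does not spell out how omitted bounds interact with a negative step. The following is a minimal sketch, assuming Python-like behaviour in which the traversal starts at the last element when both bounds are omitted. `everyNthReversed` is a hypothetical helper, not something the proposal defines.

```javascript
// Hypothetical helper: emulates arr[::step] for a negative step, ASSUMING
// omitted bounds mean "start at the last element and walk down to index 0".
function everyNthReversed(arr, step) {
  if (!Number.isInteger(step) || step >= 0) {
    throw new RangeError('this sketch only supports negative integer steps');
  }
  const result = [];
  for (let i = arr.length - 1; i >= 0; i += step) {
    result.push(arr[i]);
  }
  return result;
}

const letters = ['a', 'b', 'c', 'd'];
console.log(everyNthReversed(letters, -1)); // → ['d', 'c', 'b', 'a']
console.log(everyNthReversed(letters, -2)); // → ['d', 'b']
```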

Out of bounds indices

Both the lower and upper bounds are capped at the length of the object.

const arr = ['a', 'b', 'c', 'd'];

arr[100:];
// → []

arr[:100];
// → ['a', 'b', 'c', 'd']

These semantics exactly match the behavior of existing slice operations.
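The capping rule amounts to taking the minimum of the bound and the length. A small sketch, with `capBound` as a hypothetical helper name and the existing slice method shown for comparison:

```javascript
// Hypothetical helper: caps a non-negative bound at the length of the object.
const capBound = (bound, len) => Math.min(bound, len);

const letters = ['a', 'b', 'c', 'd'];

console.log(capBound(100, letters.length)); // → 4

// With both bounds capped at 4, [100:] selects the empty range 4..4
// and [:100] selects 0..4, as the existing slice method shows.
console.log(letters.slice(100));    // → []
console.log(letters.slice(0, 100)); // → ['a', 'b', 'c', 'd']
```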

Prior art

Python

This proposal is heavily inspired by Python. Unsurprisingly, the Python syntax for slice notation is strikingly similar:

slicing      ::=  primary "[" slice_list "]"
slice_list   ::=  slice_item ("," slice_item)* [","]
slice_item   ::=  expression | proper_slice
proper_slice ::=  [lower_bound] ":" [upper_bound] [ ":" [stride] ]
lower_bound  ::=  expression
upper_bound  ::=  expression
stride       ::=  expression

Examples:

arr = [1, 2, 3, 4]

arr[1:3]
# → [2, 3]

arr[1:4:2]
# → [2, 4]

CoffeeScript

CoffeeScript provides a Range operator that is inclusive with respect to the upper bound.

arr = [1, 2, 3, 4]
arr[1..3]
# → [2, 3, 4]

CoffeeScript also provides another form of the Range operator that is exclusive with respect to the upper bound.

arr = [1, 2, 3, 4]
arr[1...3]
# → [2, 3]

Go

Go offers slices:

arr := []int{1,2,3,4};
arr[1:3]
// → [2, 3]

The lower or upper bound can also be omitted:

arr := []int{1,2,3,4};
arr[1:]
// → [2, 3, 4]

arr := []int{1,2,3,4};
arr[:3]
// → [1, 2, 3]

Ruby

Ruby seems to have two different ways to get a slice:

Using a Range:

arr = [1, 2, 3, 4]
arr[1..3]
# → [2, 3, 4]

This is similar to CoffeeScript. The 1..3 produces a Range object which defines the set of indices to be sliced out.

Using the comma operator:

arr = [1, 2, 3, 4]
arr[1, 3]
# → [2, 3, 4]

The difference here is that the second argument is actually the length of the new slice, not the upper bound index.

This form is already valid ECMAScript syntax, where the comma operator evaluates to its last operand, which makes it a non-starter.

const s = 'foobar'
s[1, 3]
// → 'b'

FAQ

Why pick the Python syntax over the Ruby/CoffeeScript syntax?

The Python syntax allows us to provide an optional step argument.

Also, the Python syntax which excludes the upper bound index is similar to the existing slice methods in JavaScript.

We could use the exclusive Range operator (...) from CoffeeScript, but that doesn't quite work for all cases because it's ambiguous with the spread syntax. Example code from getify:

Object.defineProperty(Number.prototype,Symbol.iterator,{
  *value({ start = 0, step = 1 } = {}) {
     var inc = this > 0 ? step : -step;
     for (let i = start; Math.abs(i) <= Math.abs(this); i += inc) {
        yield i;
     }
  },
  enumerable: false,
  writable: true,
  configurable: true
});

const range = [ ...8 ];
// → [0, 1, 2, 3, 4, 5, 6, 7, 8]

Why does this not use the iterator protocol?

The iterator protocol isn't restricted to index lookup making it incompatible with this slice notation which works only on indices.

For example, Map and Set have iterators, but we shouldn't be able to slice them as they don't have indices.
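A short illustration of the point: a Map can be iterated, but it exposes neither integer keys nor a length, so it has to be converted to an array before it can be sliced. The variable names are chosen for illustration.

```javascript
const map = new Map([['x', 1], ['y', 2], ['z', 3]]);

// A Map is iterable, but its entries are not reachable through integer keys.
console.log(map[0]);     // → undefined
console.log(map.length); // → undefined

// Slicing requires materializing the iteration order into an array first.
const firstTwo = Array.from(map).slice(0, 2);
console.log(firstTwo); // → [['x', 1], ['y', 2]]
```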

What about splice?

CoffeeScript allows similar syntax to be used on the left-hand side of an AssignmentExpression to perform a splice operation.

numbers = [1, 2, 3, 4]
numbers[2..4] = [7, 8]
# → [1, 2, 7, 8]

This doesn't work with Strings as they are immutable, but could be made to work with any object using a [[Set]] operation.

This feature is currently omitted to limit the scope of the proposal, but can be incorporated in a follow-on proposal.
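For comparison, the CoffeeScript assignment shown in this answer corresponds roughly to a splice call in today's JavaScript. A minimal sketch:

```javascript
const numbers = [1, 2, 3, 4];

// The inclusive range 2..4 covers three indices. splice clamps the delete
// count to the two elements that actually exist from index 2 onwards,
// then inserts 7 and 8 in their place.
numbers.splice(2, 3, 7, 8);

console.log(numbers); // → [1, 2, 7, 8]
```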

Doesn't the bind operator have similar syntax?

Unfortunately, yes. The ambiguity arises in code like this:

const x = [2];
const arr = [1, 2, 3, 4];
arr[::x[0]];

Is the above creating a new array with values [1, 3] or is it creating a bound method?

Should this create a `view` over the array, instead of creating a new array?

Go creates a slice over the underlying array, instead of allocating a new array.

arr := []int{1, 2, 3, 4}
v := arr[1:3]
// → [2, 3]

Here, v is just a descriptor that holds a reference to the original array arr. No new array allocation is performed. See this blog post for more details.

This doesn't map to any existing construct in JavaScript and would be a step away from how methods work in JavaScript. To make this syntax work well within the JavaScript model, such a view data structure is not included in this proposal.

What happens when you slice a String that contains characters spanning multiple code units?

The slice notation maintains the behavior of the existing String.prototype.slice method.

Should we ban slice notation on strings?

The String.prototype.slice method doesn't work well with Unicode characters. This blog post by Mathias Bynens explains the problem.

Given that the existing method doesn't work well, banning the slice notation for strings might be a good idea to prevent more footguns.
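The problem can be reproduced with the existing method. In the sketch below, `text`, `broken` and `byCodePoint` are illustrative names; the character U+1F4A9 is written as an escape so the example does not depend on the file encoding.

```javascript
const text = 'a\u{1F4A9}b';

// U+1F4A9 is stored as two UTF-16 code units, so length is 4, not 3.
console.log(text.length); // → 4

// slice works on code units and can cut the surrogate pair in half.
const broken = text.slice(0, 2);
console.log(broken.length);                     // → 2
console.log(broken.charCodeAt(1).toString(16)); // → 'd83d'

// Iterating the string yields whole code points, which can then be sliced.
const byCodePoint = Array.from(text).slice(0, 2).join('');
console.log(byCodePoint === 'a\u{1F4A9}'); // → true
```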

How about combining this with `+` for append?

const arr = [1, 2, 3, 4] + [5, 6];
// → [1, 2, 3, 4, 5, 6]

This is not included in order to keep the proposal's scope maximally minimal.

The operator overloading proposal may be a better fit for this.

Can you create a Range object using this syntax?

Languages like Ruby evaluate their slice (well, range) syntax to create a Range object.

range = 1..4
# → 1..4

A similar construct is already possible with the spread operator as shown in an example in an above FAQ.

Isn't it confusing that this isn't doing property lookup?

This is actually doing a property lookup using [[Get]] on the underlying object. For example,

const arr = [1, 2, 3, 4];

arr[1:3];
// → [2, 3]

This is doing a property lookup for the keys 1 and 2.

But, shouldn't it do a lookup for the string '1:3'?

const arr = [1, 2, 3, 4];

arr['1:3'];
// → undefined

No. The slice notation is analogous to how keyed lookup works. The key is first evaluated to a value and then the lookup happens using this value.

const arr = [1, 2, 3, 4];
const x = 0;

arr[x] !== arr['x'];
// → true

The slice notation works similarly. The notation is first evaluated to a range of values and then each of the values is looked up.
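The individual lookups can be observed with the existing slice method, which the proposal uses as its model. The sketch below wraps an array-like object in a Proxy that records every [[Get]]. Whether the notation would perform exactly the same sequence of lookups is an assumption; `traced` and `lookedUp` are illustrative names.

```javascript
const lookedUp = [];

// A plain array-like object wrapped in a Proxy that records every [[Get]].
const arrayLike = { 0: 1, 1: 2, 2: 3, 3: 4, length: 4 };
const traced = new Proxy(arrayLike, {
  get(target, key, receiver) {
    lookedUp.push(key);
    return Reflect.get(target, key, receiver);
  },
});

const result = Array.prototype.slice.call(traced, 1, 3);

console.log(result);   // → [2, 3]
console.log(lookedUp); // → ['length', '1', '2']
```

The length is read once, followed by one lookup per index in the range.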

There are already many modes where ':' means different things. Isn't this confusing?

Depending on context, a:b can mean:

LabelledStatement with a as the label
Property a with value b in an object literal: {a: b }
ConditionalExpression: confused ? a : b
Potential type systems (like TypeScript and Flow) that might make it to JavaScript in the future.

Is it a lot of overhead to disambiguate between modes with context? Major mainstream programming languages like Python have all these modes and are being used as a primary tool for teaching programming.

Can the upper bound, lower bound or the step argument be an arbitrary Expression?

Currently the proposal (arbitrarily) restricts them to be an IdentifierReference or DecimalDigits.
