bazelbuild/starlark

Clarify Sequence Type Relationships

mahmoudimus opened this issue · 3 comments

In the spec, under Sequence types, it states:

Sequence: a sequence of known length lets us know how many elements it contains without processing them. Examples: dict, list, tuple, but not string or bytes.

However, in the Java implementation, Dict does not implement the Sequence interface. I think this is a mistake in the spec and should be removed:

-Sequence: a sequence of known length lets us know how many elements it contains without processing them. Examples: dict, list, tuple, but not string or bytes.
+Sequence: a sequence of known length lets us know how many elements it contains without processing them. Examples: list, tuple, but not string or bytes.

This exposes a few other questions that the spec mentions:

  • SetIndexable currently does not exist. The Java implementation relies on EvalUtils#setIndex but is hardcoded to only Dict and List. Should there be a StarlarkSetIndexable interface or should there be a patch to the existing StarlarkIndexable, similar to Evalutils#setField? For example:
+  /**
+   * Updates an object as if by the Starlark statement {@code object[key] = value}.
+   *
+   * @throws EvalException if underlying object is immutable.
+   */
+  void setIndex(StarlarkSemantics semantics, Object key, Object value) throws EvalException;
  • The concept of a Mapping is in the spec (as per #198), but does not exist. Should that be introduced alongside with the modification for SetIndexable above? For example:
package net.starlark.java.eval;

import java.util.Map;

public interface StarlarkMapping<K, V> extends Map<K, V>, StarlarkSetIndexable, StarlarkIterable<K> {

}

This would also require a change in Starlark#len().

  • Last, but not least, in the Java implementation Sequence extends the java.util.List interface. Are all sequences a sub-type of List? Did we mean Collection ?

Interesting to hear your thoughts on this.

Thanks.

I think these remarks are all very specific to the Java Starlark, and more than that, the internal implementation of Java Starlark., Whereas this repo is only about the things that can be observed by the source Starlark code. For example there isn't a Sequence interface in Starlark, it's just a bunch of types that can be iterated over. Are you saying that in Java Starlark [x for x in {1:1, 2:2}] doesn't work? Similarly for all the other types, are you saying that the Java Starlark implementation doesn't correctly at the Starlark level implement the spec? Or are you saying that it does implement the spec but the spec should change? If it's merely that the Java implementation should change without changing the Starlark visible behaviour, that's something for a different repo.

Hi @ndmitchell thanks for the response.

I think these remarks are all very specific to the Java Starlark, and more than that, the internal implementation of Java Starlark.,

That's a fair point, I'll clarify this further below.

Whereas this repo is only about the things that can be observed by the source Starlark code. For example there isn't a Sequence interface in Starlark, it's just a bunch of types that can be iterated over. Are you saying that in Java Starlark [x for x in {1:1, 2:2}] doesn't work?

While there isn't a Sequence interface, there are a few different Sequence types, also referred to in the spec as "contracts." I think I may be just equating "contract" to mean "interface," but perfectly happy to use "contracts" over interface.

In your example of [x for x in {1:1, 2:2}], that would be of an iterable contract and that's exactly the kind of ambiguity I'm trying to resolve. For example, if the difference between an Iterable and a Sequence is that a sequence is of known length and an Iterable is of unknown length, that makes sense.

Similarly for all the other types, are you saying that the Java Starlark implementation doesn't correctly at the Starlark level implement the spec? Or are you saying that it does implement the spec but the spec should change?

So this is where it is a bit ambiguous and I'm asking if there needs to be clarification. Take for example the semantics behind a slice expression. What contracts must a particular type satisfy for a slice expression to work? The spec says that they work on an "indexable sequence". Does that apply to all types that satisfy an "indexable" and a "sequence" contract?

Python's collections.abc has a nice table that helps highlight all the operations available to a particular contract. I'm trying to understand if it makes sense to propose (and that's really the purpose of this Github issue) that the Starlark spec follows similar naming conventions and map the appropriate operations to the corresponding Starlark types.

So, something like:

  • List: Satisfies Iterable, Indexable, SetIndexable, Sequence
  • etc. etc.

If it's merely that the Java implementation should change without changing the Starlark visible behaviour, that's something for a different repo.

Of course, I think I was just trying to start a conversation to centralize the discussion before suggesting anything on the bazel repo itself. If it still is too Java implementation specific, I'm happy to close this issue and move it to that repo instead.

Starlark only has three compound types - lists, tuples and dictionaries, with strings being a kinda-compound type. I think it's probably easier to remove all ambiguity from the spec and just say "slices work for lists, tuples and strings", "indexing works for lists, tuples, strings and dictionaries", "iteration works for lists, tuples and dictionaries". Given there are only 4 types, it's not too much to type and is 100% explicit.