Polymorphic operator support for creating ranges
Closed this issue · 48 comments
[@luolong] Currently as far as I understand, Range operators ..
and :
are implemented by invoking Range constructor which basically takes two Ordinal arguments and creates an appropriate Range object.
While working on date/time library for Ceylon I've come up with use cases where it would make sense to create ranges of the circularily referenced discreet set of ordinal values that wrap around either end of the discreet set.
For example given weekdays fith following successor relationships: sunday -> monday -> tuesday -> ... -> saturday -> sunday
It would make huge sense to be able to declare ranges like monday .. sunday
or sunday .. saturday
.
The trouble is that the way Range
is currently implemented, I can not define such ranges whose values are not strictly speaking comparable (monotonously growing).
Without that implementing operations like in
would be hugely inefficient.
On the other hand when implementing types like DayOfWeek
, I can create a less generic implementation that can cope with such fragmented ranges and provide more optimized version of Range
, while making creation of such ranges syntactically much more natural.
So, what I propose is to redefine range creation operators so that they would call a method on the implementation of Ordinal
For example:
monday .. sunday
would be redefined asmonday.rangeTo(sunday)
monday:4
would be redefined asmonday.rangeOf(4)
In addition to making my use case a bit nicer to use, it would also make range creation operators more regular in respect to other polymorphically defined operators like +
, -
, *
, etc.
[Migrated from ceylon/ceylon-spec#451]
[Closed at 2014-06-21 21:37:58]
[@FroMage] It would also make monday.rangeOf(14)
work, to get two weeks.
[@luolong] Yes, that too, although I must admit, I didn't think of it at first...
[@gavinking] I think this change makes sense. We should add spanTo()
and segment()
to Ordinal
. They would return [Other+]
and [Other*]
. I will work on this next.
[@gavinking] Damn, ran into a little problem with this one: Ordinal
is supposed to be covariant, which means the signature of spanTo()
is wrong. Of course I can use the trick I use on Set
. Or I could add a whole new (invariant) type. Don't love either option...
[@gavinking] Slipping from M6.
[@gavinking] @luolong + @DiegoCoronel Is this something we have to do for 1.0, or can we come back to it after that? I'm trying to clear the deck of issues that are blocking the release of 1.0 final.
[@DiegoCoronel] Well, I see it as a plus in the language and that we could have benefits in ceylon.time api and currently i see only this specific behavior below, at least of course if @luolong has some idea of use that we can improve in our current implementation.
For example this code:
for( week in saturday..sunday ) {
print( week );
}
results in:
saturday
friday
thursday
wednesday
tuesday
monday
sunday
this is the current behavior of the language but i think maybe it does not make sense for our case because in my opinion i would not expect that weeks come in the inverse order, so if we launch ceylon final version with this behavior we shouldnt change it anymore. I know you have lot of work to do but i would like to see it for 1.0
[@DiegoCoronel] It means im thinking that with both ideasrangeTo
and rangeOf
we are going to be able to provide the iterator, is it?
[@gavinking] Sorry?
Sent from my iPhone
On 28 Sep 2013, at 4:19 pm, Diego Coronel notifications@github.com wrote:
It means im thinking that with both ideasrangeTo and rangeOf we are going to be able to provide the iterator, is it?
—
Reply to this email directly or view it on GitHub.
[@DiegoCoronel] maybe i misunderstand, but at my reading i was thinking that those ideas were ways to we refine the behavior of monday..sunday
and monday:4
...
[@gavinking] All it really means is that in 1.0 you would write:
saturday to sunday
Using the infix method invocation syntax.
And in a future release of the language we can come back and make DayOfWeek implement Rangeable or whatever. I know it would be nice to write .. instead of to, but really I think we can maybe live without it for a couple of months...
[@gavinking] I'm going to reschedule this for 1.2, unless I hear some screaming from you guys that this is something super-important.
[@luolong] Yes, given the now formally introduced infix operator syntax support, the absence of this feature is now much less noticeable.
[@tombentley] @gavinking are you working on this? If not I can look at it.
[@gavinking] Well I was planning to but not sure if I will get much time in the next week. So let me give you an idea of what I've been thinking when I get back home.
Sent from my iPhone
On 21 May 2014, at 2:12 pm, Tom Bentley notifications@github.com wrote:
@gavinking are you working on this? If not I can look at it.
—
Reply to this email directly or view it on GitHub.
[@tombentley] I'm working on the idea that we add span()
and segment()
methods to Ordinal
(which the ..
and :
operators will be syntax sugar for).
Ideally we'd like to drop the given Element satisfies Comparable<Element>
constraint from Range
, because there are things (such as DayOfWeek
) for which the syntax sugar would be natural (mon:7
) but which don't have a total order (whether sun<mon
depends on whether you think the week starts of a Monday or Sunday). However we can't remove the constraint because unless we know whether start > end
we don't know whether the range is generated via successor
or predecessor
. In other words without the Comparable
constraint the contract for span
is really weak and impossible to document.
Note that this isn't a problem for segment
/:
because these are always increasing/generated using successor
.
Another thing to note about the DayOfWeek
example is that wed:14
generates a perfectly good sequence, though it's not actually a range in the usual sense of the word (it's not even a repeated range). For this reason I think that segment()
/:
should return a Sequence
or an Iterable
Getting back to span()
: We could constrain all Ordinal
s to be Comparable
("if x.successor==y
then x<y
"). The problem with that is it doesn't deal with circularity (maxInt.successor == minInt
would mean that maxInt<minInt
!). So we would have to define it the other way around ("if x<y
then after some number of iterations (x.successor) == y
") which is weaker but gets us out of the circularity hole.
So what I think makes sense and is sufficiently general looks like this:
Firstly we define a supertype of Ordinal
, let's call it successor:
"Abstract supertype of types whose values may have a successor.
It is recommended for implementations that satisfy
Comparable that `x<y` implies `x.successors.contains(y)`
"
see(`interface Ord`)
interface Successor<Other,Absent=Null>
satisfies Iterable<Other,Absent>
given Other satisfies Successor<Other, Absent>
given Absent satisfies Null {
"The successor of this value, or null if there is no successor"
shared formal Other|Absent successor;
"This value and all its successors"
shared default {Other*} successors {
value start = this;
object result satisfies Iterable<Other, Null> {
shared actual Iterator<Other> iterator() {
object iter satisfies Iterator<Other> {
variable Successor<Other,Absent>|Null current = start;
shared actual Other|Finished next() {
if (is Successor<Other,Absent> f=current) {
value result = f;
current = result.successor;
assert(is Other result);
return result;
} else {
return finished;
}
}
}
return iter;
}
}
return result;
}
"This value and the next [[num]] successors"
shared default {Other*} segment(Integer num) {
return successors.take(num);
}
}
The segment operator (:
) would be defined in terms of segment
. Note that this definition leaves it up to the implementor whether every value has a successor, or whether some value(s) are terminal (e.g. you might want to model MonthOfYear
as non circular, so that december.successor
is null, but make DayOfWeek
circular; this would make january:13
be of length 12, but mon:14
be of length 14).
Then we can define Ordinal
like this:
"Specialization of Successor for types which are also Comparable.
For two values `x` and `y` if `x<y` then `x.successors.contains(y)`
Also `x.successor.predecessor == x`
"
interface Ordinal<Other,Absent=Null> of Other
satisfies Successor<Other,Absent> & Comparable<Other>
given Other satisfies Ordinal<Other,Absent>
given Absent satisfies Null {
"The predecessor of this value."
shared formal Other|Absent predecessor;
shared formal Range<Other,Absent> span(Other to);
}
I guess we'd define the ++
, --
, +=
and -=operators on
Ordinal(even though the incrementing ones could be defined in terms of
Successor`).
Range
could remain mostly as it is now, though next()
would be default
and probably include getting the nth next.
Integer
and would satisfy Ordinal<Integer,Nothing>
(because we allow maxInt++
to return an Integer
even if it has overflowed or lost precision), similarly Character
.
[@luolong]
However we can't remove the constraint because unless we know whether
start > end
we don't know whether the range is generated viasuccessor
orpredecessor
.
To be honest, I've always been a little suspicious about this feature. I can see how it might be considered useful, but I am perfectly okay with (1..10).reverse
instead of 10..1
form.
[@luolong] Other than my previous comment, how would your proposal affect performance of a range operator?
Specially, when considering well-known and most likely cases like Integer ranges...
[@gavinking] If there's no ordering, how would you even know there is anything wrong with 10..1?
Sent from my iPhone
On 26 May 2014, at 9:21 am, Roland Tepp notifications@github.com wrote:
However we can't remove the constraint because unless we know whether start > end we don't know whether the range is generated via successor or predecessor.
To be honest, I've always been a little suspicious about this feature. I can see how it might be considered useful, but I am perfectly okay with (1..10).reverse instead of 10..1 form.
—
Reply to this email directly or view it on GitHub.
[@luolong] yea, you've got a point there...
[@luolong]
If there's no ordering, how would you even know there is anything wrong with 10..1?
On the other hand, does it have to?
What would you consider to be the correct set of values returned by this expression: 10..13 step 4
?
My point is - at some point the range will iterate over it's ending point and will have to terminate the iteration. In the case of 10..1
, I would expect the range to return one element: i.e. {10}
In the case of
10..1
, I would expect the range to return one element: i.e.{10}
To me x..y
means start at x
and keep incrementing or decrementing until you get to y
. Similarly x:z
means start at x
and keep incrementing until you've got z
elements.
Without using the ordering (i.e. without supporting decreasing ranges), I agree with Gavin that 10..1
doesn't make sense. I certainly don't think {10}
is an intuitive result in that situation.
[@tombentley] What I suggested above mostly works out fine. The only change I've had to make is that Ordinal<Element> satisfies Successor<Element, Nothing>
, rather than having an extra Absent
type argument to support non-terminating Ordinal
s. The main reason for that is that Range.shifted()
doesn't really work nicely when you have a terminating Ordinal
(it would be weird for a shifted Range
to be empty!).
I'm wondering what @gavinking makes of it all.
[@gavinking] @tombentley I just don't see how this proposal really even addresses the main problem with Ordinal
as it exists today which is that many operations of Range
just aren't efficient unless the type is also Enumerable
. Sure, sure, you've given Ordinal
a formal span()
operation, so, indeed, I can go ahead and create my own Range
type. But then WTF is the point of all this complexity of Successor
, Ordinal
, etc if at the end of the day you're not even using them for anything!? Again, they just don't seem like very powerful abstractions at all, especially since they fail to abstract over the main problem we're interested in here!
[@gavinking] What I'm saying is that as presented, I don't see how this is an improvement over what we have today. The proposal doesn't really make Enumerable
obsolete. It just introduces additional complexity where we already have too much.
[@tombentley]
@gavinking well, imo the problem with Ordinal
isn't just that it's inefficient, it's also that:
- It assumes
Ordinal
s must beComparable
, and I think people want to use the language features which are defined in terms of Ordinal and/or Range with types which are notComparable
(likeDayOfWeek
). - It assumes that
Ordinal
s are either infinite or circular (i.e. that you can always get asuccessor
andpredecessor
), and again I think people want to use related language features with types which are finite and non-circular (e.g. aPriority
).
So you're right that Successor
isn't relevant to solving the efficiency problem of Ordinal
-- we could just have kept successor
on Ordinal
and defined segment()
on Ordinal
to achieve that, and done without this whole Absent
thing. But Successor
is trying to deal with those other problems where Ordinal
is too specialized to be useful in reasonable seeming circumstances. I'm aware, however, that I've thus far failed to convince you that those are problems with Ordinal
.
Ordinal
doesn't inheritComparable
. it's onlyRange
that requires that, and it would still require it, AFAICT.- I'm not at all convinced that
successor
andpredecessor
should be optional. I can't imagine what you think the semantics and type of++
and--
should be in that case.
So I think maybe this is addressing the wrong problem.
- Although
Ordinal
doesn't inheritComparable
it does induce a total order, so saying it doesn't inheritComparable
is splitting hairs isn't it? - I said that
++
and--
would still be defined onOrdinal
and later said thatOrdinal
should satisfySuccessor<Other, Nothing>
so you don't have to do anything about++
, or--
AFAICS because thesuccessor
/predecessor
which they're defined in terms of are not optional.
[@tombentley] For the record, this is what I have at the moment:
"Abstract supertype of types whose values may have a [[successor]].
This is a generalization of [[Ordinal]].
The type `Successor<Element,Nothing>` represents a non-terminating
successor type: It is always possible to get a next element. Such types
include true infinite successor types ([[successors]] contains unique elements)
as well as circular successor types ([[successors]] ends up repeating the
same sequence of elements).
The type `Successor<Element,Null>` represents terminating successor types
which have a _terminal_ element which lacks a successor.
A `Successor` instance may be used to generate a [[Sequence]], using the
segment (`:`) operators. The [[segment]] operator `:` accepts the
first element and maximum length of the subrange.
0:5 // [0, 1, 2, 3, 4]
For a terminating successor type the segment will have fewer elements than
the requested length if the terminal element was reached before the given
number of elements was reached.
If the length is nonpositive, the subrange is empty.
0:0 // []
5:0 // []
0:-5 // []
"
see(`interface Ordinal`)
shared interface Successor<Other,Absent=Nothing>
given Other satisfies Successor<Other, Absent>
given Absent satisfies Null {
"The successor of this value, or null if there is no successor"
shared formal Other|Absent successor;
"This value and all its successors"
shared default {Other+} successors {
value start = this;
object iterable satisfies Iterable<Other, Nothing> {
shared actual Iterator<Other> iterator() {
object iter satisfies Iterator<Other> {
variable Successor<Other,Absent>? current = start;
shared actual Other|Finished next() {
if (exists result=current) {
current = result.successor;
assert(is Other result);
return result;
} else {
return finished;
}
}
}
return iter;
}
}
return iterable;
}
"This value and the next [[num]] successors"
shared default {Other*} segment(Integer num) {
return successors.take(num);
}
}
"Specialization of [[Successor]] for types which are also [[Comparable]].
This includes [[Integer]] and other [[Integral]] numeric types.
[[Character]] is also considered an ordinal type.
Given any two values `x` and `y` of `Ordinal` type, where `x<y` then
`x.successors.contains(y)` (in other words, starting from a smaller value,
we can get to any larger value by iteratively getting successors some
number of times).
Ordinal also comes with an inverse of successor called predecessor. It it
expected that given any value `x` of `Ordinal` type,
x.successor.predecessor == x
is always true
The _increment_ operator `++` and _decrement_ operator `--`
are defined for all `Ordinal` types.
function increment() {
count++;
}
An `Ordinal` instance may be used
to generate a [[Range]], using the span (`..`) operator. The [[span]]
operator `..` accepts the first and last values
of the range.
0..5 // [0, 1, 2, 3, 4, 5]
0..0 // [0]
If the last value is smaller than the first value, the
range is reversed.
5..0 // [5, 4, 3, 2, 1, 0]
0..-5 // [0, -1, -2, -3, -4, -5]
"
see (`interface Successor`,
`class Character`,
`class Integer`,
`interface Integral`,
`class Range`)
by ("Gavin", "Tom")
shared interface Ordinal<Other> of Other
satisfies Successor<Other,Nothing> & Comparable<Other>
given Other satisfies Ordinal<Other> {
shared formal actual Other successor;
"The predecessor of this value."
shared formal Other predecessor;
"The instances between this instance and the
given [[end]] instance, inclusive.
If the implementor also satisfies [[Comparable]]
the elements of the sequence should be ordered
(increasing if `end > this` or decreasing if `end < this`)."
shared default Range<Other> span(Other end) {
return Range(this of Other, end);
}
//"The [[next|successor] [[length]] instances,
// or empty if `length` is non-positive."
//shared formal Range<Other>|[] segment(Integer length);
}
[@gavinking] It's an order but not a total order, I think.
[@tombentley] True.
[@gavinking] So the thing with this issue is that since #5076 is back in play, I need a Range
class that is sufficiently general to abstract over all kinds of ranges we might be interested in. So I started asking myself about how to make the most completely abstracted Range
I could come up with.
Inspecting the current implementation of Range
, it appears to need the following:
- the
first
andlast
items, - a
compare()
function, - an
offset()
function, and - a
neighbour()
function.
UPDATE: in fact the compare()
function could in theory be expressed in terms of offset()
though that might affect performance.
Now, we could stick those operations on an interface (Enumerable
, it's called today) but that has the disadvantage that you can only have one sort of range for each element type. For example, (1..10).by(2)
can't be a Range
today. So at first it seemed to me better to just change the constructor of Range
to class Range<E>(E first, E last, Comparison compare(E x, E y), Integer offset(E x, E y), E neighbour(E e, Integer offset))
.
But then I realized that optimizing Range.equals()
and Range.includes()
would be impossible because functions don't have equality. Those seem like quite useful optimizations to me. So I find myself back at the notion of something conceptually similar to the Enumerable
interface (though perhaps we could consider at some point giving Range
a step)
.
So at this point, I'm inclined toward just unlinking Range
from Ordinal
, and defining:
shared interface Enumerable<out Other> of Other
satisfies Comparable<Other>
given Other satisfies Enumerable<Other> {
shared formal Integer offset(Other other);
shared formal Other neighbour(Integer offset);
}
Perhaps you @tombentley could convince me that it should be:
shared interface Enumerable<out Other> of Other
given Other satisfies Enumerable<Other> {
shared formal Integer? offset(Other other);
shared formal Other? neighbour(Integer offset);
}
in order to accommodate discontinuities.
(Now, FTR, if ranges were merely iterables, rather than sequences, you could form a range with only first
, last
, and a next()
function, which would be a lot simpler. But of course plain iterables are much less useful.)
[@luolong] Now, I'm curious -- why does a Range
need to be a Sequence
? Why isn't just Iterable
enough?
[@gavinking] Well that's equivalent to asking "why have sequences at all?". If sequences are useful, then more efficient sequences are useful.
[@luolong] No, not quite the same question.
I was asking essentially, "What use case requires Range
to be a sequence"?
But you are right of course -- making Range
(and Segment
) a sequence will allow it to be used wherever a sequence is expected, enabling a whole range of possibilities (pun fully intended)
[@luolong] A question though.
You have this signature in your Enumerable
interface:
shared formal Other? neighbour(Other other, Integer offset);
What is the Other other
doing there?
[@tombentley] I'm going to make the same point at I have for a while now, just in terms of your new abstractions, so please forgive me banging on about the same stuff...
shared interface Enumerable<out Other> of Other
satisfies Comparable<Other>
given Other satisfies Enumerable<Other> {
shared formal Integer offset(Other other);
shared formal Other neighbour(Integer offset);
}
So as you alluded to in your "UPDATE:" offset()
implies a total order between Enumerable
s of the same type. It follows that we wouldn't be able to have Range
s of DayOfWeek
. I'm not sure how well this copes with a type like Priority
either, because neighbour()
is supposed to return an Other
for any arbitrary Integer
, so how is an Enumerable
type with fewer inhabitants than Integer
supposed to work? It seems the only thing is can do is to wrap around, which isn't always a suitable semantic.
So I tend to prefer the one where neighbour()
can return null. However, how will we construct a Range when start.offset(end)
returns null?
And it seems to me that this cannot handle a Range<Whole>
.
[@gavinking] So I have implemented the proposed changes to Enumerable
and Range
that allow them to abstract over:
- "recursive" enumerable types like
DayOfWeek
, and - high-precision integral types like
Whole
.
This solution, in contrast to other options we considered, has the advantage that it accommodates a wide range of enumerable types, without closing off the possibility to make Sequence
an enumerated type, as proposed in #5076.
The approach basically worked out, though the implementation of Range
is surprisingly hairy.
If everyone agrees that this is the way to go, then the remaining tasks are:
- #5362
- tiny update to the spec
- tests for ranges over recursive enumerated types
[@tombentley] I wonder if we could write the termination condition for the iterator()
slightly differently, because while what you currently have certainly works, the different behaviour for recursive and non-recurvsive Range
s makes loops harder to optimize.
It seems to me that we want to continue the iteration until current.offset(last) == 0
and then do one more (because a Range includes its endpoint). We could do that like this:
object iterator
satisfies Iterator<Element> {
variable value current = first;
variable value latch = 1;
shared actual Element|Finished next() {
if ((latch > 0 && current.offset(last) != 0) || latch-- > 0) {
value result = current;
current = outer.next(current);
return result;
}
else {
return finished;
}
}
string => outer.string + ".iterator()";
}
This condition would work for both recursive and non-recursive Range
s and would permit an optimized loop something like this:
Integer last = range.getLast();
int latch = 1;
for (Integer curr = range.getFirst();
(latch > 0 && curr.offset(last) != 0) || latch-- > 0;
curr = range.getIncreasing() ? curr.getSuccessor() : curr.getPredecessor()) {
// ...
}
[@gavinking] I guess we could do that. It does open up more possibility of the loop never terminating for a badly-behaving Enumerable
, but I guess that's no big deal...
[@tombentley] With the iterator()
implemented as above, an unoptimized range loop (i.e. using Range's iterator directly) runs faster (13484.267541ms before cf 11850.233843ms after). The optimized version runs about the same, maybe slightly slower (9696.06184ms before cf 9965.891885ms after), and is still an optimization, though not as noticeable since it's now only ~10% faster.
[@gavinking] OK, well that's certainly something.
[@tombentley] If I write the Range.iterator()
slightly differently again it appears I can make it run a little faster still, which means the optimized loop is now only about 5% faster than the normal iterator loop. At that point I start thinking that that particular optimization is not worth the complexity in the compiler. WDYT?
[@gavinking] 5% is not much. But do we know for sure that 5% is an upper bound?
[@tombentley] You mean you want numbers for different types of element?
[@gavinking] It was more of a rhetorical qn.
[@gavinking] Closing, after updating spec and opening #5369.