maps: remove Keys and Values for Go 1.21
rsc opened this issue Β· 57 comments
Given #61405 and the likelihood of additional operations on those kinds of iterator functions (like Map, Filter, and Reduce), it seems possible that we would want Keys and Values to return iterator functions that can be used with those and do not provoke O(N) storage. I imagine there would also be a slices.Collect that collects values from any iterator into a slice, so existing uses of maps.Keys(m) that still want a slice would become slices.Collect(maps.Keys(m)).
If there's a good chance we're going to want Keys and Values to return iterators in Go 1.22, it seems short-sighted to ship them in Go 1.21 returning slices. I wonder whether we should pull them. It is late in the cycle, but not too late. git grep says there are no mentions of maps.Keys or maps.Values in the standard library.
I'm not clear on the value of using iterators for those methods rather than the normal range loops that exist today. We don't have map, filter, or reduce that work on iterators. Whatever package provides them can also trivially provide a function to adapt any rangeable value to an iterator.
s/Slices/Values/g?
We have relevant experience with something like maps.Keys
and its rather unpleasant one, because most of the time you don't need all keys: you likely don't need 90% of them, since the result of maps.Keys
in most cases is going into something like slices.Filter
with very specific filter pattern.
And the reason it's unpleasant is because you never know how many keys are in the map now - each time there is chance of unbounded memory growth, that could not happened during original code review, but things had changed since then and now instead of 10 keys you constantly filtering from 10000 keys.
But thats our experience anyway.
Instead of slices.Collect(maps.Keys(m))
, what if we keep maps.Keys
but do something like slices.Collect(maps.KeyIter(m))
. This would make the signature reads more clear that Keys
returns an slice, but KeyIter
returns an iterator?
@jimmyfrasche Yes thank you. Changed Keys and Slices -> Keys and Values.
I'm not clear on the value of using iterators for those methods rather than the normal range loops that exist today. We don't have map, filter, or reduce that work on iterators. Whatever package provides them can also trivially provide a function to adapt any rangeable value to an iterator.
That's true, but if the rangeable value is the current maps.Keys(m), then Keys builds the entire slice in memory even if you only just want to iterate through the keys and look at them. That's O(N) space that could be O(1) space instead, a large and potentially important difference for big maps. I don't think we should lock ourselves into the O(N) iteration strategy for map keys and map values.
@DmitriyMV: Yes, exactly, that's the situation I am concerned about. If Keys returned an iterator instead of a []K, you wouldn't have the memory or latency hit of collecting all the keys. You might still have the latency of filtering if you are going to go through all of them, but you avoid the memory hit with a filtered iterator. And if you stop early, you avoid the latency hit too.
@changkun Yes, if we keep Keys and Values as they are we will have to introduce some other iterator form for efficiency, maybe called KeysIter and ValuesIter. But I suspect most uses for Keys and Values work better with the iterator form (which would fix @DmitriyMV's example), meaning we should save the good names Keys and Values for them instead of giving them second-rate names.
instead of giving them second-rate names
Why is the Iter
suffix a second-rate name? Values
vs ValuesIter
Comparably we already have artifact pairs like Handler
vs HandlerFunc
etc.
That's true, but if the rangeable value is the current maps.Keys(m), then Keys builds the entire slice in memory even if you only just want to iterate through the keys and look at them. That's O(N) space that could be O(1) space instead, a large and potentially important difference for big maps. I don't think we should lock ourselves into the O(N) iteration strategy for map keys and map values.
I don't see how we would be "locked into" an O(n) space iteration strategy for map keys or values. If I want O(1) space iteration over the keys of a map m
, I write for k := range m
. I don't see why I would ever use maps.Keys for that. When I use maps.Keys, it is to build a list of all the keys, usually followed by sorting the list. That is the most common purpose for which I use package maps, even. I didn't realize that is unusual.
Of course, I could still do this with a slices.Collect that accepts an iterator. And if that doesn't happen, maps.Keys and maps.Values can of course be added later. It just seems unfortunate to me to revise an already accepted proposal to remove the feature I use most frequently from its experimental version because of a language change that might happen and that might be followed by an API addition that can't even be proposed until that language change is decided.
meaning we should save the good names Keys and Values for them instead of giving them second-rate names.
Another way to provide a "first-class" name for composable map iterators would be to introduce a package iter
which might have iter.Map, iter.Keys, iter.Collect, &c. maps.Keys isn't the only option.
More general formulations of algorithms might want to parameterize on something like type Pair[K comparable, V any] struct{Key K, Value V}
, to preserve the key:value association. Also, there might be a type OrderedValue[V any] Pair[int, V]
for slices and OrderedPair[K comparable, V any] OrderedValue[Pair[K,V]]
for ordered map elements.
I'm quickly unsure if or when this is "bad idea" territory, but it's probably not in some languages - it's just that I definitely agree about not getting locked in.
@rsc I think we can flip this backwards: instead of talking about leaving the preferred name for iterators, think about how not using a suffix will cause confusion in the rest of the standard library. There are things all over that return slices, and with iterators, it could be valuable to add methods and functions that return them. This won't be the only place, so while we can use a suffix and have the new iterators be consistent, we cannot have the other way. Be consistent. Also, "ThingIter" sounds like a name of a thing iterator and "Things" sounds like I already have those things allocated.
Pretty much everything in the standard library that returns a slice today should return an iterator in a future version. Adding "Iter" suffixes may be OK given the history but it's not where we want to end up. A v2 of those packages might flip the default. Or maybe there are good names available that don't use Iter. Either way, maps and slices are brand new packages. We don't want to go straight to a v2 just to fix this detail. We can avoid making the mistake to begin with if we just leave them out for this release and give ourselves another cycle to get it right. Maybe we'll end up back where we are today, but maybe not. I suspect not.
I don't think it's entirely a mistake though. There are choices all over where we use slices as a return instead of some sort of iterator and have performance that is just fine. Anything that isn't CPU bound won't have a noticeable improvement either way.
However, at any given time, most software developers are fairly junior and the concept of an iterator is more difficult than the concept of a slice. Python has had generators and iterators for quite some time, but most of the time you see people using lists. It's totally unnecessary, but outside shared libraries, it rarely causes a difference. I don't think you'll stop seeing slices used a lot more often than iterators even if we one day have a Go 2 where they are the more common option.
That said, I personally haven't seen much indication that we would ever get to a point where such a big break would be worth it, so using words that are less specific in hopes that you can standardize on that eventually doesn't seem particularly helpful
I think iterators shouldn't be the default choice, in general. They should be something the programmer consciously opts in to when needed. It simply follows the principle of least surprise. It is intuitive and unsurprising that maps.Keys
and maps.Values
return slices, just as the names imply. I strongly agree with @zephyrtronium 's comment above:
If I want O(1) space iteration over the keys of a map m, I write for k := range m. I don't see why I would ever use maps.Keys for that.
PS: the title of this issue should be updated too: "remove Keys and Slices" -> "remove Keys and Values"
I think iterators shouldn't be the default choice, in general. They should be something the programmer consciously opts in to when needed.
Normally you would want the most costly and specific option to be consciously opted in and clearly returning a slice is more costly than returning an Iterator. having the full slice of elements upfront is usefull only when you are sure that you would need all of them and can justify the memory used. The more generic case is that you would need only some of the values and/or potentially exit early from the loop making Iterators better suited for the default choice.
You can collect the elements of the iterator in a slice if it's needed but while you still can create an iterator that loops over the slice, the "damage" of the memory use is already done and creating an iterator at that point is almost useless.
But I suspect most uses for Keys and Values work better with the iterator form
I donβt suspect that. Python 3 changed from returning a list to returning an iterator, and it was often a pain because I needed to wrap the keys method in a list conversion. For maps.Keys today, my most frequent operation is to take the keys and sort them, which requires it being a slice. I think there should be an easy to use slice function. If itβs called KeySlice instead of Keys, thatβs fine, but it is so frequently used it really is a pain to have to add a collect step in.
If you need sorted map keys maybe you need a sorted map instead? Even without such a thing, if you need to collect the keys in a slice once it's not a lot of code (with or without stdlib changes) and if you need it more than once you can throw that in a function.
if you need it more than once you can throw that in a function.
And that function could be in the standard library. ;-)
I'm not opposed to functions that collects the keys/values in a slice, but I agree with @rsc that is less general than the iterator form and hence the short name should go to the iter form. For the specific use case you bring up, I wonder if a more generally useful way to expose that would be a sorting func that takes an iterator as input and returns a slice.
I agree that this is a release blocker and should be resolved definitively before 1.21 releases, but I also think this is a bit of retreading previous decisions. Iterators were discussed as part of the planning for the maps package. In #47330 (reply in thread) @ianlancetaylor wrote,
A normal map iterator will be over key/value pairs. I don't think that an iterator over just keys or just values will be the common case. While I can imagine adding them (if we ever have iterators), I don't think we need to reserve better names for them.
Did something change to make us rethink that?
Reading the keys of a map to obtain a sorted slice of keys is a use case that I have encountered several times. Without the slices and maps package, I would have written
i := 0
s := make([]T, len(m))
for k := range m {
s[i] = k
i++
}
sort.Slice(s, func(i, j int) { return s[i] < s[j] })
With the slices and maps packages, you can write
s := maps.Keys(m)
slices.Sort(s)
but with the change of this proposal I suggest introducing a slices.Sorted
function
s := slices.Sorted(maps.Keys(m))
declared as
Sorted[T comparable](iter func(func(T) bool) bool) []T
This would be a good improvement, using only the standard library, compared to the current situation.
A process comment: we say to people that they should write code against release candidates and deploy them to production. At this point these functions have been in RCs for about a month, and I think it's plausible that people have uses functions in their codebase that will break when they update from the last RC to the final release. Are we concerned about the risk of breakage?
The current version of maps.Keys is impl by me. I think https://go-review.googlesource.com/c/go/+/481435 and https://go-review.googlesource.com/c/go/+/481436. I do not know the detail impl of iters. but I think the current version of maps.Keys/Values could be faster that the Iter version. I think performance is also an important consideration.
As many have already mentioned, the main use of maps.Keys is for sorting.
In this case, what we need is not an iterator, but a slice to be sorted.
(The method proposed by @gazerro does not allow us to make([]T, len(m))
before sorting, which would be inefficient.)
@eihigh the Sorted
function could be declared to get an initial cap for the slice
s := slices.Sorted(maps.Keys(m), len(m))
so it would be declared as
Sorted[T comparable](iter func(func(T) bool) bool, n int) []T
For an iterator whose length you don't know in advance, just use 0.
@eihigh the
Sorted
function could be declared to get an initial cap for the slices := slices.Sorted(maps.Keys(m), len(m))
so it would be declared as
Sorted[T comparable](iter func(func(T) bool) bool, n int) []T
For an iterator whose length you don't know in advance, just use 0.
Sorted usually need random access to slice, so I think event use
Sorted[T comparable](iter func(func(T) bool) bool, n int) []T
we still need to get a key slice in Sorted Internal impls.
As many have already mentioned, the main use of maps.Keys is for sorting.
That's a pretty big assumption that many are making. I see this all the time in many new proposed features that people mistake their own use for the common use.
we still need to get a key slice in Sorted Internal impls.
This trivial implementation should be sufficient
func Sorted[T comparable](iter func(func(T) bool) bool, n int) []T {
s := make([]T, 0, n)
for k := range iter {
s = append(s, k)
}
sort.Slice(s, func(i, j int) { return s[i] < s[j] })
return s
}
I see this all the time in many new proposed features that people mistake their own use for the common use.
Okay, but the first page of Github search results has 20 items of which 10 can be seen sorting the keys in the snippet window.
The other takeaway from the search is that we should add maps.Stringify
and bring back the container/sets proposal.
This trivial implementation should be sufficient
A nice thing about a slice of keys is you can use slices.SortFunc
to sort it backwards, etc.
To me the question to resolve here should not be whether to have a maps.Keys
at all, but whether to give it a short name or a long name. There's just no world in which maps.Keys
doesn't pull it own weight. It's an extremely common use, and if we leave it out entirely, people are just going to keep using golang.org/x/exp/maps instead of std/maps.
I'm okay with something maps.KeysSlice
and maps.ValuesSlice
since they are pretty explicit about what you will get. But I think that we should preserve maps.Keys
until iterators arrive, since that is what most people will use.
I'm okay with something
maps.KeysSlice
andmaps.ValuesSlice
since they are pretty explicit about what you will get. But I think that we should preservemaps.Keys
until iterators arrive, since that is what most people will use.
I'm fine with that, but I would call it maps.KeySlice
and maps.ValueSlice
(no s).
Instead of adding specific functions to the maps
package, we could extend the append
and copy
builtins to iterators.
s := make([]T, 0, len(m))
s = append(s, maps.Keys(m)...)
s := make([]T, len(m))
copy(s, maps.Keys(m))
It sounds like there is enough uncertainty that we should pull these from Go 1.21.
The code there is worth preserving; we can just s/Keys/keys/ and s/Values/values/
in the implementation and not delete any code.
Change https://go.dev/cl/513476 mentions this issue: maps: remove Keys and Values
I don't think there is uncertainty about the need for the functions; just uncertainty about the names. It seems unfortunate to have fewer functions in the standard library version than the exp version, leading people to stay on exp for an extra six months.
No change in consensus, so accepted. π
This issue now tracks the work of implementing the proposal.
β rsc for the proposal review group
We started using exp/maps at work some months ago and Keys
and Values
are the only functions we're using so far. I'm sure we'll use more of the API later on as more code is written and as we get more acquainted with the package but I do think that Keys
and Values
represent the bulk of the "expected value" of this package. I don't really see the point of including the maps package in Go 1.21 without these functions.
I tend to agree with @cespare, if there's not enough time to work out Keys/Values before 1.21, perhaps maps should just be held back entirely for the cycle.
I tend to agree with @cespare, if there's not enough time to work out Keys/Values before 1.21, perhaps maps should just be held back entirely for the cycle.
The other functions are useful without these two, I don't see the hard connection. Early adopters who have used x/exp/maps can continue to do so until these functions or alternatives are added.
For me, I already have PRs open called "Upgrade to Go 1.21" that are made against the RCs that use maps.Keys. It would be more work to take those and have them use std/maps in some places and exp/maps in others than to just back that out and use exp/maps everywhere, but keep the slices/constraints changes.
Apologies to everyone who is disappointed. I want to use these in Go 1.21 too. But I also want to use them in future versions, and I want them to fit into those versions well. In the grand scheme of things, having to wait another 6 months to import "maps" instead of "golang.org/x/exp/maps" will only hurt for 6 months, and really not very much. But if we want something different and land this now, it will hurt for much longer.
I tend to agree with @cespare, if there's not enough time to work out Keys/Values before 1.21, perhaps maps should just be held back entirely for the cycle.
The other functions are useful without these two, I don't see the hard connection. Early adopters who have used x/exp/maps can continue to do so until these functions or alternatives are added.
The main issue is namespace collision: if I type maps.
in my editor, gopls will presumably prefer suggestions from the standard library over an external module, especially if I haven't yet added x/exp to my go.mod. If I happen to use maps.Clone before maps.Keys, then I might have to fix my import lines manually. But I agree that it probably won't cause much pain overall.
The concrete thing that confuses me is that it seems like there's general agreement that a (the?) major use of these functions is in the vein of "give me all of the keys of a map so I can sort them and use that slice", e.g. for printing, stable iteration, etc.
I agree that if the goal is to compose multiple iterators together, that you'd want to have a wrapper function that can iterate over the keys or values of a map, such that it can be fed into some other algorithm (filtering and whatnot). But, the previously linked code search seems to imply that people tend to want "the sorted list of keys" to immediately use, and aren't so much applying other transforms.
Of course, if it is iterators that gets added instead, people would write:
keys := slices.Collect(maps.Keys(m))
slices.Sort(keys)
But, I'd worry that this exact code is going to be the majority of uses of maps.Keys
, and it doesn't benefit from having used iterators at all.
Well, TIL I can't just read the weekly review notes summary to be involved, because the one from today is the first mention of this and it's already accepted.
Not to get too off-topic, but I'm hopeful that there'll be a pair of functions for collecting and sorting in one run. Something like
package iters
func AppendSort[T cmp.Ordered](dst []T, iter Iter[T]) []T
func AppendSortFunc[T any](dst []T, iter Iter[T], less func(v1, v2 T) bool) []T
By using a binary search to find where to insert each element as it goes and allowing preallocation, it should be possible to make it pretty efficient, I think. Then things like
keys := maps.Keys(m)
positiveKeys := iters.Filter(keys, func(v int) bool { return v >= 0 })
sortedPositiveKeys := iters.AppendSort(make([]int, 0, len(m)), positiveKeys)
become possible.
@rsc I 100% agree with everything you wrote in #61538 (comment). Waiting an extra 6 months is no big deal. exp/maps works perfectly well for now.
What is slightly annoying is adding the maps package to the stdlib without its most useful functions and creating 6 months of bifurcation because we can't do a one-time switch from exp/maps to maps. At work I expect confusion from users who are now getting accustomed to exp/maps when maps lands without these functions. It's one thing to say "here's maps: it's like exp/maps except Keys
and Values
are now iterators." It's another to have to explain "here's maps: don't bother using it yet because the functions you need are still only in exp/maps."
Why not just wait for 1.22 to add maps?
Another reason to wait is what if we want maps.Copy to take an iterator for its second argument? Seems like a reasonable thing to have. It could be called Update or something, but if weβre holding back on naming, seems like we should be wholistic.
I'm sad to see that these 2 functions got removed in favour of a more complicated solution. What is the point to remove simple well working functions right before the release?
@cristaloleg
As I understand there hasn't been a decision on their final fate yet. They got removed for 1.21 to gain more time and think it through, because the release is imminent. They might get added back in a subsequent release, maybe under different names.
Change https://go.dev/cl/513715 mentions this issue: maps: remove Keys and Values
Reopening because CL 513715 needs to land too.
I opened #61613 to formalize the suggestion from @cespare and @carlmjohnson to hold back the entire package instead of only Keys and Values. I'm not sure how I feel about it personally, but the timeline is very tight, so I think it makes sense to have the discussion now if it will be had.
@perj Sorry for the rush on this proposal. It was rushed by the nearness of the release. Deciding not to do something is always safe, if disconcerting.
@zephyrtronium Nobody has seen any trouble with the other functions in the maps package, and they do have support in that they were successfully used in the x/exp/maps package. I don't see a strong reason to pull them from the 1.21 release.
CL 513715 is landed.
Common cases named map.Keys, maps.Values
for Iter named maps.KeyAll ,maps.ValueAll
That's true, but if the rangeable value is the current maps.Keys(m), then Keys builds the entire slice in memory even if you only just want to iterate through the keys and look at them. That's O(N) space that could be O(1) space instead, a large and potentially important difference for big maps. I don't think we should lock ourselves into the O(N) iteration strategy for map keys and map values.
I don't see how we would be "locked into" an O(n) space iteration strategy for map keys or values. If I want O(1) space iteration over the keys of a map
m
, I writefor k := range m
. I don't see why I would ever use maps.Keys for that. When I use maps.Keys, it is to build a list of all the keys, usually followed by sorting the list. That is the most common purpose for which I use package maps, even. I didn't realize that is unusual.Of course, I could still do this with a slices.Collect that accepts an iterator. And if that doesn't happen, maps.Keys and maps.Values can of course be added later. It just seems unfortunate to me to revise an already accepted proposal to remove the feature I use most frequently from its experimental version because of a language change that might happen and that might be followed by an API addition that can't even be proposed until that language change is decided.
Can't agree more, please add maps.Keys and maps.Values back to the standard library.
This issue is closed. See #61900 for the proposal to bring back maps.Keys and maps.Values as iterators.