golang/go

proposal: spec: add &T(v) to allocate variable of type T, set to v, and return address

chai2010 opened this issue · 40 comments

1. improve new func (by Albert Liu @ jpush)

func new(Type, value ...Type) *Type

2. support &Type(value) grammar

Examples:

px := new(int, 9527)
px := &int(9527)

Some discussion:
https://groups.google.com/d/msg/golang-nuts/I_nxdFuwAmE/jNObXNDy5bEJ
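
For comparison, here is a minimal sketch of how the same pointer has to be built in Go today, since neither new(int, 9527) nor &int(9527) compiles. The function names viaNew and viaTemp are illustrative only and are not part of the proposal.

```go
package main

import "fmt"

// viaNew allocates a zero int with new and then assigns the value.
func viaNew() *int {
	px := new(int)
	*px = 9527
	return px
}

// viaTemp declares a named temporary and takes its address.
func viaTemp() *int {
	x := 9527
	return &x
}

func main() {
	fmt.Println(*viaNew(), *viaTemp()) // 9527 9527
}
```

Both forms need two statements, which is the inconvenience the proposal is aimed at.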

Comment 1:

- What is the ... in the proposed signature of new good for? new returns a pointer to
only one value.
- Having both &T{} and &T() do the same thing would be surprising at minimum.
- Allocating structs is common, allocating non-struct types is not.

Comment 2:

#1
1. Values are optional, so we need the ... form:
px := new(int)
px := new(int, 123)
px := new([]int, 1, 2, 3)
px := new([]int, x...)
2. &T{} does not support &int{}.
3. Please see the discussion.

Comment 3:

"px := new([]int, 1, 2, 3)"
Ah, so it's meant to support even slices? But using a different syntax than a slice
literal has ({[key:] value, ...})? But without the possibility to set len and cap? How
is it supposed to handle maps? `pm := new(map[t]u, 1, 2, 3)`? What is the key and what is
the value? Or do map types, as an exception, not qualify as a 'Type'? Etc.
I think this all shows how much of a bad idea this proposal is.

Comment 4:

For map:
pmap := new(map[string]int, map[string]int{
    "A": 1,
    "B": 2,
})
For map slice:
pmaps := new([]map[string]int,
    map[string]int{
        "A": 1,
        "B": 2,
    },
    map[string]int{
        "A": 1,
        "B": 2,
    },
)

Comment 5:

So for slices a list of values is used (#2)
        px := new([]int, 1, 2, 3)
But for map types it uses a composite literal (#4)
        pmap := new(map[string]int, map[string]int{
            "A": 1,
            "B": 2,
        })
Which case is the norm and which is the exception? Why not write the slice case
analogously
        px := new([]int, {1, 2, 3}) // ?
It also supports the existing key: val thing
        px := new([]int, {1, 42: 2, 3})
IOW, we're back to "why the ..."?
If the proposal were accepted, which I hope is not going to happen, I think that it
would have to be
        new(T, optExpr) // 1 is a literal as is {1, 2, 3}, etc.
Where optExpr is optional, similarly to
        make(T, optExpr1, optExpr2) // [0]
BTW, please let's not forget - the best feature of Go is its lack of "features".
  [0]: http://golang.org/ref/spec#Making_slices_maps_and_channels

Comment 6:

Labels changed: added repo-main, release-none, languagechange, go2.

Comment 7:

#5
Sorry, I made a mistake. I only want these two forms of `new`:
    func new(Type) *Type
    func new(Type, value Type) *Type
and not this form of `new`:
    func new([]Type, values ...Type) *[]Type
because it would allow confusing code like this:
    px := new([]int, []int{1})
    px := new([]int, 1) // like new([]int, 1, 2, 3)
Some examples:
    px := new(int)
    px := new(int, 123)
    px := new([]int)
    px := new([]int, []int{1, 2, 3})
    px := new(map[string]int)
    px := new(map[string]int, map[string]int{
        "A": 1,
        "B": 2,
        "C": 3,
    })
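
For the slice and map cases, Go already accepts the address of a composite literal, so a two-argument new would add nothing there. A short sketch of what compiles today (the function names are illustrative, not from the proposal):

```go
package main

import "fmt"

// slicePtr returns a pointer to a freshly allocated slice value.
func slicePtr() *[]int {
	return &[]int{1, 2, 3}
}

// mapPtr returns a pointer to a freshly allocated map value.
func mapPtr() *map[string]int {
	return &map[string]int{
		"A": 1,
		"B": 2,
		"C": 3,
	}
}

func main() {
	fmt.Println(len(*slicePtr()), (*mapPtr())["B"]) // 3 2
}
```

Only non-composite types such as int or string lack a one-expression equivalent.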

Comment 8:

Here's a proposal which should be related to this issue.
I'd like you to review and make some comments.
https://docs.google.com/document/d/111YaXFZeJbJ9DhOF69CvvFV49YTkUKpIRKiS42woMak/edit?usp=sharing
rsc commented

See also #19966.

I don't see why we need both new(int, 5) and &int(5). It's true that today, if T is a composite type, we permit both new(T) and &T{}. The fact that we permit both means that essentially nobody ever writes new(T) for a composite type T. If we permit &int(5), then nobody will ever write new(int, 5). So, if anything, if we adopt &int(5), we should consider removing new entirely.

For this kind of thing it's interesting to consider the type []interface{}. With the syntax proposed here, &[]interface{}{nil} would return a pointer to a slice of one element whose value is nil, and &[]interface{}(nil) would return a pointer to a nil slice of type []interface{}. That in itself is a reason to prefer () here, while reserving {} for composite types.
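
A small sketch of the two meanings contrasted here. The first function compiles today; the second spells out what the proposed &[]interface{}(nil) would mean using a named temporary, because the proposed form itself does not compile. The function names are illustrative.

```go
package main

import "fmt"

// oneNilElement is valid today: a composite literal with a single nil element.
func oneNilElement() *[]interface{} {
	return &[]interface{}{nil}
}

// nilSlice emulates the proposed &[]interface{}(nil) with a temporary:
// a conversion of nil to the slice type, then the address of the copy.
func nilSlice() *[]interface{} {
	s := []interface{}(nil)
	return &s
}

func main() {
	a, b := oneNilElement(), nilSlice()
	fmt.Println(len(*a), *a == nil) // 1 false
	fmt.Println(len(*b), *b == nil) // 0 true
}
```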

I think the proposal here should be to add to the language the expression &T(v), for any type T, for any value v assignable to T. This expression will allocate a new variable of type T, set it to v, and return its address.

I like that.

tv42 commented

If &T(v) goes in, perhaps func foo() T should allow &foo() and not just &T(foo()), to get a *T.

@tv42 If I understand you correctly, that is not this proposal, it is #22647.

tv42 commented

@ianlancetaylor That issue seems to contain &foo(), yes. I was brought here mostly by the similarity in syntax between &T(v) and &foo(v); &"bar" is a bit more out there.

I think this is a good proposal (see an "experience report" for something similar at #22647), but I'd vote for the simpler &"foo" or &1234 syntax. That to me seems more obvious than the &T(v) syntax, which looks like type coercion or a function call.

The &"foo" style syntax also seems like a natural extension of the existing &T{...} syntax: you construct a thing, then you take the address of it. And my proposal is that it doesn't matter whether that thing is a struct (like now) or an int or string or something else.

This syntax is what I tried when learning Go, as I just presumed you could prefix an expression with & to take the address and the compiler would figure it out (Go's big on "let the compiler figure out whether something needs to be on the heap or the stack"). This is not just me: other people expect this to work too, because of the &T{...} precedent: see one, two, three, four.

The simpler syntax would also work for expressions, like (single-valued) function calls such as &time.Now(), as well as more general expressions like &(x + 1234) -- the latter would just have to be in parentheses for operator precedence reasons. That said, I think such general expressions would be rare, and in practice it would mostly be people taking the address of a constant or function return value.

&1234 would presumably have the type *int. Sometimes you need, say, int64. So &1234 is not sufficient; there needs to be a way to say: create a variable of type int64 and set it to 1234 and return the address. The proposed &T(v) syntax permits &int64(1234). So it seems to me that we need something like &T(v) regardless.

If we want to permit &v for any expression v, then we can do &int64(v), using a type conversion.

But &v for any expression has some difficulties. Logically it should be possible to take the address of an address expression, which gives us &&v. But that doesn't work because && is an operator with a different meaning.

More importantly, if v is a variable, then &var is quite different from &v where v is an expression other than a variable. &var takes the address of the singular variable var. If called in a loop, it resolves to the same value each time it is executed. &v for a non-variable v allocates a new instance each time, and as such if called in a loop resolves to a different value each time it is executed. That is a rather subtle distinction that seems likely to lead to confusion.

You say above that &"foo" is an extension of &T{...}, but I'm not sure it is. &T{...} is a special case where the type is always required, and, more importantly, which is explicitly defined to allocate a new value each time.
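
The loop behavior described above can be observed with the two forms that exist today. This is a sketch with an illustrative point type; &point{...} stands in for the "allocates each time" case.

```go
package main

import "fmt"

type point struct{ x int }

// sameVariable takes the address of one variable on every iteration,
// so every element of the result is the same pointer.
func sameVariable(n int) []*point {
	var v point
	ptrs := make([]*point, 0, n)
	for i := 0; i < n; i++ {
		ptrs = append(ptrs, &v)
	}
	return ptrs
}

// freshLiteral evaluates &point{...} on every iteration, which allocates
// a new variable each time, so the pointers are all distinct.
func freshLiteral(n int) []*point {
	ptrs := make([]*point, 0, n)
	for i := 0; i < n; i++ {
		ptrs = append(ptrs, &point{x: i})
	}
	return ptrs
}

func main() {
	s := sameVariable(2)
	f := freshLiteral(2)
	fmt.Println(s[0] == s[1]) // true
	fmt.Println(f[0] == f[1]) // false
}
```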

Thanks -- that's reasonable, and I concur that &T(v) solves some of those subtle issues. Though I don't think the &&v issue is really a problem, because it'd be really rare, and if you actually needed that you'd just use parens like &(&v).

Still, the &T(v) approach would mean that my original use case, &time.Now(), would be quite clunky: &time.Time(time.Now()). Does it matter that &var returns the same value each time, and &expr does not? We already have that distinction with &var and &T{}, right?

Yes, &var and &T{} act differently. This is clearly documented, and they also look different. (There was actually sentiment for a while to change the address-of-composite-literal syntax to be (*T){}, which would be more logical, but in the end we stuck with &T{}.) &var and &1 look a lot more similar, so it's more important to be aware of the fact that they behave quite differently.

I agree that &time.Time(time.Now()) looks clunky. That may be a good reason for us to not change anything here. All of this is just syntactic sugar. It has to be useful and it has to be clear.

It's reasonable that &1 is not enough, because we want it to be of a specific type and numeric constants are untyped in Go. But why not give the compiler more freedom to derive the meaning from the context?

  1. &"foo" - *string
  2. &time.Now() - *time.Time
  3. &1 - ambiguous. The compiler could report an error and you would have to use &int64(1) or something like that. But even in this case the compiler could use context to determine the exact type. And if you pass it to a function with an interface{} argument or create a variable with := you would still have to use the &T(v) syntax.

It seems to me that there's enough context to implement it properly. Just looking at the code you can easily tell which is which. Nothing is magical or surprising.

@creker Numeric constants may be untyped in Go, but when you assign an integer to a variable, it's always type int, like myInt := 1234. So to me it seems obvious that &1234 would mean, unambiguously, &int(1234).
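
Both positions rest on how Go already types constants: a short variable declaration picks the constant's default type, while a typed context gives the same constant another type. A sketch of today's behavior (typeName is an illustrative helper):

```go
package main

import "fmt"

// typeName reports the dynamic type of v as printed by the %T verb.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	myInt := 1234         // untyped integer constant defaults to int
	myFloat := 12.5       // untyped floating-point constant defaults to float64
	myRune := 'x'         // untyped rune constant defaults to rune (int32)
	var wide int64 = 1234 // the same constant takes int64 from the declaration

	// Prints: int float64 int32 int64
	fmt.Println(typeName(myInt), typeName(myFloat), typeName(myRune), typeName(wide))
}
```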

As much as I loathe C++ “uniform initialization”, it might actually be a good example here.

We could allow &{x} (with or without a type after the &) as a general shorthand for taking the address of an anonymous variable. It's visually distinct from taking the address of an ordinary variable or expression, but visually similar to taking the address of a struct literal.

Examples:

px := &{1234}        // px := new(int);       *px = 1234
px64 := &int64{1234} // px64 := new(int64);   *px64 = 1234
pt := &{time.Now()}  // pt := new(time.Time); *pt = time.Now()
pfoo := &{"foo"}     // etc.

In conjunction with #21496, the only special thing about struct literals would be that we do not duplicate the curly braces:

ps := &SomeStructType{"foo", "bar"}  // ps := new(SomeStructType); *ps = {"foo", "bar"}

Ironically, there was a golang-nuts thread about &(*x) just today.

I think that supports Ian's argument that the “address of copy” syntax needs to be visually distinct from “address of an arbitrary expression”.

cznic commented

I'm against this proposal, but should it be accepted, I'd be inclined to just relax the restriction disallowing T in &T{...} to be a simple type like int, string etc. It also solves the untyped constant resulting type problem. &int32{42} or &int64{42} or even &myString{"foo"} are pretty clear about it.

@bcmills I see now why we need distinct syntax but these {} examples look too much like struct initialization in C. It's just confusing why it looks like taking the address of a struct literal when, in fact, it's not struct literal at all.

cznic commented

It's just confusing why it looks like taking the address of a struct literal when, in fact, it's not struct literal at all.

From a certain point of view, it is. &int{42} can be seen as [a shortcut for] &(&struct{ i int }{42}).i, which works just fine today: https://play.golang.org/p/dsaYvDmfGAH. The same applies to other types, of course.
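
A runnable sketch of that struct-field trick, wrapped in an illustrative intPtr function. It compiles with current Go because a field selected through a pointer is addressable.

```go
package main

import "fmt"

// intPtr returns a pointer to an int holding 42 by taking the address
// of the only field of an anonymous struct literal.
func intPtr() *int {
	return &(&struct{ i int }{42}).i
}

func main() {
	p := intPtr()
	*p++
	fmt.Println(*p) // 43
}
```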

I did this enough to create a really simple (and maybe silly) package of helper functions:
https://godoc.org/github.com/mwielbut/pointy

It's not silly at all. It's painful enough that the protobuf package has those helper functions, too. I hope &v or &T(v) can be added to the spec (and possibly the new built-in removed).

https://godoc.org/github.com/golang/protobuf/proto

I think that supports Ian's argument that the “address of copy” syntax needs to be visually distinct from “address of an arbitrary expression”.

@bcmills can you elaborate? I didn't think there was any "address of copy" syntax.

@icholy, the syntax

&(*x)

today evaluates to the same value as x (https://play.golang.org/p/4GS5_Z9B3HN).

It would be confusing for

&(*x+1)

to suddenly have a dramatically different aliasing behavior — changing from allocating a new value to aliasing an existing one — simply because the +1 is added or removed from the expression.
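
A sketch of the aliasing difference in today's Go. alias uses the form that already compiles; copyPlusOne shows what a hypothetical &(*x+1) would have to do, written with an explicit temporary. Both function names are illustrative.

```go
package main

import "fmt"

// alias returns &(*x), which is the same pointer as x.
func alias(x *int) *int {
	return &(*x)
}

// copyPlusOne stores *x+1 in a new variable and returns its address,
// so the result never aliases x.
func copyPlusOne(x *int) *int {
	v := *x + 1
	return &v
}

func main() {
	n := 10
	x := &n
	fmt.Println(alias(x) == x)       // true
	fmt.Println(copyPlusOne(x) == x) // false
}
```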

This would make my life easier and my code cleaner. There are libraries that use *string, *uint64 etc extensively as "optional values" in structs and function arguments, and I always end up writing "helper functions" like this to be able to specify a literal:

func stringPtr(s string) *string {
    return &s
}

I came here from #37302, and this really is an annoyance in Go. Everyone is writing these small helper functions to transform a constant into a pointer, such as intPtr(v int) *int { return &v }, etc.

These functions get copied around everywhere; they are even in the Go standard library! Mostly they are used in testing, e.g. src/encoding/asn1/asn1_test.go, src/encoding/json/decode_test.go, etc., and the definition is often redundant and even named differently.

Instead of these functions, we could use this &type() syntax and get rid of these little helper functions everywhere, even in the standard library.

How do we move this proposal forward, either to decide for or against it? Specifically I'm referring to the &T(v) syntax -- I agree with @ianlancetaylor that new(T, v) is unnecessary. It seems to me that phrases like &int(42) are clear, unambiguous, and a lot of people would find this syntax useful to avoid an awkward multiline thing with a named temporary variable. This is evidenced by all the libraries (serialization libraries particularly) that have some form of IntPtr style functions in them.

As Ian summarized earlier:

I think the proposal here should be to add to the language the expression &T(v), for any type T, for any value v assignable to T. This expression will allocate a new variable of type T, set it to v, and return its address.

(Separately we could consider &function(...), e.g., &time.Now(), but that seems less generally useful and could be considered separately -- see also #22647.)

What's the best way to follow through with the &T(v) discussion, and come to agreement -- either to include this in a subsequent version of Go, or decide that it's not worth it? A formal proposal, with references to use cases / experience reports? A discussion on golang-dev?

I don't see any clear consensus in the discussion above. Although the emoji voting on &T(v) is good, there are quite a few comments suggesting other approaches.

It's also worth considering that the generics design draft permits writing

package addr

func P(type T)(v T) *T {
    return &v
}

which can then be used as

    p1 := addr.P(1) // p1 has type *int
    p2 := addr.P(int64(2)) // p2 has type *int64
    p3 := addr.P("hi") // p3 has type *string
    p4 := addr.P(time.Now()) // p4 has type *time.Time

This has some advantages, in that it doesn't require a new language feature (well, doesn't require a language feature other than generics), and it doesn't require writing the type when that is not needed.

So personally I would be inclined to wait until we have generics to see if an approach like that seems sufficient.
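
The draft syntax above predates the generics that shipped in Go 1.18, which write type parameters in square brackets. A sketch of the same helper in the released syntax follows. It is placed in package main rather than a separate addr package so that it runs standalone; the name P comes from the comment above.

```go
package main

import (
	"fmt"
	"time"
)

// P returns a pointer to a new variable holding a copy of v.
func P[T any](v T) *T {
	return &v
}

func main() {
	p1 := P(1)          // *int
	p2 := P(int64(2))   // *int64
	p3 := P("hi")       // *string
	p4 := P(time.Now()) // *time.Time
	fmt.Printf("%T %T %T %T\n", p1, p2, p3, p4)
}
```

As with &T{...}, every call yields a pointer to a distinct variable, because the parameter v is a fresh copy on each call.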

That's reasonable, thanks. I'm content to wait on that, assuming that generics design is going somewhere in the next couple of years. :-)

@ianlancetaylor, you prefer a third syntax to create pointers?

p := &v        // for any variable
p := &T{...}   // for composite types; consistent with &v
p := new(T)    // for any type, but mostly primitives
p := addr.P(v) // for any value, but mostly constants

That adds to the education/cognitive burden which is often raised re language proposals.

&T(v) is consistent with current syntax; that should count for a lot.

@griesemer any thoughts?

@networkimprov As @ianlancetaylor also mentioned earlier, no clear consensus has emerged yet. I agree that it would be nice to resolve this but there's no urgency. I'd be happy to wait for a truly compelling solution or a strong reason to move forward with one of the existing suggestions. As far as I can tell, this is not blocking anything.

And, just to be clear, adding a third syntax does not seem like a good plan. We want to make things simpler and clearer, not more complicated.

I want to float another potential approach here.

Given that taking the address of arbitrary expressions seems confusing based on the existing semantics, why don’t we reduce the scope and instead make two specific changes:

  1. Add function return values (when the function returns a single value) to the list of addressable things, which would allow for: &a(), or &time.Now()
  2. Add type conversion results to the list of addressable things, which would allow &int(1), &string("a").

I think this covers most of the use-cases while requiring minimal language spec changes, and without introducing new potentially ambiguous cases. I also think the meaning of the syntax is easy to understand for a reader as it re-uses the & operator to create a pointer. The main disadvantage is that it may lead people to assume that &1 should work without the conversion; this can be solved by updating the compilation error to say “cannot take the address of 1. To take the address use an explicit conversion: &int(1)”.

The main time that I want these operators to work is when I’m constructing an object literal and the object has pointer-valued fields (this happens mostly today when modeling SQL tables with NULL-able columns, but also in a variety of APIs that distinguish between the absence and presence of a variable). As a new Go programmer I used to create temporary variables, but I have since changed and have written a set of helper functions (similar to #38298).
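
A sketch of the two styles described here, for a struct with pointer-valued fields. The user type, its fields, the sample values, and the helper names are all hypothetical and only serve to illustrate the pattern.

```go
package main

import (
	"fmt"
	"time"
)

// user is a hypothetical row type; nil pointers stand for SQL NULL.
type user struct {
	Name      string
	Nickname  *string
	DeletedAt *time.Time
}

// deleted is an arbitrary fixed timestamp used by both constructors.
var deleted = time.Date(2020, time.January, 2, 0, 0, 0, 0, time.UTC)

// Helper functions of the kind described above, one per type.
func stringPtr(s string) *string     { return &s }
func timePtr(t time.Time) *time.Time { return &t }

// withTemporaries builds the row using named temporary variables.
func withTemporaries() user {
	nick := "gopher"
	d := deleted
	return user{Name: "Ada", Nickname: &nick, DeletedAt: &d}
}

// withHelpers builds the same row in a single literal.
func withHelpers() user {
	return user{Name: "Ada", Nickname: stringPtr("gopher"), DeletedAt: timePtr(deleted)}
}

func main() {
	a, b := withTemporaries(), withHelpers()
	fmt.Println(*a.Nickname == *b.Nickname, a.DeletedAt.Equal(*b.DeletedAt)) // true true
}
```

With the changes suggested in this comment, the literal could instead use &string("gopher") and the address of a function result directly, without either workaround.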

I decided against the following, because it seemed more complicated, but we could instead of 2. above, do: 2. extend the constant behavior so that & on a constant literal gives you a constant that has a default type of a pointer to the literal’s default type and which gets its definite type from the context in the same way constants do. That would allow i := &1 to set i to a *int; but also allow var x *float64 = &1. The main advantage of this in my mind is that &1 is shorter than &int(1), but the disadvantage is that you might begin to expect &(2*math.PI) to work, which it wouldn’t. &float64(2*math.PI) would work in the proposal above.

I also decided against adding a pointers package (like the other proposal) because this change would work for any type (importantly for me, time.Time) and would also help beginners who are confused about why they can’t take the address of a return value (me included). I also decided against trying to add another kind of optional syntax to Go: pointer values are a good conceptual match for optionals, and I don’t think we need more syntax for similar things. (It also seems like a Go 2 concern!)

I came here from #42690, for the same reason as @beoran's problem. I wrote a library that helps handle nil and non-nil values for a variable of a homogeneous type. However, it doesn't add significant benefit because implicit conversion is prohibited. Thus, I have to work around it with code like this:

myString := "Hello Gophers!"
myNullableString := nullable.NewString(&myString)

instead of passing the pointer directly as an argument, like this:

myNullableString := nullable.NewString(&string("Hello Gophers!"))

Speaking of simplicity, this approach is simpler than declaring a "home variable" first. It takes fewer lines and seems understandable for newcomers. Moreover, I don't see any syntax collision here with the AND operator (&).

These ideas have been taken up again in the new proposal #45624.

I'm going to close this issue in favor of #45624.