golang/go

proposal: Go 2: sum types using interface type lists

Closed this issue · 113 comments

This is a speculative issue for discussion about an aspect of the current generics design draft. This is not part of the design draft, but is instead a further language change we could make if the design draft winds up being adopted into the language.

The design draft describes adding type lists to interface types. In the design draft, an interface type with a type list may only be used as a type constraint. This proposal is to discuss removing that restriction.

We would permit interface types with type lists to be used just as any other interface type may be used. A value of type T implements an interface type I with a type list if

  1. the method set of T includes all of the methods in I (if any); and
  2. either T or the underlying type of T is identical to one of the types in the type list of I.

(The latter requirement is intentionally identical to the requirement in the design draft when a type list is used in a type constraint.)

For example, consider:

type MyInt int
type MyOtherInt int
type MyFloat float64
type I1 interface {
    type MyInt, MyFloat
}
type I2 interface {
    type int, float64
}

The types MyInt and MyFloat implement I1. The type MyOtherInt does not implement I1. All three types, MyInt, MyOtherInt, and MyFloat implement I2.
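The draft's `type ...` list syntax does not compile in released Go, but the distinction above can be approximated with the union syntax that shipped in Go 1.18. The following is a minimal sketch under that assumption: `I1` lists exact defined types, and `I2` uses `~` to stand in for the draft's underlying-type matching. The helper names `nameViaI1` and `nameViaI2` are hypothetical, and both interfaces can only be used as constraints here, which is exactly the restriction this proposal discusses lifting.

```go
package main

import (
	"fmt"
	"reflect"
)

type MyInt int
type MyOtherInt int
type MyFloat float64

// I1 lists exact defined types, like `type MyInt, MyFloat` in the draft.
type I1 interface{ MyInt | MyFloat }

// I2 uses ~ to accept any type whose underlying type is int or float64,
// which approximates `type int, float64` under the draft rule.
type I2 interface{ ~int | ~float64 }

// nameViaI1 only compiles for type arguments that satisfy I1.
func nameViaI1[T I1](v T) string { return reflect.TypeOf(v).Name() }

// nameViaI2 only compiles for type arguments that satisfy I2.
func nameViaI2[T I2](v T) string { return reflect.TypeOf(v).Name() }

func main() {
	fmt.Println(nameViaI1(MyInt(1)), nameViaI1(MyFloat(2)))
	fmt.Println(nameViaI2(MyInt(1)), nameViaI2(MyOtherInt(3)), nameViaI2(MyFloat(2)))
	// nameViaI1(MyOtherInt(3)) is rejected at compile time:
	// MyOtherInt does not satisfy I1.
}
```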

The rules permit an interface type with a type list to permit either exact types (by listing non-builtin defined types) or types with a particular structure (by listing builtin defined types or type literals). There would be no way to permit the type int without also permitting all defined types whose underlying type is int. While this may not be the ideal rule for a sum type, it is the right rule for a type constraint, and it seems like a good idea to use the same rule in both cases.

Edit: This paragraph is withdrawn. We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

In all other ways an interface type with a type list would act exactly like an interface type. There would be no support for using operators with values of the interface type, even though that is permitted when using such a type as a type constraint. This is because in generic code we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type list, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)
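The difference between the two situations can be sketched in today's Go. In the generic function both operands share one type `T`, so `+` is well defined. With two interface values the dynamic types must be inspected at run time and may not agree. `addDynamic` is a hypothetical helper written for illustration, and it uses `any` in place of an interface with a type list.

```go
package main

import "fmt"

// Sum works because both operands share the single type T.
func Sum[T ~int | ~float64](a, b T) T { return a + b }

// addDynamic mimics adding two interface values: it must inspect the
// dynamic types at run time and reports false when they differ.
func addDynamic(a, b any) (any, bool) {
	switch x := a.(type) {
	case int:
		if y, ok := b.(int); ok {
			return x + y, true
		}
	case float64:
		if y, ok := b.(float64); ok {
			return x + y, true
		}
	}
	return nil, false
}

func main() {
	fmt.Println(Sum(1, 2))
	fmt.Println(addDynamic(1, 2))
	fmt.Println(addDynamic(1, 2.5)) // int and float64: no common type
}
```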

In particular, the zero value of an interface type with a type list would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

As I said above, this is a speculative issue, opened here because it is an obvious extension of the generics design draft. In discussion here, please focus on the benefits and costs of this specific proposal. Discussion of sum types in general, or different proposals for sum types, should remain on #19412. Thanks.

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

I don't understand this. If all types in the type list appear as cases, the default case would never trigger, correct? Why require both?

Personally, I'm opposed to requiring to mention all types as cases. It makes it impossible to change the list. ISTM at least adding new types to a type-list should be possible. For example, if go/ast used these proposed sum types, we could never add new node-types, because doing so would break any third-party package using ast.Node. That seems counterproductive.

I think requiring a default case is a good idea, but I don't like requiring to mention all types as cases.

There is another related question. It is possible for such a sum-value to satisfy two or more cases simultaneously. Consider

type A int

type X interface {
    type A, int
}

func main() {
    var x X = A(0)
    switch x.(type) {
    case int: // matches, underlying type is int
    case A: // matches, type is A
    }
}

I assume that the rules are the same as for type-switches today, which is that the syntactically first case is matched? I do see some potential for confusion here, though.
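For reference, here is how the first-match rule plays out in today's Go, where overlapping cases can only arise through interface-typed cases. In this sketch `fmt.Stringer` stands in for the proposed underlying-type match; `classify` and the `String` method on `A` are assumptions added for illustration.

```go
package main

import "fmt"

type A int

func (A) String() string { return "A" }

// classify returns the label of the first case that matches v.
func classify(v any) string {
	switch v.(type) {
	case fmt.Stringer: // A implements fmt.Stringer, so A values stop here
		return "Stringer"
	case A: // also matches A values, but is never reached for them
		return "A"
	case int:
		return "int"
	}
	return "no match"
}

func main() {
	fmt.Println(classify(A(0)))
	fmt.Println(classify(0))
	fmt.Println(classify("x"))
}
```

Swapping the first two cases would make `classify(A(0))` return "A" instead, which is the ordering sensitivity the comment above worries about.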

[edited]

@Merovius It does say "...and if there are any types in the type list that do not appear as cases in the type switch." Specifically, there is no comma between "default case" and "and". Perhaps that is the cause for the confusion?

Regarding the multiple cases scenario: I think this would be possible, and it's not obvious (to me) what the right answer here would be. One could argue that since the actual type stored in x is A that perhaps that case takes precedence.

It does say "...and if there are any types in the type list that do not appear as cases in the type switch."

Makes sense. I can see how the language is a little ambiguous; the point is that it's a compile error only if both of those conditions hold.

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

It occurred to me that tooling could spot when a type switch branch was invalid, i.e. the interface type list only contains A and B and your switch checks for C, but it seems best to not make that a compiler error. A linter could warn about it, but being overly restrictive here might harm backwards-compatibility.

Regarding the multiple cases scenario: I think this would be possible, and it's not obvious (to me) what the right answer here would be. One could argue that since the actual type stored in x is A that perhaps that case takes precedence.

I think it makes the most sense for the type switch to behave consistently. It's not clear to me how the type switch would be any different except that the interface being switched on has a type list. You can know at compile-time what branches should be in the switch, but that's it.

Overall I'm in favor, I think the proposal is right on the money. They function like any other interface (no operators), and the zero value is nil. Simple, consistent, unifies semantics with the Generics proposal. 👍

@griesemer Ah, I think I understand now. I actually misparsed the sentence. So AIUI now, the proposal is to require either a default case or to mention all types, correct?

In that case, the proposal makes more sense to me and I'm no longer confused :) I still would prefer to require a default case, though, to get open sums. If it is even allowed to not have a default case, it's impossible to add new types to the type-list (I can't know if any of my reverse dependencies does that for one of my exported types, so if I don't want to break their compilation, I can't add new types). I understand that open sums seem less useful to people who want sum types, though (and I guess that's at the core of why I consider sum types to be less useful than many people think). But IMO open sums are more adherent to Go's general philosophy of large-scale engineering and the whole gradual repair mechanism - and also more useful for almost all use-cases I see sum types suggested for. But that's just my 2¢.

@Merovius Yes, your new reading is correct.

mvdan commented

In all other ways an interface type with a type list would act exactly like an interface type.
[...] So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

Could you clarify why nil should always be an option in such sum types? I understand this makes them more like a regular interface, but I'm not sure if that consistency benefit outweighs how it makes them less useful.

For example, they could be left out by default, or included by writing nil or untyped nil as one of the elements in the type list.

I understand that the zero value gets trickier if we remove the possibility of nil, which might be the reason behind always including nil. What do other languages do here? Do they simply not allow creating a "zero value" of a sum type?

mvdan commented

To add to my comment above - @rogpeppe's older proposal in #19412 (comment) does indeed make nil opt-in, and the zero value of the sum type becomes the zero value of the first listed type. I quite like that idea.

@mvdan as far as I'm aware other languages with sum types do not have the notion of a zero value and either require a constructor or leave it undefined. It's not ideal to have a nil value but getting something that works both as a type and a metatype for generics is worth the tradeoff, imo.

I guess (as a nit) nil should also be a required case if no default case is given, if we make nil a valid value.

So this https://go2goplay.golang.org/p/5L7T8G9rfLD would print "something else" under the current proposal, correct? The only way to get that value is reflect?

@jimmyfrasche Correct. This proposal doesn't change the way that type switches operate, except for the suggested error if there are omitted cases.
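The linked playground snippet is not reproduced here, so the following is only a sketch of the behavior being confirmed, written against today's Go with `any` in place of a type-list interface. A value of a defined type misses `case int` and lands in the default branch; recovering the underlying value takes `reflect`. The names `describe` and `underlyingInt` are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

type MyInt int

// describe uses an ordinary type switch, which matches exact types only.
func describe(v any) string {
	switch v.(type) {
	case int:
		return "int"
	default:
		return "something else"
	}
}

// underlyingInt uses reflect to read any value whose kind is int,
// regardless of the defined type that wraps it.
func underlyingInt(v any) (int64, bool) {
	rv := reflect.ValueOf(v)
	if rv.IsValid() && rv.Kind() == reflect.Int {
		return rv.Int(), true
	}
	return 0, false
}

func main() {
	fmt.Println(describe(7))
	fmt.Println(describe(MyInt(7)))
	fmt.Println(underlyingInt(MyInt(7)))
}
```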

So that means this code would panic? https://go2goplay.golang.org/p/vPC-qtKb7VO
That seems strange and as if it makes these sum types significantly less useful.

I'd like to reiterate my earlier suggestion: https://groups.google.com/g/golang-nuts/c/y-EzmJhW0q8/m/XICtS-Z8BwAJ

Not terribly excited about the new syntax.

That seems strange and as if it makes these sum types significantly less useful.

Well, it panics without the type list. But I get your point, the interface then allows a value for which the compiler won't enforce a branch in a type switch.

Could we perform implicit conversion to one of the listed types when you assign into the interface? The only case I can think of where that's weird is when the interface has methods that those underlying types don't have, i.e. you have type Foo interface{ type int; String() string }, so implicit conversion to int itself violates the interface.

While I really like the idea of unifying interface semantics by allowing type lists in interfaces used as values, rather than just as constraints, perhaps the two use cases are different enough that the interfaces you'd use for each vary significantly. Maybe this problem we're discussing isn't one that we'd encounter in real code? It might be time to break out some concrete examples.

Any explicit syntax would work. I just had to choose something semi-reasonable to write the idea down. At any rate, it wouldn't need to be used very often but having the choice lets everything work reasonably without either use being hindered by the existence of the other.

@Merovius Correct: that code would panic.

I think it's worth discussing whether that would in fact make these sum types significantly less useful. It's not obvious to me, because it's not obvious to me that sum types are often used to store values of types like int. I agree that if that is a common case, then this form of sum types is not all that useful, but when does that come up in practice, and why?

You could always work around it by using type LabeledInt int in the sum but that means having to create additional types. fwiw json.Token is a sum type in the standard library that takes bool and float64 and string.

@ianlancetaylor Point taken. I can't personally really provide any evidence or make any strong arguments, because I'm not convinced sum types in and of themselves are actually very useful :) I was trying to extrapolate. Either way, I also find it objectionable on an aesthetic level, to single out predeclared types in this way - but that's just subjective, of course.

neild commented

Regarding changing a type list being a breaking change: If the type list contains an unexported type, then the rule in @ianlancetaylor's proposal effectively requires that all type switches outside the package containing the sum type contain a default case.

For example,

package p
type mustIncludeDefaultCase struct{}
type MySum interface {
  type int, float64, mustIncludeDefaultCase
}

Regarding nil-ness of sum types: I find it strange that the proposed rules require type switches to exhaustively cover the possible types in the sum or include a default case, but don't require covering the nil case.

type T interface { type int16, int32 }
func main() {
  var x T

  // None of these cases will execute, because x is nil.
  switch x.(type) {
  case int16:
  case int32:
  } 
}
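The same fall-through can be observed with an ordinary interface in today's Go, where an explicit `case nil` is the way to cover it. This is a small sketch that uses `any` in place of the proposed `T`; the helper `kind` is an assumption for illustration.

```go
package main

import "fmt"

// kind reports which case a type switch selects for v.
func kind(v any) string {
	switch v.(type) {
	case int16:
		return "int16"
	case int32:
		return "int32"
	case nil: // selected when the interface value itself is nil
		return "nil"
	}
	return "unhandled"
}

func main() {
	var x any // the zero value of an interface type is nil
	fmt.Println(kind(x))
	fmt.Println(kind(int16(1)))
}
```

Without the `case nil` clause, the nil value would reach the final return, which is the gap an exhaustiveness rule limited to the listed types would leave open.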

I personally would prefer a design in which the zero value of a sum is the zero value of the first type in the sum. It is easy to add an additional "nothing here" case when desired, but impossible to remove a mandatory nil case when not.

In general I'm in favour of this proposal, but I think there are some issues that need to be solved first.

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

If type switches aren't changed at all, then I don't see how this rule is useful. It feels like it's attempting to define that switches on type-list interfaces are complete, but it's clear that they can never be complete when the type list contains builtin types, because there are any number of other non-builtin types there could be.

In general, even putting the above aside, I don't think the requirement for a type switch statement to fully enumerate the cases or the requirement to have a default fits well with the rest of the language. It's common to just "fall off the bottom" of a switch statement if something doesn't match, and that seems just as apt to me for a type-list type switch as with any other switch or type switch. In general, the rule doesn't feel very "Go-like" to me.

What about type assertions involving type-list interfaces? Can I do this?

type I1 interface {
    type string, []byte
}
var x I1 = "hello"
y := x.(string)

If not, why not? If so, why is this so different from a type switch with a single case and no default branch?

What about this (type asserting to a type list interface)?

x := y.(I1)

If that works, presumably this could be used to test the underlying type of the dynamic type of an interface, which is something that's not easy to do even with reflect currently.

The rules permit an interface type with a type list to permit either exact types (by listing non-builtin defined types) or types with a particular structure (by listing builtin defined types or type literals). There would be no way to permit the type int without also permitting all defined types whose underlying type is int. While this may not be the ideal rule for a sum type, it is the right rule for a type constraint, and it seems like a good idea to use the same rule in both cases.

I understand why this rule is proposed - using the same rule in both cases is important for consistency and lack of surprises in the language. However, ISTM that this rule gives rise to almost all the discomfort I have with this proposal:

  • we can switch on all the types in the type list without having a guarantee of getting a match
  • there's a many-to-one correspondence between the types named in the interface type and the dynamic types that the interface can take on

If we don't allow an interface type with a type list to match underlying types too, then you end up with surprises with assignments in generic functions. For example, this wouldn't be allowed, because F might be instantiated with a type that isn't int64 or int:

type I interface {
    type int64, int
}

func F[T I](x T) I {
    return x
}

How about working around this by adding the following restriction:

Only generic type parameter constraints can use type-list interfaces that contain builtin types.

So the above example would give a compile error because I, which contains a builtin type, is being used as a normal interface type.

The above restriction would make almost all the issues go away, I think - albeit at the cost of some generality.

There would be no support for using operators with values of the interface type, even though that is permitted when using such a type as a type constraint. This is because in generic code we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type list, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)

Allowing operators is only a problem for binary operators AFAICS. One thing that might be interesting to allow is operators that do not involve more than one instance of the type. For example, given:

type StringOrBytes interface {
     type string, []byte
}

I don't think that there would be any technical problem with allowing:

var s StringOrBytes = "hello"
s1 := s[2:4]
n := len(s)
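For comparison, single-operand operations of this kind are already accepted on type parameters in released Go. The sketch below assumes Go 1.18+ union syntax and shows `len` and indexing on a constraint of string and byte-slice types. It says nothing about how such operations would be implemented for interface values, and `Len` and `ByteAt` are hypothetical helpers.

```go
package main

import "fmt"

// StringOrBytes is written with Go 1.18+ union syntax, so it can only be
// used as a constraint, not as the type of a variable.
type StringOrBytes interface{ ~string | ~[]byte }

// Len is valid because len is defined for every type in the type set.
func Len[T StringOrBytes](s T) int { return len(s) }

// ByteAt is valid because every type in the type set has element type byte.
func ByteAt[T StringOrBytes](s T, i int) byte { return s[i] }

func main() {
	fmt.Println(Len("hello"), Len([]byte("hi")))
	fmt.Println(ByteAt("hello", 1), ByteAt([]byte("hello"), 1))
}
```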

In particular, the zero value of an interface type with a type list would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

I think that always having nil as a possibility is a bit of a shame and I don't think it's absolutely necessary as @mvdan pointed out, but I'd still support this proposal even with nil, for the record.

As far as I can tell, this proposal nearly parallels the defined-sum interface types in my previous writeup.

There is one key difference that I would like to explore: assignability. To me, assignability is what makes interface types coherent at all: it is what causes interface types to have regular subtyping relationships, which other Go types lack.

This proposal does not mention assignability at all. That seems like an oversight. In particular:

  • If all of the types in the sum type S are defined (not underlying) types, and all of those types implement the methods of an ordinary interface type I, should a variable of type S be assignable to I?

  • If all of the types in the sum type S1 are also in the sum type S2, should a variable of type S1 be assignable to S2? (This is especially important when I is itself a type-list interface: it seems clear to me that a variable of type interface { int8, int16 } should be assignable to a variable of type interface { int8, int16, int32 }.)

  • Should a variable of any sum type be assignable to interface{}?

@mvdan: for me, the assignability properties are what lead to the conclusion that all sum types should admit nil values. It would be strange for the zero-value of type interface { int8, int16 } to be different from the zero-value of type interface{} if the former is assignable to the latter, and it would be even stranger for the zero-value of type interface { *A, *B } to be assignable to an interface implemented by both *A and *B but to have a non-nil zero-value.

I believe that this proposal is compatible with (in the sense that it would not preclude later addition of) the sum interface types from my previous writeup, which I think are a closer fit to what most users who request "sum types" have in mind.

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

This constraint seems like a mistake to me. Due to the underlying-type expansion, even a switch that enumerates all of the types in the type switch could be non-exhaustive.

I think that either the default constraint should be dropped, or a default case should also be required for any type-list interface that includes any type that could be an underlying type (that is, any literal composite type and any predeclared primitive type).

Would unary operators be defined on the types if possible? ie:

type Int interface {
    type int, int32, int64
}
func Neg(i Int) Int {
    return -i
}

I would assume not since unary operators are defined to expand to binary operators, however the unary operators are valid to use since the other operand is untyped. Although this could result in unexpected behavior for sum types with uint types included.
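As a point of comparison, unary minus is accepted on a type parameter in released Go, and the unsigned case behaves as hinted above. This sketch uses a generic function rather than the interface-valued `Neg` from the question, and it adds `~uint8` to the constraint purely to show the wraparound.

```go
package main

import "fmt"

// Neg negates a value of any type in the constraint. For unsigned types
// the result wraps around, because -x is defined as 0 - x.
func Neg[T ~int | ~int32 | ~int64 | ~uint8](i T) T { return -i }

func main() {
	fmt.Println(Neg(5))
	fmt.Println(Neg(int64(-3)))
	fmt.Println(Neg(uint8(1))) // wraps around to 255
}
```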

Finally, I would like to note that this proposal on its own would expose an (existing) inconsistency in the generics design draft: the semantics of a type-list interface used as a constraint would be markedly different from the semantics of any other interface type used as a type constraint.

In particular, unlike other interface types, a type-list interface would not satisfy itself as a type constraint. If it did, then the type constraint interface { int8, int16 } would not be sufficient to allow the use of mathematical operators on the constrained type, because the type interface { int8, int16 } itself does not have defined mathematical operators.

(See https://github.com/bcmills/go2go/blob/master/typelist.md#type-lists-are-not-coherent-with-interface-types for more detail.)

Clearly people do not like the type switch part of this proposal, so let's consider that to be withdrawn. It's not the important part.

@rogpeppe

How about working around this by adding the following restriction:

Only generic type parameter constraints can use type-list interfaces that contain builtin types.

We could definitely do that, but I think it's an awkward requirement. The point of this proposal, as I see it, is to keep the rules of the design draft and extend them for sum types. I think that if we start tweaking the rules, we lose the benefit of this proposal, and it would probably be better to consider other approaches for sum types.

That is, I think it's fine if we say "this proposal doesn't give us what we want for sum types, so let's not adopt it." But I would argue that if we say "let's adopt this proposal but make interfaces-with-type-lists behave differently when used as type constraints and when used as ordinary types," then this proposal is no longer providing a benefit that is worth the cost. Better than that would be to adopt a different version of sum types that is not easily confused with type constraints. (Or, of course, not adopt sum types at all.)

@bcmills

If all of the types in the sum type S are defined (not underlying) types, and all of those types implement the methods of an ordinary interface type I, should a variable of type S be assignable to I?

We could certainly discuss that possibility, but, in the current proposal, no.

If all of the types in the sum type S1 are also in the sum type S2, should a variable of type S1 be assignable to S2? (This is especially important when I is itself a type-list interface: it seems clear to me that a variable of type interface { int8, int16 } should be assignable to a variable of type interface { int8, int16, int32 }.)

Yes.

Should a variable of any sum type be assignable to interface{}?

Yes.

(The other interesting cases are what happens with embedding, which is outlined at https://go.googlesource.com/proposal/+/refs/heads/master/design/go2draft-type-parameters.md#type-lists-in-embedded-constraints)

@bcmills

I believe that this proposal is compatible with (in the sense that it would not preclude later addition of) the sum interface types from my previous writeup, which I think are a closer fit to what most users who request "sum types" have in mind.

I think that idea is fine, but I want to make clear that in my opinion, if we think we are going to adopt that idea, then we should not adopt this one. We don't need both. Saying that type lists in interface types are only permitted in type constraints is definitely a wart, but I think that wart would be better than having two similar but slightly different ways of defining sum types.

I tend to dislike the idea of having both type-lists for constraints and a separate mechanism for sum-types. That is, I agree that if we add sum-types, they should re-use the mechanism of type-lists for constraints. And if we don't like the semantics that gives us for sum-types, it might be worth it to reconsider the mechanics of type-lists for constraints as well? I know that this isn't an attractive idea, though, because the generics proposal has been discussed at length already.

It's not too late to change the generics design draft, if anybody has any specific suggestions that are clearly better than what we are doing now.

(The current semantics of type lists were in fact chosen to make this proposal possible, but that doesn't mean that there aren't better semantics.)

I don't think that there will be one rule that can satisfy both uses. Type lists could always use identical types when used as sum types but then there are two rules. If there's an explicit syntax to annotate which kind of matching to use for items in a type list, you could satisfy both uses. Individual interfaces might not always make sense as both a sum type and a constraint but that's fine and will likely be true in practice regardless.

@Merovius, type constraints are necessarily concerned with the space of allowed operations, whereas sum-types are concerned with the space of allowed values. The two are related but, especially given the existence of binary operations, I think they cannot be unified.

(Or, to put it another way: many of the use cases for sum types require a sum type to be a finite set, whereas many of the use cases for generics require the constraint to match an infinite set.)

It's not too late to change the generics design draft, if anybody has any specific suggestions that are clearly better than what we are doing now.

If I were to suggest anything, it would be to avoid the "underlying type matching" semantics completely from type list interfaces. I think they're the source of a lot of the harder problems with the proposals, and I'm not convinced they really provide that much added value.

@rogpeppe Interesting point. I will add that we could always add the "underlying type matching" rule later (at least for constraints) if it turned out to be important. But we couldn't take it away later.

On the other hand, consider the case of a generic min function: It would be a bit sad if we couldn't use it to compute the minimum of say two temperatures; e.g., defined as type Kelvin float32. More generally, anytime people defined a special type to express a unit, this would break down.
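The unit-type case can be sketched with the `~` syntax that later shipped in Go 1.18, which differs from the type lists in the draft under discussion. The `Min` function here is an illustrative assumption, not code from the design draft.

```go
package main

import "fmt"

// Kelvin is a defined type used to express a unit.
type Kelvin float32

// Min accepts any type whose underlying type is float32 or float64,
// so defined unit types such as Kelvin are allowed.
func Min[T ~float32 | ~float64](a, b T) T {
	if b < a {
		return b
	}
	return a
}

func main() {
	fmt.Println(Min(Kelvin(300), Kelvin(273.15)))
	fmt.Println(Min(2.5, 1.5))
}
```

With exact matching only (no `~`), the call with `Kelvin` arguments would be rejected at compile time, which is the breakage described above.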

If the underlying type matching rule is introduced after sum types that would mean they follow different rules. I can't imagine it would be backwards compatible to change the matching rules for sum types like it would be for generics.

You could introduce an annotation for types in a type list that should follow the "underlying type matching" rule later but then the constraints, slices, etc. packages could face backwards compatibility issues trying to integrate those.

type constraints are necessarily concerned with the space of allowed operations, whereas sum-types are concerned with the space of allowed values. The two are related but, especially given the existence of binary operations, I think they cannot be unified.

In #27605, the primary syntax proposal was to use operator (T + T) AdderFuncName to define operator functions. I'm not advocating for operator functions here, but I think this syntax (or a similar one) would also be good to define operators in interfaces. For instance:

type Ordered[T] interface {
    operator(T < T)
}

This could be a good way to define constraints on operators, and then the interface { type t1, t2, ... } or something similar could be used for sum types and use exact type matching.

The constraint in practice (with type parameter inference) could look something like:

func Min[T Ordered](slice []T) T { ... }

Regarding changing a type list being a breaking change: If the type list contains an unexported type, then the rule in @ianlancetaylor's proposal effectively requires that all type switches outside the package containing the sum type contain a default case.

Makes sense to me.

Clearly people do not like the type switch part of this proposal, so let's consider that to be withdrawn. It's not the important part.

Without this, I'm not sure what the benefit is of these sum types. The proposal then boils down to "you can use an interface with a type list outside of type constraints". That's fine and improves consistency within the language, but if we don't even attempt to give the developer tools to ensure he's handling the appropriate set of values, I don't know that we should call them sum types.

If there's an explicit syntax to annotate which kind of matching to use for items in a type list, you could satisfy both uses.

With all the above discussion highlighting the tradeoffs and problems, I think I've come around to like at least the spirit of this from @jimmyfrasche and his earlier suggestion. If type switches had a way to match on "all types with the underlying type of X" then we could guarantee the completeness of a switch even when a type list contains predeclared types. Something like this, which is basically an inversion of the syntax @jimmyfrasche proposed:

switch v.(type) {
case float32, float64: // exact matches
case string...: // matches the first type in the type list with an underlying type of string
case string: // exact match
}

Note that it's backwards-compatible, and could also perhaps be used for type assertions.

@deanveloper The problem is not that we couldn't introduce operators in constraints, the problem is that operators alone don't address all problems. We would also need to invent notation for which conversions are permitted, and notation to express permissible constant values: how would we specify that one can assign a string or an integer constant to a value of type parameter type? How do we express that we need to be able to assign values of up to 1234 to a variable of type parameter type? Etc. Type lists elegantly solve this problem, which is why we eventually zoomed in on them.

Looking from the opposite point of view, if we already had (somehow sensibly defined) sum types of sorts, how would they be different from interface types? Would they be sufficiently different from interfaces further constrained by type lists?

@griesemer they could be like discriminated unions: structs that only allow one field to be set at a time. That's very different. In a vacuum I'd prefer that, but if type lists can be reused that wins out.

@tooolbox the difference between an interface with a type list and without is additional compile time safety and tools will be able to read the type list and tell you if you missed a case (you can do this now but you need some way outside Go to say to check the interface and what types are permissible).

@tooolbox the difference between an interface with a type list and without is additional compile time safety and tools will be able to read the type list and tell you if you missed a case (you can do this now but you need some way outside Go to say to check the interface and what types are permissible).

I think you misunderstood; I understand that and agree with that. It seems to me that read(ing) the type list and tell(ing) you if you missed a case would be done in a type switch, and @ianlancetaylor seemed to be withdrawing any effort to achieve safety/completeness in that case. I took that to mean that type lists would then only be enforced when assigning a value to an interface, which is fine, but it seems like any sum types worthy of the name could offer some kind of completeness SLA at the site of "pattern matching" a.k.a. disambiguation a.k.a. type switching.

Stated another way, I'm supporting your earlier suggestion for a syntax to differentiate between matching exact types and matching underlying types in type switches, type assertions, etc. I know my initial reaction was negative, but it seems like a good way to make these sum types useful, while keeping unified semantics between the two different uses of interfaces, and preserving the flexibility we get by allowing "underlying type matching" for type lists.

@jimmyfrasche How much is a discriminated union different from an interface constrained with a type list? I suspect a straightforward implementation would be the same for both (sum types would be represented like interfaces internally). We'd expect in both cases that we can do type switches and asserts. Maybe discriminated unions wouldn't have a nil value, and perhaps the underlying rule would be gone. Is there more?

(I am not saying that these two differences aren't crucial - perhaps they are - I'm just trying to understand if there's something else.)

Note that even if the language does not check that all possible types appear as a type switch case, it would be straightforward for a static checker to do so.

@griesemer I gave a thorough description in the other thread at #19412 (comment) which references #19412 (comment) and there is some good discussion surrounding those posts that github has decided to fold. I don't want to derail this thread with a counterproposal, but there are some significant differences between that and an interface-backed solution.

I like this proposal because it will solve several problems in Go in a very practical way.

Namely, it matches current best practice well: to make a sum type now, we define an interface and a limited list of types that implement that interface.

I have an example in my scripting language MUESLI (https://gitlab.com/beoran/muesli) that would benefit from this proposal.

Consider this function, which converts a muesli FloatValue into built-in Go types:

func (from FloatValue) Convert(to interface{}) error {
	switch toPtr := to.(type) {
	case *string:
		*toPtr = from.String()
	case *int8:
		*toPtr = int8(from)
	case *int16:
		*toPtr = int16(from)
	case *int32:
		*toPtr = int32(from)
	case *int64:
		*toPtr = int64(from)
	case *int:
		*toPtr = int(from)
	case *bool:
		*toPtr = (from != 0)
	case *float32:
		*toPtr = float32(from)
	case *float64:
		*toPtr = float64(from)
	case *FloatValue:
		*toPtr = from
	case *Value:
		*toPtr = from
	default:
		return NewErrorValuef("Cannot convert FloatValue value %v to %v", from, to)
	}
	return nil
}

I could get rid of interface{} here and change this to interface { type *string, *int8, *int16, *int32, *int64, *int, *bool, *float32, *float64, *FloatValue, *Value }, to make it clearer to the caller that only certain types are allowed. EDIT: I assume the compiler will then also type check the argument passed as to, and report an error if it isn't one of the listed types? The great thing about this proposal is that it allows me to get rid of many uses of interface{}, which are a constant source of problems, much like the void * pointer in C.
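For reference, a similar compile-time restriction can be sketched with the union constraints that later shipped in Go 1.18, at the cost of turning the method into a generic function (methods cannot declare type parameters). Target and the cut-down FloatValue below are assumptions made for illustration, not MUESLI's actual code, and only a few of the pointer types are listed.

```go
package main

import (
	"fmt"
	"strconv"
)

// FloatValue is a simplified stand-in for the scripting language's float type.
type FloatValue float64

// Target is a hypothetical constraint listing the accepted destination types.
type Target interface {
	*string | *int | *float64 | *bool
}

// Convert stores from into the variable that to points at.
// Callers can only pass the pointer types named in Target.
func Convert[T Target](from FloatValue, to T) {
	switch p := interface{}(to).(type) {
	case *string:
		*p = strconv.FormatFloat(float64(from), 'g', -1, 64)
	case *int:
		*p = int(from)
	case *float64:
		*p = float64(from)
	case *bool:
		*p = from != 0
	}
}

func main() {
	var s string
	var n int
	Convert(FloatValue(2.5), &s)
	Convert(FloatValue(2.5), &n)
	fmt.Println(s, n)
	// var u uint
	// Convert(FloatValue(1), &u) // rejected at compile time: *uint is not in Target
}
```

The type switch still has to go through an interface conversion, and nothing forces it to cover every member of Target, so the exhaustiveness question raised in this thread remains open in this form.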

It would be even better if there was an option to make the type switch above exhaustive, so the default case is not needed. Maybe we could take advantage of the range keyword as new syntax for exhaustive type switches, like this:

switch range toPtr := to.(type) {
// EDIT: or maybe like this:
switch toPtr := range to.(type) {

In which case the type switch must mention all types mentioned in the interface, as well as nil, but a default case is not needed.

@griesemer

wouldn't the memory layout of a sum type be totally different from an interface? Presumably, it would be like a C union, where the largest member defines the total memory of the type, plus the tag indicating which type is actually in the sum. Of course, that would not be a straight-forward implementation. In terms of usage however, it's probably not that different.

Presumably, it would be like a C union, where the largest member defines the total memory of the type, plus the tag indicating which type is actually in the sum.

I suspect you couldn't quite do that because then the GC would need to know about the tags, which would slow it down. You'd need to align the components so that pointers were in the same place. That still might be worth doing (consider a type list that mentions only types without pointers), and certainly the spec would want to leave the possibility open, even if the implementation was only "straight-forward" initially.

@griesemer

On the other hand, consider the case of a generic min function: It would be a bit sad if we couldn't use it to compute the minimum of say two temperatures; e.g., defined as type Kelvin float32. More generally, anytime people defined a special type to express a unit, this would break down.

That's true to an extent, but it wouldn't be hard to provide an interface that lets such a special type opt in to ordering.
For example: https://go2goplay.golang.org/p/axg6RrzLzbb

// Under can be implemented by a type to return itself
// as its underlying type.
type Under[T any] interface {
	Underlying() T
}

func MinU[T Under[U], U constraints.Ordered](a, b T) T {
	if a.Underlying() < b.Underlying() {
		return a
	}
	return b
}

We only have this problem with named types, and such a method can be added to a named type in a backward-compatible way. To me a workaround like this seems preferable to making the entire generics proposal significantly more complex.
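As a point of comparison, here is a self-contained version of the opt-in approach that compiles with released Go (1.18 or later), applied to the Kelvin example. The local Ordered constraint and the Kelvin method are assumptions for illustration. Type arguments are passed explicitly so the sketch does not depend on any particular inference rules.

```go
package main

import "fmt"

// Ordered is a small local stand-in for an ordered-types constraint.
type Ordered interface {
	~int | ~int64 | ~float32 | ~float64 | ~string
}

// Under can be implemented by a type to expose its value
// as some other (typically underlying) type.
type Under[T any] interface {
	Underlying() T
}

// Kelvin opts in to ordering by providing Underlying.
type Kelvin float32

func (k Kelvin) Underlying() float32 { return float32(k) }

// MinU compares two values through their Underlying results.
func MinU[T Under[U], U Ordered](a, b T) T {
	if a.Underlying() < b.Underlying() {
		return a
	}
	return b
}

func main() {
	fmt.Println(MinU[Kelvin, float32](Kelvin(300), Kelvin(273.15)))
}
```

Note that the released design also offers the ~ notation used in Ordered here, which is the other way of letting Kelvin participate without a method.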

Of course, this could also be used to allow the MinU function to work on types that would otherwise not be comparable.

A downside of this approach is that it doesn't work for arithmetic operations such as addition. It's possible to work around that too though: https://go2goplay.golang.org/p/FXJ_ZfgAE3Z.

Another issue is that type inference doesn't work, but I suspect that could be worked around by allowing slightly more sophisticated type inference rules.

It could also be argued that it's also cleaner for non-builtin types to have to opt into arithmetic operations (does it make sense to use Min on an arbitrary enum-like type?) rather than automatically satisfying all the built-in operators.

It could also be argued that it's also cleaner for non-builtin types to have to opt into arithmetic operations ….

Ooh, that's an excellent point! (Compare #30209, in which I would like to remove unchecked arithmetic operations from integer types.)

(I suppose you could also opt out of arithmetic operations by wrapping the type in a struct type, but if you do that then the type can no longer be initialized from constants, and can no longer be used for constants.)

It could also be argued that it's also cleaner for non-builtin types to have to opt into arithmetic operations ...

We wouldn't be able to do this without breaking backward-compatibility.

We wouldn't be able to do this without breaking backward-compatibility.

Yup. I really meant that if something purports to operate generically on any builtin number type, it doesn't necessarily follow that it should automatically be allowed to operate on non-builtin types too. Using arithmetic on a non-builtin directly feels somewhat different to me.

So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

What are the downsides of always having an implicit nil in the type list? It doesn’t seem too bad to have to check for nil case, that’s already what we have with Go interfaces. The suggestion of the sum type’s zero value being the zero value of the first type in the list also doesn’t sound bad, what are the downsides of choosing that instead?

@bokwoon95 theoretically it breaks the semigroup structure of the type system that would otherwise be introduced by sum types being added which would probably make some refactorings harder. Practically it would be like having to always add a default case to an enum: sometimes what you wanted anyway, sometimes very annoying.

The zero value in Go currently is always all bits zero, which makes it very easy to set in all situations. Since this proposal is backed by an interface value, having the initial value be the zero value of the first type in the type list would mean allocating that value and setting up the interface value instead of just setting everything to 0. That's probably trivial to do most of the time but I'm sure there are some weird cases where it would get incredibly tricky.

Since this proposal is backed by an interface value, having the initial value be the zero value of the first type in the type list would mean allocating that value and setting up the interface value instead of just setting everything to 0.

I don't think that's necessarily the case. Just because it says it's an interface type doesn't mean that the underlying representation needs to be like other interface types. There's no reason AFAICS that the representation couldn't use an integer tag to mark the type - the actual type information could be retrieved by indexing into a global array specific to the sum type. If you do that, then the zero value can still be all bits zero even when the type is non-nil.
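To make the tag idea concrete, here is a hand-written sketch of what such a representation might look like for a pointer-free type list. IntOrFloat, FromInt, FromFloat and Value are all hypothetical names; a compiler-generated layout could differ, and the per-type lookup table mentioned above is left out for brevity.

```go
package main

import (
	"fmt"
	"math"
)

// IntOrFloat is a hypothetical stand-in for interface { type int64, float64 }
// that stores a small tag instead of a pointer to type information.
type IntOrFloat struct {
	tag  uint8  // 0 means int64 (the first listed type), 1 means float64
	bits uint64 // payload shared by both alternatives
}

// FromInt wraps an int64.
func FromInt(i int64) IntOrFloat { return IntOrFloat{tag: 0, bits: uint64(i)} }

// FromFloat wraps a float64.
func FromFloat(f float64) IntOrFloat {
	return IntOrFloat{tag: 1, bits: math.Float64bits(f)}
}

// Value unpacks the payload into an ordinary interface value.
func (v IntOrFloat) Value() interface{} {
	if v.tag == 1 {
		return math.Float64frombits(v.bits)
	}
	return int64(v.bits)
}

func main() {
	var zero IntOrFloat // all bits zero, yet it already holds an int64
	fmt.Println(zero.Value(), FromInt(-7).Value(), FromFloat(1.5).Value())
}
```

The all-zero struct decodes as the int64 value 0, so the zero value has a dynamic type without any initialization work.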

I think it's a bad idea for interfaces with and without a type list to have any semantic differences (besides what types satisfy them). If they look like interfaces, they should be exactly interfaces in every way except that they can only store the listed types.

@rogpeppe The value also needs to be initialized though. How would you manage something like this, for example?

type A [2]int

type B struct{
    x int
    y *int
}

type X interface {
    type A, B
}

ISTM that for the GC, the value would need to be a pointer (like interfaces currently need to be), but then the zero-value would have to store a pointer to a valid [2]int that is zero. There might be a way out with re-ordering fields or whatever. And/or I can't see the forest for the trees right now. I guess the compiler could explicitly check for nil whenever reading the value and substitute an appropriate pointer in that case, but that seems inefficient.

I think it's worth discussing whether that would in fact make these sum types significantly less useful. It's not obvious to me, because it's not obvious to me that sum types are often used to store values of types like int. I agree that if that is a common case, then this form of sum types is not all that useful, but when does that come up in practice, and why?

I think this mainly benefits library authors who want to provide a better API for their users: for example there is a very useful pattern where you want to receive anything string-like from the user.

type StringLike interface {
    type string, fmt.Stringer
}
func gimmeAString(s StringLike) {
    switch s := s.(type) {
        case nil:
            fmt.Printf("your string is nil\n")
        case string:
            fmt.Printf("your string is %s\n", s)
        case fmt.Stringer:
            fmt.Printf("your string is %s\n", s.String())
    }
}

Right now there are only two workarounds for something like this: either pass in an interface{} (and the user gets to pass in anything), or write functions to handle each type (assuming that the type list is finite and manageable). I have a specific example of this in my jOOQ-like library: currently I write methods Eq, EqInt and EqFloat64 to handle multiple numeric types. I would really prefer the method signature of Eq to be func (f NumberField) Eq(s NumberLike) Predicate, where NumberLike is a type list of NumberField, int, int32, int64, float32, float64 (you can see I only included methods to handle int and floats because I felt anything else would bloat my API). This would very much bring it closer to what jOOQ offers with its .eq() method that accepts either Integer, Field<Integer> and I assume any other number-like type.

Edit: my example (that I struck out) is invalid; it would never have worked even with the acceptance of this sum types proposal, because Go's type parameters proposal does not allow type parameters on methods. The convenience of sum types will still apply to top-level functions, however.

@bokwoon95 I think even that example is surprising due to the underlying-type matching, though: that switch looks exhaustive, but isn't (https://play.golang.org/p/DpYtXT52Bct).
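The gap can be reproduced in current Go with a plain empty interface. The sketch below is an independent illustration rather than a copy of the linked playground snippet; MyString and classify are hypothetical names.

```go
package main

import "fmt"

// MyString has underlying type string but no String method.
type MyString string

// classify mirrors the three cases of the type switch in gimmeAString.
func classify(v interface{}) string {
	switch v.(type) {
	case nil:
		return "nil"
	case string:
		return "string"
	case fmt.Stringer:
		return "Stringer"
	}
	return "unmatched"
}

func main() {
	// Under the proposal's underlying-type rule MyString would be accepted by
	// StringLike, but none of the three cases above matches it.
	fmt.Println(classify("a"), classify(MyString("a")))
}
```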

I think it's a bad idea for interfaces with and without a type list to have any semantic differences (besides what types satisfy them). If they look like interfaces, they should be exactly interfaces in every way except that they can only store the listed types.

I was talking about implementation differences, not semantic differences. From a language p.o.v. they'd still be interfaces in every way except that they can only store the listed types (not including nil unless explicitly allowed).

@Merovius

type A [2]int

type B struct{
    x int
    y *int
}

type X interface {
    type A, B
}

For that example, the runtime representation of X could be:

[tag, int, int, *int]

i.e. the two possible component types could be partially unioned.
For types consisting entirely of non-pointers, all the values could fit into the same space.
For something like:

type X interface {
    type struct {A int; B *int}, struct {A *int; B int}
}

it's likely that the only possible representation would be to keep the values in separate spaces; for example:

 [tag, int, *int, *int, int]

@rogpeppe I'd consider that a semantic difference and a rather large one at that. If we're reusing interfaces we should keep it simple and reuse interfaces as-is. If we're going to introduce large differences, might as well make it a new kind of type, but then it's going to fall out of step with the generics proposal.

As I see it the options are

  1. do nothing, interfaces with type lists can only be used as constraints and perhaps have a separate kind of sum types later which would be an unfortunate overlap with type lists
  2. get rid of underlying-type-matching in the generics draft which removes a good deal of demonstrable utility from generics but then one rule can be used for generics and sum types
  3. have different rules for type lists when used as constraints vs as interfaces which is subtle and could be confusing
  4. introduce a notation for selecting how types in a type lists are matched. This is explicit but does mean that some interfaces with type lists wouldn't be very useful for sum types and some wouldn't be very useful as constraints, though that would probably be the case regardless.

2 and 4 would need to be settled before the generics can be released or it's too late to make that decision.

1-3 don't seem especially nice to me but 3 would be the most useful and wouldn't hold anything up.

I still think 4 is the way to go. Even if sum types aren't added, I'd rather have an explicit notation. The current rule is simple enough but it doesn't feel like a good place to be implicit.

@rogpeppe Okay. We could do the same thing for interfaces, but we aren't, so I wouldn't want to assume that. But either way - another nitpick against that implementation is that you could use reflect to create new types that match the interface - at least if we're keeping the underlying type matching. So you can't really use a static array of possible types. But, TBH, I don't super care either way. It's probably possible to invent ourselves around any implementation restrictions one way or another :)

I don't quite see why all these complications are needed. The original proposal is just to allow an interface with a type list everywhere. While that doesn't quite give us classic sum types, it does help a lot in getting rid of the empty interface in many places, and is consistent with generics. The fact that this interface may be nil is not really a problem, we have to nil check interfaces and pointers today also. Let's do the simplest thing that could work.

@jimmyfrasche

I'd consider that a semantic difference and a rather large one at that

I don't understand why you'd consider it that. The underlying representation of interfaces has always been undefined; it varies with the kind of interface type and has changed historically (for example, the empty interface doesn't have a method table, and non-pointer types used to be stored directly in the interface).

I don't see that it's changing anything significant to say that the compiler can optimise the underlying storage for type-list interfaces if it likes, which is eminently possible when you know exactly what kinds of values can be stored in the type.

It could still do that even if type-list interfaces always have a zero value of nil of course. I was just pointing out that there's no inherent need for the zero value to have no dynamic type.

@rogpeppe Okay. We could do the same thing for interfaces, but we aren't, so I wouldn't want to assume that.

As you point out yourself, that's not possible with normal interfaces because it's always possible to create a new type and assign it to that interface.

But we could do that with sum types, and that's actually a significant part of their potential attraction, I believe, although it probably wouldn't be done as the first implementation.

@beoran

I don't quite see why all these complications are needed.

I think it may well be useful to be able to let the zero value of a type-list interface be useful, and I don't see any technical reason why it would be hard to do so.

That said, I don't feel that strongly about whether it should be possible to preclude nil - I see both approaches as reasonable, and neither seems to me to be necessarily more or less efficient than the other.

Thanks for clarifying that.

In my withdrawn range type proposal, "the zero value is the value of the first member". In this case it would also have the type of the first member. With such a rule, this proposal would indeed be even more useful than with a nil zero value.

However, while I agree that a non-nil zero value would be useful, it is not orthogonal with how interfaces work now in Go. The zero value of an interface now is nil. It's more consistent and easier to learn if that stays that way.

I'm not concerned with the representation as long as there are no observable differences. If you had an interface with a type list that contained [1<<20]byte you'd probably want to pass values of that interface around with a pointer. That may not be a semantic difference, depending on definitions, but it's still an observable difference.

The difference I was referring to was the change in the zero value, however. That's arguably a superset of normal interfaces behavior and sensible but it's still behaving differently than an interface without a type list.

I like both the compact representation and the default zero value in the abstract, but I'm not sure they quite square with interfaces so they should probably be their own kind of type, but that means not unifying sum types and generic constraints.

@jimmyfrasche

If you had an interface with a type list that contained [1<<20]byte you'd probably want to pass values of that interface around with a pointer.

That's surely up to the compiler to decide? You could say that about structs too. ISTR there are other places in the runtime where similar kinds of decision are driven by object size. Many performance trade-offs are "observable" but that doesn't mean that they are unreasonable (the historical change to allocate when assigning an int to interface is a good example there).

The difference I was referring to was the change in the zero value, however. That's arguably a superset of normal interfaces behavior and sensible but it's still behaving differently than an interface without a type list.

Type list interfaces do behave differently from interfaces without type lists. That's already the case in the proposal and would continue to be the case were they to be adopted as sum types too. But what behaviour would be appropriate for this new type? Are there concrete cases where the lack of nil would be a surprise or a problem? Are there cases where the opposite is true?

I like both the compact representation and the default zero value in the abstract, but I'm not sure they quite square with interfaces so they should probably be their own kind of type

They are their own kind of type, exactly by virtue of the fact that they've got a type list inside, in my view.

Maybe using | is more readable and compact than type list.

Example 1:

type SignedInteger int | int8 | int16 | int32 | int64

vs

type SignedInteger interface {
    type int, int8, int16, int32, int64
}

Example 2:

type Cat struct {}
type Dog struct {}
type Lion struct {}

type Animal Cat | Dog | Lion

vs

type Cat struct {}
type Dog struct {}
type Lion struct {}

type Animal interface {
    type Cat, Dog, Lion
}

The examples above are about named types. Anonymous usage, shown in the next examples, is more convincing.

Example 3 functions:

func eat(animal  Cat | Dog | Lion) {}

vs

func eat(animal interface { type Cat, Dog, Lion } ) {}

Example 4 structs:

type Zoo struct {
    Animals []Cat|Dog|Lion
}

vs

type Zoo struct {
      Animals []interface { type Cat, Dog, Lion }
}

@ianlancetaylor

Does this proposal allow recursive sum type declaration?
Something like:

type Nil struct{}
type Leaf[type T] struct{ Value T }
type Branch[type T] struct {
  Value T
  L, R  BinaryTree[T]
}

type BinaryTree[type T] interface {
  type Nil, Leaf[T], Branch[T]
}

Would this, possibly with some extensions, theoretically allow for conditional JSON parsing, such as

type Location interface {
  type GPS, Approximate
}

type GPS struct {
  Type string `json:"type,const:gps"`
  Latitude float64 `json:"latitude"`
  Longitude float64 `json:"longitude"`
  Height float64 `json:"height"`
}

type Approximate struct {
  Type string `json:"type,const:approximate"`
  Country string `json:"country"`
  Region string `json:"region"`
}

// Elsewhere:

var loc Location
err := json.Unmarshal(data, &loc)
if err != nil {
  panic(err)
}
// loc is now either a GPS or an Approximate.

I've been wanting a feature like this ever since I discovered constant-based type variants in Typescript and a few other languages. This wouldn't be quite as clean as that as struct tags are kind of a bit of a hack for it, but I'd still use it.

Would this, possibly with some extensions, theoretically allow for conditional JSON parsing

You could implement the custom JSON unmarshalling logic yourself

func (loc *Location) UnmarshalJSON(data []byte) error {
    gps, approx := GPS{}, Approximate{}
    err := json.Unmarshal(data, &gps)
    if err != nil {
        return err
    }
    switch gps.Type {
    case "gps":
        *loc = gps
    case "approximate":
        err := json.Unmarshal(data, &approx)
        if err != nil {
            return err
        }
        *loc = approx
    }
    return nil
}

var loc Location
err := json.Unmarshal(data, &loc)
if err != nil {
  panic(err)
}

You can't implement methods on an interface. That won't compile. I could wrap the interface in a struct so that I can implement that method for it, but it'll be kind of clunky.

If I have complete control over the JSON, then I can also do it via a type field and a sub-object which I can unmarshal into a json.RawMessage and then manually unmarshal that based on the type field, but it's clunky and has to be manually redone for everything, not to mention not working at all with APIs that don't format their data that way.
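A rough sketch of the wrapper-struct workaround in current Go follows. It decodes the payload twice, once to read the discriminator and once into the concrete type. LocationBox, isLocation and parse are hypothetical names, and an ordinary interface with an unexported marker method stands in for the proposed type list.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Location is closed to other packages through its unexported method.
type Location interface{ isLocation() }

type GPS struct {
	Latitude  float64 `json:"latitude"`
	Longitude float64 `json:"longitude"`
	Height    float64 `json:"height"`
}

type Approximate struct {
	Country string `json:"country"`
	Region  string `json:"region"`
}

func (GPS) isLocation()         {}
func (Approximate) isLocation() {}

// LocationBox exists only so that UnmarshalJSON has a concrete receiver.
type LocationBox struct{ Value Location }

// UnmarshalJSON reads the "type" field first, then decodes the matching struct.
func (b *LocationBox) UnmarshalJSON(data []byte) error {
	var head struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(data, &head); err != nil {
		return err
	}
	switch head.Type {
	case "gps":
		var g GPS
		if err := json.Unmarshal(data, &g); err != nil {
			return err
		}
		b.Value = g
	case "approximate":
		var a Approximate
		if err := json.Unmarshal(data, &a); err != nil {
			return err
		}
		b.Value = a
	default:
		return fmt.Errorf("unknown location type %q", head.Type)
	}
	return nil
}

// parse is a small helper for the usage example.
func parse(data string) (Location, error) {
	var b LocationBox
	err := json.Unmarshal([]byte(data), &b)
	return b.Value, err
}

func main() {
	loc, err := parse(`{"type":"gps","latitude":1.5,"longitude":2.5,"height":3}`)
	fmt.Println(loc, err)
}
```

This is the clunky part being described: every such union needs its own wrapper and its own switch.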

I fear this is getting more than a bit off-topic, though, so I think we should probably end this conversation here. Sorry to everyone else for all the pings.

@DeedleFake Yes, I think that's a great use case for sum types, although the details would need to be worked out properly.

I really dislike the fact that builtin types are treated differently from user-defined types. That's counterintuitive and a possible source of errors. Instead of treating them specially in type lists, I would suggest using some kind of special syntax that means "any type that has the same underlying type as X". The reason is that nowhere else in the language do some type names mean "the concrete type X" while others mean "anything with X as its underlying type".

My suggestion would be to call it interface X, for the lack of a better fitting keyword name.

So this interface would be satisfied only by the concrete types int and uint:

type ConcreteInt interface {
    type int, uint
}

On the other hand, this interface would be satisfied by all types that have int or uint as their underlying type:

type AnyInt interface {
    type interface int, interface uint
}

This syntax should not only be allowed in type lists, but also in type switches:

func WhatIsThis(x AnyInt) {
    switch x := x.(type) {
    case int:
        fmt.Println("concrete int")
    case interface int:
        fmt.Println("underlying type is int, value is", int(x))
    case uint:
        fmt.Println("concrete uint")
    case interface uint:
        fmt.Println("underlying type is uint, value is", uint(x))
    case nil:
        fmt.Println("nil")
    }
}

The type of x in the second case would be interface int, which supports being converted to any type with underlying type int, but not any other operations. When you have a type MyInt defined as type MyInt int, the types interface int and interface MyInt would be completely equivalent.

Ideally, the syntax should be allowed anywhere where types are allowed, so you should be able to write this code:

func TypeAndValue(x interface int) (string, int) {
    return fmt.Sprintf("%T", x), int(x)
}

This would still panic for TypeAndValue(nil), but one could adapt it:

func TypeAndValue(x interface int) (string, int) {
    if x, ok := x.(interface int); ok {
        return fmt.Sprintf("%T", x), int(x)
    }
    return "nil", 0
}

With generics, there could also be situations in which such interfaces are suitable constraints, e.g.

func SortAnyFloat64s[T interface float64](x []T) {
    sort.Float64s([]float64(x))
}

I think that when type lists are used in a type constraint, people will usually want the underlying type matching for builtin types. So I think it would be unfortunate and error prone to require them to write interface int in that scenario.

I'm not sure what's unfortunate and error prone about having to be explicit. Even if an annotation is made or forgotten in error, it's easy to spot by inspection.

I think it's error prone because both type int and type interface int are accepted, and the difference between them is quite subtle.

It would be less error prone if something were required either way. Then people would have to think about what they mean.

Not annotating an entry makes it look like how types look everywhere else and match like how types are matched everywhere else. Annotating an entry makes it look and match differently.

Different annotations for each case seem redundant since one case is the "like everywhere else" case, but I'd be fine with that, I suppose.

Changing matching based on accident of provenance seems more subtle and error prone than either of the above to me and the most likely to trip up newcomers who haven't learned the rules and may not realize there is a rule until they trip over it.

There is no case today where a type list involves matching types. So while I understand the intuition, I don't agree that this is how types are matched everywhere else. They aren't matched anywhere else. We have operations like assignment and conversion, and those operations do not work exactly like any of the matching mechanisms suggested for type lists.

Also, note that if we use type lists at all, we are going to use them in type constraints before we use them as ordinary types. It's quite possible that we will use them in type constraints and never use them in ordinary types (that is, that we will accept a generics proposal but not accept this proposal). That would leave us saying type interface int for all time. That can't be right.

What if we change the 'mode' (concrete or underlying) of the type switch depending on the keyword passed to it?

switch s := s.(type) {
case int: // <-- matches only concrete ints
}

switch s := s.(interface) {
case int: // <-- matches types whose underlying type is also int
}

@bokwoon95, the keyword interface seems too subtle to me for that use-case. (interface to me does not say "underlying type".)

switch s.(underlying) seems reasonable, though, or switch underlying(s).

I forgot who suggested this, but someone had suggested much earlier that int... could mean "int and anything with underlying type int"

[Edit: Sorry, I messed up the formatting, should be right now]

I doubt that the main usage of type lists would be for constraints. Most users will probably just use the predefined constraints from the library, or combine them using interface embedding, such as here:

type OrderedNumber interface {
    constraints.Ordered
    constraints.Number
}

Like with any other interfaces, this should intersect the sets of types satisfied by those two interfaces. On the other hand, if users wanted the union of those types, so all types that satisfy constraints.Ordered or constraints.Number, they could still use type lists without any extra keyword, like this:

type OrderedOrNumber interface {
    type constraints.Ordered, constraints.Number
}

This should be the general behavior of all interfaces in type lists: they add all types to the type list that the interface satisfies. That's the logical meaning of an interface in a type list. I don't know if this usage of interfaces in type lists is already in the proposal, but it should at least be considered as a future addition.
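For what it's worth, the intersection and union readings can both be tried out with the constraint syntax that shipped in Go 1.18, where embedding intersects type sets and | unites them. The sketch below defines its own small constraints (all names are hypothetical) because released Go only allows method-free interfaces as union terms.

```go
package main

import "fmt"

type Signed interface{ ~int | ~int64 }
type Float interface{ ~float32 | ~float64 }
type Complex interface{ ~complex64 | ~complex128 }
type Text interface{ ~string }

// Number is a union: any signed, floating-point or complex type.
type Number interface{ Signed | Float | Complex }

// Ordered is a union: any type in this sketch that supports <.
type Ordered interface{ Signed | Float | Text }

// OrderedNumber is an intersection: only types in both Ordered and Number,
// which leaves the signed and floating-point types.
type OrderedNumber interface {
	Ordered
	Number
}

// Max compiles because the intersection excludes complex types,
// which do not support <.
func Max[T OrderedNumber](a, b T) T {
	if a < b {
		return b
	}
	return a
}

// Celsius shows that defined types are admitted through the ~ terms.
type Celsius float64

func main() {
	fmt.Println(Max(2, 5), Max(2.5, 1.0), Max(Celsius(1), Celsius(3)))
	// Max("a", "b") // rejected at compile time: string is not in Number
}
```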

This leads to a very simple rule of how type lists work:

  • Concrete type in type list: this exact concrete type is satisfied by the new interface
  • Interface in type list: all concrete types satisfied by this interface are satisfied by the new interface

Currently, builtin types break this rule. They are concrete types, yet including them in a type list doesn't just add a single concrete type, it adds the whole set of types that have the same underlying type.

The solution would be to have an interface type that is satisfied not based on the methods, but on the underlying type. That's where interface int comes in. It's explicitly a set of concrete types (like all interface types), not something that usually refers to a concrete type but changes its meaning in the context of a type list.

On the other hand, the current semantics are very error prone. Imagine that in a package, the author wants a function SetTitle to be able to accept either a normal string or any fmt.Stringer to produce the title. They would write it like this

type Stringer interface {
    type string, fmt.Stringer
}

var title string

func SetTitle(str Stringer) {
    switch str := str.(type) {
    case string:
        title = str
    case fmt.Stringer:
        title = str.String()
    }
}

Now, consider somebody using this package. They can call the function SetTitle not only with an argument of a type the package author intended, but also with any other type that has string as its underlying type. However, in that case, the function will panic. Even worse, there is not even a simple way for the package author to fix it. In the definition of the interface, there is no way to specify that only the builtin string type should satisfy the interface, not any other user-defined type that happens to have string as its underlying type. So the package author may decide to just allow all types that are backed by a string, but then they realize that this isn't really straightforward either, because there's no easy way to match for such a type in a type switch. They would have to introduce a new type like this:

type anyString interface {
    type string
}

Then, they would have to match on it, and the implementation would have to allow a type conversion in this case, using title = string(str). This is not very intuitive, especially because type conversions on interfaces are not common. Using a type assertion wouldn't work in this use case because type assertions also only match the exact concrete type, not the underlying one.
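To make the mismatch concrete, here is a minimal sketch in today's Go (1.18 or later, for `any`). It does not use the proposed type-list syntax. The names `Title`, `person`, `exactTitle` and `anyTitle` are hypothetical, and reflection stands in for the proposed "interface string" case, which is an assumption of this sketch rather than part of the proposal.

```go
package main

import (
	"fmt"
	"reflect"
)

// Title is a hypothetical user-defined type whose underlying type is string.
type Title string

// person is a hypothetical fmt.Stringer implementation.
type person struct{ first, last string }

func (p person) String() string { return p.first + " " + p.last }

// exactTitle mirrors the type switch in SetTitle: it matches only the exact
// type string or a value that implements fmt.Stringer.
func exactTitle(v any) (string, bool) {
	switch v := v.(type) {
	case string:
		return v, true
	case fmt.Stringer:
		return v.String(), true
	}
	return "", false // no case matched
}

// anyTitle additionally accepts every type whose underlying type is string.
// Reflection is used here in place of a dedicated type-switch case.
func anyTitle(v any) (string, bool) {
	if s, ok := exactTitle(v); ok {
		return s, true
	}
	if rv := reflect.ValueOf(v); rv.Kind() == reflect.String {
		return rv.String(), true
	}
	return "", false
}

func main() {
	fmt.Println(exactTitle("plain"))
	fmt.Println(exactTitle(person{"Ada", "Lovelace"}))
	fmt.Println(exactTitle(Title("draft"))) // matched by neither case
	fmt.Println(anyTitle(Title("draft")))
}
```

The reflective fallback is roughly what a package author has to write today to accept all string-backed types, which illustrates why a direct way to match on the underlying type in a type switch is being asked for.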

If the interface string syntax were adopted, the package author would have a choice:

  • Either only the builtin string type is allowed, and no user defined type (except the ones that satisfy fmt.Stringer). Then the syntax would be exactly like in the above example.
  • Or explicitly allow all types that are backed by a string. This means having to use interface string in both the type list and the type switch, and having to use an explicit type conversion, title = string(str). In this case the type conversion is less surprising because the type conversion operator is essentially the "interface method" of the interface string.

This really simplifies the language, and I would argue it also makes the language a bit more orthogonal. Currently, interfaces define a set of types by the common methods they implement. By adding this syntax, it would add a new dimension to the way interfaces can build a set of types, namely by having a common backing type, and having type conversion as a sort of "method" for this interface. This could even be used independent of type lists and constraints and sum types, just as regular interfaces. For example one could write this:

type StringerString interface {
    fmt.Stringer
    interface string
}

The type StringerString would only be satisfied by user defined types that are backed by a string and also have a String method. It could be used like this:

func InsideAndOutside(s StringerString) {
    fmt.Println("internal representation:", string(s), "pretty printed:", s.String())
}

Now, this isn't terribly useful or necessary by itself, but having the concept of an interface that is satisfied by all concrete types with the same underlying type helps bridge the gap between what we want for constraints in generics and what we want in sum types, and allows us to use the same interface notation for both.
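For comparison, the StringerString idea can be approximated as a type constraint in Go 1.18 or later, where `~string` plays the role of the suggested interface string. This is only a sketch: it works as a constraint on a type parameter, not as an ordinary interface type as suggested above, and `Color` is a hypothetical type introduced for illustration.

```go
package main

import "fmt"

// StringerString is satisfied by types whose underlying type is string
// and that also have a String method. It can only be used as a constraint.
type StringerString interface {
	~string
	fmt.Stringer
}

// Color is a hypothetical string-backed type with a String method.
type Color string

func (c Color) String() string { return "color(" + string(c) + ")" }

// InsideAndOutside uses both "methods" of the constraint: the conversion
// to string and the String method.
func InsideAndOutside[S StringerString](s S) string {
	return fmt.Sprintf("internal representation: %s pretty printed: %s", string(s), s.String())
}

func main() {
	fmt.Println(InsideAndOutside(Color("red")))
}
```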

@ianlancetaylor

There is no case today where a type list involves matching types. So while I understand the intuition, I don't agree that this is how types are matched everywhere else. They aren't matched anywhere else. We have operations like assignment and conversion, and those operations do not work exactly like any of matching mechanisms suggested for type lists.

That's certainly correct. What I mean is that given

type G interface {
  type A, B, C
}
func F[T G](x T) {}

where none of A, B, C are primitive, you can reason that T can be each of A, B, and C in turn by substitution, and that when T = A you can consider the signature of F to be F(x A), and so on for B and C. That's what I mean about it being "like everywhere else".

Just to drive it home, under the current draft and the rule in this proposal if, say, int is added to the type list of G when T = int you cannot consider the signature of F to be F(x int) since it also accepts the whole family of types whose underlying type is int, which is unlike anything else in the language today.
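The difference between exact and or-underlying matching can be sketched with the constraint syntax that Go 1.18 later adopted, where the two cases are spelled differently. The draft discussed in this thread has no such distinction, so treat the `~` notation and the names `Exact`, `Approx`, `SumExact` and `SumApprox` as assumptions of this illustration.

```go
package main

import "fmt"

// MyInt is a defined type whose underlying type is int.
type MyInt int

// Exact lists int and float64 without ~, so only those two types satisfy
// it and substitution reasoning (T = int gives F(x int)) holds.
type Exact interface{ int | float64 }

// Approx uses ~, so every type whose underlying type is int or float64
// satisfies it, including MyInt.
type Approx interface{ ~int | ~float64 }

func SumExact[T Exact](xs ...T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

func SumApprox[T Approx](xs ...T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fmt.Println(SumExact(1, 2, 3))
	fmt.Println(SumApprox(MyInt(1), MyInt(2)))
	// SumExact(MyInt(1), MyInt(2)) does not compile:
	// MyInt does not satisfy Exact.
}
```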

Also, note that if we use type lists at all, we are going to use them in type constraints before we use them as ordinary types. It's quite possible that we will use them in type constraints and never use them in ordinary types (that is, that we will accept a generics proposal but not accept this proposal). That would leave us saying type interface int for all time. That can't be right.

If we end up with generics but not this proposal, I'd still rather have an explicit annotation because (a) explicit 👍 and (b) I might want to write something that is int and only int. (b) is not very important but (a) is. There's surely more syntax to learn but it's more uniform and straightforward. The new idea has a new form. If you're someone learning Go generics you can spot the new case because it has a new syntax.

As long as everyone is bikeshedding syntax, let's say it's ~T instead of interface T. With a new syntax it can be expanded to work in type assertions and switches. Even if there's just generics that would let you write

type G interface {
  type A, ~B
}
func F[T G](x T) {
  switch any(x).(type) {
  case A:
  case ~B:
  }
}

and handle the case when T has underlying type B. This is less clear if there's annotations for both cases since one of them would be unnecessary for switches and assertions.

I doubt it would be of any use but it would also be possible to expand an explicit syntax later and allow things like

type G interface {
  type []~T, ~[]S, ~[]~U
}

I agree that it would be useful to be able to specify both in type switches and in type lists whether we want either the exact type or all types with the given underlying type. However, for ease of use with generics, I think the former case should be marked somehow and the latter not.

@ianlancetaylor

I think that when type lists are used in a type constraint, people will usually want the underlying type matching for builtin types.

That may or may not be true, but it seems clear to me that it's this particular aspect that is causing the most fundamental problems with the whole generics proposal. It's this behaviour which makes type switches on generic types problematic, and it's also this behaviour that makes this type-lists-as-sum-types proposal hard too.

It might be hard to let go of, but if we do, I think the language will be a lot cleaner, even if slightly less convenient, and some things become possible that are not currently.

As an alternative to allowing generics to work automatically on underlying types, we could extend the type conversion operator to allow conversion between values with the same underlying type, even when those types aren't actually at the top level.

For example, this code could be allowed because b has the same underlying type as byte:

 type b byte
 x := []b{1,2,3}
 y := []byte(x)

If we allow that (it was actually allowed in a previous version of the spec in fact), then a generic function specified on a type list type parameter can still be invoked for a named type, albeit at the cost of explicit type conversions for argument and/or return values.

@rogpeppe Thanks. I remain absolutely convinced that it is not acceptable to require type conversions for Min(x, y) if x and y happen to have values of type Celsius. It must be possible to write that in some way. Maybe we have the wrong approach to making that work, but it is not a matter of being hard to let go of. It absolutely must work.
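The Min requirement can be sketched as follows, again using the `~` constraint notation from Go 1.18 rather than the type lists of the draft. The `Number` constraint and the `Celsius` type are hypothetical names for this example. The point is that the call site needs no conversions.

```go
package main

import "fmt"

// Number admits any type whose underlying type is one of the listed types.
type Number interface {
	~int | ~int64 | ~float64
}

// Celsius is a defined type with underlying type float64.
type Celsius float64

// Min returns the smaller of x and y, preserving the argument type.
func Min[T Number](x, y T) T {
	if y < x {
		return y
	}
	return x
}

func main() {
	today, tomorrow := Celsius(21.5), Celsius(-3)
	coldest := Min(today, tomorrow) // coldest has type Celsius, no conversion needed
	fmt.Println(coldest)
}
```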

I still think explicit syntax is the way to go, but I'm warming up to the possibility of two rules

  1. in generic constraints, type lists always use or-underlying matching
  2. in sum types, type lists always use normal matching

It doesn't feel ideal to have two rules but as long as they're both simple maybe it's not too bad.

Sum types in Go are a really good idea.

Using a sum type is a good way to define different states using types.
But when it can be nil, a sum type always has an additional and maybe unnecessary state. Every time you define N variants, you get N+1 variants.
So,

type Result[T any] interface {
    type T, error
}

can be error, T or nil.
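The extra state already shows up with the interface-based emulation of sum types in today's Go. The sketch below uses a hypothetical closed interface `Result` with two variants, `Ok` and `Err`. The names are illustrative only, and the unexported marker method is a common emulation technique, not something from this proposal.

```go
package main

import "fmt"

// Result is closed to this package by its unexported marker method.
type Result interface{ isResult() }

type Ok struct{ Value int }
type Err struct{ Msg string }

func (Ok) isResult()  {}
func (Err) isResult() {}

// describe has to handle three states even though only two variants exist.
func describe(r Result) string {
	switch r := r.(type) {
	case Ok:
		return fmt.Sprintf("ok: %d", r.Value)
	case Err:
		return "error: " + r.Msg
	case nil:
		return "nil"
	}
	return "unknown variant"
}

func main() {
	var r Result // zero value of an interface type is nil
	fmt.Println(describe(r))
	fmt.Println(describe(Ok{Value: 7}))
	fmt.Println(describe(Err{Msg: "boom"}))
}
```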

I think it should not be nil-able; instead, it should not have a zero value at all.
We can add a check that a sum type is always initialized with some variant.

type SomeVariant struct{}
type SomeVariant2 struct{}

type SumType interface {
    type SomeVariant, SomeVariant2
}

func variable() {
    var s SumType // compiler error: s is uninitialized
    var s2 SumType = SomeVariant{} // OK
}

func result() (r SumType) { // compiler error: r is uninitialized
    return
}

func result2() (r SumType) { // OK, r is always initialized 
    r = SomeVariant{} 
    return
}

func result3() SumType { // OK, r is always initialized
    if condition() {
        return SomeVariant{}
    }

    return SomeVariant2{}  
}

type SomeType struct {
    S SumType
    value int
}

func insideStruct() SomeType {
    var s SomeType // compiler error: field S is uninitialized
    var p *SomeType // OK, if pointer is nil, then dereferencing or p.S will cause panic
    s1 := SomeType{value: 10} // compiler error: field S is uninitialized
    s2 := &SomeType{} // compiler error: field S is uninitialized
    s3 := SomeType{S: SomeVariant{}} // OK
    return s3
}

type Constr interface {
     type []byte, SomeType 
}

func insideTypeSum[T Constr]() T {
       var zero T // compiler error: T can be SomeType and is uninitialized
       return zero
}

Also, if we really need a nil-able sum type, we can allow a nil variant explicitly.
In that case it can't be used as a constraint.

type SumType interface {
    type nil, SomeVariant
}

And there is an interesting use case for this feature:

type box[T any] struct {
  value *T
}

func (b box[T]) Get() *T {
  return b.value
}

type NonNil[T any] interface {
  type box[T]
  Get() *T
}

func Take[T any](t T) NonNil[T] {
  return box[T]{value: &t}
}
  • NonNil can't have a zero value, so it must always be initialized using box[T]
  • box[T] can't be created from another package without Take[T]
  • Take[T] can't create a box[T] with value = nil

So there is no way to create a NonNil[T] value whose Get() *T method can return a nil pointer.

tv42 commented

@tdakkota

it should not have zero value

That would be a huge change to Go, definitely much more so than the proposal here. You have left out so many edge cases, e.g. reflect.Zero, arrays/slices, maps, new and so on. Your idea would be better served by its own proposal (if one doesn't already exist), where its feasibility can be discussed without adding noise to this much narrower proposal.

@tdakkota There is a lot of discussion of zero values and sum types over at #19412.

@tv42

That would be a huge change to Go, definitely much more so than the proposal here. You have left out so many edge cases, e.g. reflect.Zero, arrays/slices, maps, new and so on. Your idea would be better served by its own proposal (if one doesn't already exist), where its feasibility can be discussed without adding noise to this much narrower proposal.

I did not propose denying zero values for all types, or for the types inside a type list. I proposed denying the creation of a sum type with a nil type.
I don't think it's such a huge change.
In that case

type ByteString interface {
     type []byte, []rune, string
}

A variable of type ByteString can be []byte(nil), []rune(nil) or string(""),
but it always has a concrete type.

var slice []SumType // OK, there are no elements - no zero value type sum
var m map[string]SumType // OK, there are no elements - no zero value type sum
var channel chan SumType // OK, there are no elements - no zero value type sum

var array [2]SumType // compiler error: elements are uninitialized

new(SumType) is literally the same as &SumType{}, so it should cause a compiler error.

I am not sure how reflection would work in Go 2, but I think reflect.Zero should panic. Also, we can add a reflect.CanZero function to check whether a type can have a zero value.

I don't think it's such a huge change.

I'm afraid it would be a huge change. Zero values crop up in Go in many places.

var slice []SumType // OK, there are no elements - no zero value type sum

What about: slice = append(slice, s) where s is of type SumType?
Surely that must be OK. But then it might have a capacity larger than its length, so what
would slice[:cap(slice)][cap(slice)-1] return?

var m map[string]SumType // OK, there are no elements - no zero value type sum

What's the result of m[key] when key doesn't exist in m?

var c chan SumType // OK, there are no elements - no zero value type sum

What's the value of v in v, ok := <-c ?

You would have to forbid use of reflect.Zero(t) for any type t that contains a SumType.

In short, your suggestion is not viable. All types in Go must have a zero value.

In #19412, I suggested that the zero value could
be the first type in the list. I still think that's a viable approach rather than always including nil
as a possible value.
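The places listed above where a zero value appears without any explicit initialization can be checked with ordinary interface types in today's Go. In this sketch `fmt.Stringer` merely stands in for a sum type, and the helper names are hypothetical.

```go
package main

import "fmt"

// fromSlice reslices up to capacity and reads an element that was never set.
func fromSlice() fmt.Stringer {
	s := make([]fmt.Stringer, 0, 4)
	return s[:cap(s)][cap(s)-1]
}

// fromMap indexes a map with a key that is not present.
func fromMap() fmt.Stringer {
	m := map[string]fmt.Stringer{}
	return m["missing"]
}

// fromClosedChan receives from a closed channel.
func fromClosedChan() (fmt.Stringer, bool) {
	c := make(chan fmt.Stringer)
	close(c)
	v, ok := <-c
	return v, ok
}

func main() {
	fmt.Println(fromSlice() == nil, fromMap() == nil)
	v, ok := fromClosedChan()
	fmt.Println(v == nil, ok)
}
```

Each helper yields the zero value of the element type, so a sum type without a zero value would need a defined answer for all three operations.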

Got me wondering if nil should not be made a special type that belongs to every sumtype.

The "bottom" I think it's called in type theory. Everything I read in the proposal summary seems fine to me so far. Especially if we consider types as merely named interfaces or in Go, interfaces with a method that returns a unique name.
Sum types would just be a union of disjoint interfaces.
int would just be an embedding of type MyInt. (just from an abstract pov, a type being an interface around raw bits)

Issue is that people would not like having to deal with nil everywhere but it might be needed for function composition with sumtype arguments, amongst other things.
It's already the type of the nil value somehow, especially useful for interface comparison (if err!=nil).
Also, it allows for optional value parameters in functions (variable arity functions even) , a kind of "Maybe(x)". (or in struct fields but then, encoding will have to change, thinking about json and sql etc).
Nilable types would be a restricted kind of intersection types then between a value type and the nil type. Not sure of what it brings to the table but interesting nonetheless.

Also means that if we want sumtypes of interfaces, they have to have fully orthogonal method sets (only a union of disjoint sets of types should work, a traditional go interface being an intensional encoding of a set of types)

Also important to note that set union is commutative. So should be interfaces. The zero-value of a sumtype being the zero-value of the first or second type component would preclude that.

If we are going to use interface { ... } to designate sum types, then it should behave like an interface, and nil would be its zero value.

I'm not saying that we need to use nil as the zero value, however if we don't want to use nil as the zero value, then we should pick a different syntax from interface.

thwd commented

What we're calling "underlying" in this thread, really just refers to a type's kind. reflect.Kind speaks about memory representation of types. We should separate the two concepts; type and kind. Types define abstractions, kinds define memory representation (and operators).

Types in Go1 are invariant. MyInt and int are not assignable to each other and must be converted. If we introduce underlying-matching; we introduce a type hierarchy. Consider:

type MyInt int
type YourInt MyInt
type OurInt YourInt

Having int match all three of these and YourInt match two of these and MyInt match one of these induces a type hierarchy (int :> MyInt :> YourInt :> OurInt) while the rest of the language is strictly invariant.
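The distinction between type identity and kind for this chain of definitions can be observed with the reflect package in today's Go. This is only a sketch with hypothetical helpers `sameKind` and `sameType`. It shows that all three defined types report the same kind as int while remaining distinct types.

```go
package main

import (
	"fmt"
	"reflect"
)

type MyInt int
type YourInt MyInt
type OurInt YourInt

// sameKind reports whether a and b share a memory representation class.
func sameKind(a, b any) bool {
	return reflect.TypeOf(a).Kind() == reflect.TypeOf(b).Kind()
}

// sameType reports whether a and b have identical dynamic types.
func sameType(a, b any) bool {
	return reflect.TypeOf(a) == reflect.TypeOf(b)
}

func main() {
	for _, v := range []any{int(0), MyInt(0), YourInt(0), OurInt(0)} {
		t := reflect.TypeOf(v)
		fmt.Println(t.Name(), t.Kind())
	}
	fmt.Println(sameKind(OurInt(0), 0), sameType(OurInt(0), 0))
}
```

Note that under the Go spec the underlying type of each of MyInt, YourInt and OurInt is int itself, not the previous type in the chain.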

In my opinion, the interesting question is not how to string and int, but how to struct{ F func(map[string]chan []*MyInt) } in kind-constraints.

  • Do we even allow structs? Say we didn't.
  • Do we allow funcs? Say we didn't.
  • Do we allow maps/slices? Say we didn't.
  • Does *MyInt match *int? (covariance)

In my opinion: introducing underlying-matching fundamentally challenges Go1's design and opens a wholly new problem space much larger than what has been considered in this thread so far.