JuliaLang/julia

default field values

Closed this issue ยท 69 comments

This comes up periodically since people seem to expect it to work:

type Foo
    bar::Int=0
    baz::String=""
end

Currently, this doesn't do what people expect it to do at all, but I suspect we should change that. This would be a useful feature and the fact that so many people expect it seems like a pretty strong argument in its favor.

+1000

I've often written macros to automatically give me both a type definition and an all-keywords constructor function with sane default values.

This certainly is a useful feature to have.

same thought recently #6122 (comment)

If breaking the current semantics of how = inside a type definition works would somehow be a major problem, then maybe we could use different syntax for this - I think => inside a type definition would currently be invalid?

No, I think we should just bite the bullet and change it. Very little code would get broken. This is used once or twice in base and probably very few other places.

I agree we should just change it.

Doesn't this complicate the semantics of inner constructors even more? The distinction between inner and outer constructors is a tricky concept to explain to people unfamiliar with the language.

How does this interact with evaluating arbitrary expressions for type fields?

type Foo
    foo::Int=rand(1:10)
    baz::String=randstring()
end

I feel allowing side-effects here is a bit weird.

I assume this would only work for default constructors.

So the following would retain the behavior we have now (expressions would be ignored)?

type Foo
    foo::Int=rand(1:10)
    baz::String=randstring()

    Foo() = new()
    Foo(f) = new(f)
end

Given that the inner/outer constructor business is one of the most complicated parts of the (otherwise reasonably simple) language, I wonder if this is worth the increased complexity?

Wouldn't documenting the pattern of creating a no-arg inner constructor suffice?

Wouldn't documenting the pattern of creating a no-arg inner constructor suffice?

I'd argue that the no-arg constructor (which shouldn't be the inner constructor, but only one outer constructor) is an anti-pattern.

Why not use defaults in new for all arguments that aren't provided? This would work very well with new with keywords since you can provide whatever subset of values you want to and get defaults for the rest. Keep in mind that if you're going to add a language feature, you might as well make it powerful.

I'm assuming that the use of keywords in new will slow down construction of types even when you don't use keyword arguments, which seems like a large risk.

Keyword arguments are slow currently, but that's not an inherent aspect of that language feature โ€“ they could be made faster with time and effort.

Keyword arguments in a new call are certainly possible, but they would be a totally different beast in terms of implementation. new is not even a real function, and certainly doesn't have multiple methods. So explicit keywords (e.g. new(a=b)) can be handled with just a rewrite in the front end. Splatted keywords would be a big pain, and are inherently slow since they must be sorted at run time.

Currently, I'm pretty sure keywords are only expensive for calls that actually use them.

I was looking into defaults and keyword constructors just now: Parameters.jl. I came up with this approach to default constructors:

  • no inner constructors given: provide inner keyword & inner positional constructor
  • if inner constructors are given, check whether one of them is the positional constructor for all fields:
    • if found, chain the inner keyword constructor to this
    • if not found, provide no keyword constructor. If defaults are given, then error.
  • if inner normal positional constructor is given, provide an outer positional constructor and a copy constructor: MyType(mt::MyType; kw...) = ....

Example:

immutable MT{R<:Real}
    a::R = 5
    b::R
    MT(a,b) = (@assert a>b; new(a,b))
end

would become

immutable MT{R<:Real}
    a::R
    b::R
    MT(a,b) = (@assert a > b; new(a,b))
    # This is now chained to above constructor:
    MT(; a=5,b=error("Field '" * "b" * "' has no default, supply it with keyword.")) = (MT{R})(a,b)
end
MT{R<:Real}(a::R,b::R) = MT{R}(a,b) # default outer positional constructor.
# These two create new instances like pp but with some changed fields:
MT(pp::MT; kws...) = reconstruct(pp,kws) 
MT(pp::MT,di::Union(Associative,((Symbol,Any)...,))) =  reconstruct(pp,di)

I'm pretty adamant that if any user-defined constructors are present, no default constructors should be added. Among other issues, if you don't want them, how do you get rid of them?

+1 to what Jeff said. When creating type, one of the biggest design
questions is what the "core" constructor(s) are and if those are different
from user-facing constructors. This process has almost always required
tight control over which methods get defined or if default constructors are
appropriate.

On Tue, Feb 10, 2015 at 2:08 PM, Jeff Bezanson notifications@github.com
wrote:

I'm pretty adamant that if any user-defined constructors are present, no
default constructors should be added. Among other issues, if you don't want
them, how do you get rid of them?

โ€”
Reply to this email directly or view it on GitHub
#10146 (comment).

Fair enough. But wouldn't the same argument also apply to new having keywords? How would you do a call which leaves all fields uninitialized?

I think the simplest, most conservative design is to allow default values only if the default constructors are used. Specifying both default values and a user-defined constructor would be an error.

The next step in complexity is to allow user-defined constructors, and have default values act as positional default arguments to new. In that case, if there are default values it becomes impossible to leave all fields uninitialized. I don't think it makes much sense to have a type with default values for fields, where those same fields are sometimes uninitialized.

I will appeal to my hugely breaking and never going to be implemented desire to do away with inner constructors entirely. I feel that this issue is a manifestation of the current system being a bit magical. Users expect it to work because they cannot figure out the current implementation. This is unsurprising as it is hidden from them.

The simplest design would be to have type syntax be purely declarative (describing what something is) and use outer constructors to create instances of these types (describe how to construct it). I feel this would be much simpler to teach to people new to the language as it removes all magic. Users could reclaim the default behavior of auto generated constructors through macros as it is a purely syntactic transformation. This makes the generation of these default constructors opt-in as opposed to opt-out. It also simplifies scoping issues with TypeVars and inner constructors, a concept which seems to trip a lot of people up initially.

I admit that declaring a type and not immediately be able to do something with it defeats the interactive feel of the language. It makes code a bit more verbose and repetitive. This proposal also complicates some optimizations that are currently possible with the current system (such as statically removing #undef checks in type construction and field access). One of the arguments in favor of the current inner / outer distinction is that it enables a way to express invariants. It is currently the only way to express that a certain subset of methods with given arguments / argument types cannot be overridden or extended in the system. To me, the ability to "seal" methods is broadly useful. It seems odd we could potentially have two different ways of expressing the same concept if we gain the ability to seal general methods in the future.

I don't really know why I wrote this, but I feel that we are moving farther away from this ideal.

+1 to what Jake said. While I'm sure there are (possibly
huge) implications, it would be great to see where this could lead.

On Tuesday, February 10, 2015, Jake Bolewski notifications@github.com
wrote:

I will appeal to my hugely breaking and never going to be implemented
desire to do away with inner constructors entirely. I feel that this issue
is a manifestation of the current system being a bit magical. Users expect
it to work because they cannot figure out the current implementation. This
is unsurprising as it is hidden from them.

The simplest design would be to have type syntax be purely declarative
(describing what something is) and use outer constructors to create
instances of these types (describe how to construct it). I feel this would
be much simpler to teach to people new to the language as it removes all
magic. Users could reclaim the default behavior of auto generated
constructors through macros as it is a purely syntactic transformation.
This makes the generation of these default constructors opt-in as opposed
to opt-out. It also simplifies scoping issues with TypeVars and inner
constructors, a concept which seems to trip a lot of people up initially.

I admit that declaring a type and not immediately be able to do something
with it defeats the interactive feel of the language. It makes code a bit
more verbose and repetitive. This proposal also complicates some
optimizations that are currently possible with the current system (such as
statically removing #undef checks in type construction and field access).
One of the arguments in favor of the current inner / outer distinction is
that it enables a way to express invariants. It is currently the only way
to express that a certain subset of methods with given arguments / argument
types cannot be overridden or extended in the system. To me, the ability to
"seal" methods is broadly useful. It seems odd we could potentially have
two different ways of expressing the same concept if we gain the ability to
seal general methods in the future.

I don't really know why I wrote this, but I feel that we are moving
farther away from this ideal.

โ€”
Reply to this email directly or view it on GitHub
#10146 (comment).

+1 to jake!
Biggest issue you name (also for me) is I admit that declaring a type and not immediately be able to do something with it defeats the interactive feel of the language.
How will you be able to create the instance in the outer constructor than?
Otherwise, default constructors for any custom type can be declared non magically with call{T}(x::Type{T}, data...).
This can also be used for declaring default keyword constructors:

#Obviously only for mutables
function Base.call{T}(x::Type{T}; data...)
    instance = new(T)
    for (fieldname, value) in data
        instance.(fieldname) = value
    end
    instance
end

Similarly this can be implemented for immutables with some added overhead.
If keyword arguments and varargs would introduce their own signature, the immutable case could be made fast as well via stagedfunctions.

I don't really know why I wrote this, but I feel that we are moving farther away from this ideal.

I'm glad you wrote it. The inner/outer constructor business is one of my least favorite things. It always confuses people. The fact that the inner constructor A(x) for a type A{T} is invoked as A{T}(x) while the outer constructor A{T}(x) is invoked as A(x) is just perversely confusing. Especially now with call overloading, the whole things is feeling a little stretched thin.

Silly me, my default keyword constructor obviously doesn't solve the problem of default values for fields and just gives an order independent way of constructing types.

@SimonDanisch my proposal would be to overload new as we do for call currently. new would take the type to construct as the first argument, followed by positional arguments.

Ex.

type Foo
    bar::Int=0
    baz::String=""
end

would be expressed as

type Foo
    bar::Int
    baz::String
end

Foo(bar::Int=0, baz::String="") = new(Foo, bar, baz)

Obviously more verbose, but much more explicit.

I feel that we could largely reclaim the performance issues if we just mandated that immutable types be fully initialized.

Yes, something like this seems reasonable.
This somehow excites me more than it should! Guess it is because parameters & inner/outer constructors where the language constructs I struggled the longest time with.
Like this, we can also remove the parameter inconsistency:

immutable Field{Sym} end
Field(s::Symbol) = new(Field{s}) # "unused" parameter is now part of the type

I'm not willing to give up enforced invariants. It also doesn't seem necessary at all. new only existing inside type blocks is not the confusing part. The confusing part is A(x) vs. A{T}(x), just as Stefan said.

call overloading fully unifies "inner" and "outer" constructors. Inner constructors have the form

call{T}(::Type{MyType{T}}, ...) =

and "outer" constructors have the form

call(::Type{MyType}, ...) =

Please read #8135 for a good discussion. My comment #8135 (comment) shows how to use outer-only constructors but keep enforced invariants.

All the syntax mentioned there works already. If people want to switch to explicit call overloading for constructors, and remove the front-end rewrite "magic" I would be 100% fine with that.

While new only existing inside of type blocks is not confusing, there is sometimes confusion about which methods should go inside the type block and which shouldn't. In the non-parametric case it's pretty clear: if a method needs to use new, it goes inside; additional constructor methods that just provide more convenient wrappers for "core" constructors that actually construct objects using new go outside of the type block. In the parametric case, that's not the only consideration since you couldn't โ€“ before call overloading โ€“ add methods to the A{T} type except for inside the type block. If there was no such choice, then that confusion would be eliminated. Not necessarily advocating for that since I feel that making new only available in the type block is an elegant way to control how objects can be constructed, but just pointing out that there is some confusion.

Here's another idea: require two sets of { } for defining constructors. That way an "outer" constructor would look like

MyType{}{T,...}(x::T) = MyType{T}(x)

and an inner constructor would look like

MyType{T}{}(x::T) = new(x)

or, outside the type block

MyType{T}{}() = MyType{T}(0)

That way the prefix is always the thing you're defining a constructor for, followed by method parameters. Both forms can be used inside or outside a type block, and they look identical in either case. The only difference is whether new is available.

One further subtlety:

MyType{T}{T}( ... ) =

would define a constructor for MyType{T} for all T, and

MyType{Int}{}( ... ) =

would define a constructor for just MyType{Int}.

I kind of like this.

Am I understanding correctly that the issue is that constructor syntax is currently an ambiguous intersection between method static parameters and type parameters? And in your proposed solution, the first set of {} represents the type definition, whereas the second are the method's static parameters? This seems overly complicated for typical use of defining a constructor for a parameterless type. Wouldn't the TypeVars then have to propagate 'backwards' -- MyType{T}{T<:Integer}()?

If I remember right, as I was learning, the issue was a case of WYSIWY-don't-G. Inside type definitions, you define constructors without parameters, but you have to call them with parameters. And it's the other way around for methods outside. I don't know if there's a simple way of making that more congruent without making the common cases require more syntax. Perhaps one solution would be to completely change the name of the constructor method inside type definitions -- make it init or something similar. We're not defining methods for the type name in any case.

Perhaps one solution would be to completely change the name of the constructor method inside type definitions -- make it init or something similar. We're not defining methods for the type name in any case.

This is actually exactly what we discussed once when trying to figure out better ways to do this. Still don't really like it.

There is a lot to be said for making definitions look as much like uses as
possible, and I think my proposal does that.

In that case, a single pair of brackets for the outer constructor would be more faithfull, since you don't call it as MyType{}(args...). So in th definition of MyType(args...) there would be a single pair of brackets for thefunction parameters, and in the definition for MyType{T}(args...), there would be two pairs of brackets.

Yes, we could allow that, and it's actually equivalent to allowing the existing syntax for parameterless types.

The main reason to prefer always using 2 sets of curly brackets is that it makes it impossible to misread Foo{T}(x) = ... as a constructor for Foo{T}.

Yes, i thought about the confusion the second i pressed " Comment "

+1 to double curly braces

The minimum action to take here in 0.4 would be to reserve the the default field value syntax, which currently means something entirely different and mostly useless and probably confusing if anyone tries it.

Reserving the syntax sounds like a good idea.

In https://github.com/mauro3/Parameters.jl I'm basically implementing this functionality with a macro:

@with_kw immutable PhysicalPara{R<:Real}
    rw::R = 1000.
    ri::R = 900.
end

It would be great if the syntax reservation propsed by @StefanKarpinski would not break this package. I find it very handy in my code and would hate to see it go.

We can reserve the syntax after parsing.

Cool, thanks!

Related issue/PR about keyword constructors: #5333, #6122

Would this work?

type A{R<:Range}
    a::R = 1:5
end
A() # -> A{UnitRange{Int64}}(1:5)
A(a=1.0:10.0) # -> A{FloatRange{Float64}}(1.0:1.0:10.0)

i.e. could R be inferred from the default value, and from using a keyword constructor?

Any news on this. I just tried doing the same thing.
Error: ... inside type definition is reserved.
Would be a really handy feature to have

@shakisparki Parameters.jl?

Yea I meant it will be nice if it was built into Julia itself like @StefanKarpinski suggested.
rather than having to use Parameters.jl

I mean if it's ready then the solution would be to just PR Parameters.jl into Base IMO. It's a very solid and stable package which is pretty widely used, and solves the problem very well. If you need the functionality right now, there's no reason to avoid it.

I thought d5ad19a already had ๐Ÿ™‚
(@

julia/base/util.jl

Lines 776 to 793 in a593118

"""
@kwdef typedef
This is a helper macro that automatically defines a keyword-based constructor for the type
declared in the expression `typedef`, which must be a `struct` or `mutable struct`
expression. The default argument is supplied by declaring fields of the form `field::T =
default`. If no default is provided then the default is provided by the `kwdef_val(T)`
function.
```julia
@kwdef struct Foo
a::Cint # implied default Cint(0)
b::Cint = 1 # specified default
z::Cstring # implied default Cstring(C_NULL)
y::Bar # implied default Bar()
end
```
"""
)

Ahh, seems you are right. Haha, forgot I was working in 0.5.2

Can this issue be closed then? Perhaps export @kwdef?
Example in the OP:

julia> Base.@kwdef mutable struct Foo
           bar::Int
           baz::String=""
       end
Foo

julia> Foo()
Foo(0, "")

I was under the impression that a non-macro based solution was aimed for. If macros are ok, porting parts of Parameters.jl to Base could work.

Note that there is quite a bit more to a full implementation than what @kwdef provides, for instance:

julia> Base.@kwdef mutable struct Foo
                         bar::Int=5
                         baz::String=""
                         Foo(bar,baz) = (@assert length(baz)<bar; new(bar,baz))
              end                                                                                                                                    
ERROR: UndefVarError: bar not defined                                                                                                                

julia> Base.@kwdef mutable struct Foo{T}
                  bar::T=5
                  baz::String=""
              end                                                                                                                                    
WARNING: static parameter T does not occur in signature for Type.                                                                                    
The method will not be callable.
Foo

julia> Base.@kwdef mutable struct Foo
                  bar=5
                  baz::String=""
              end                                                                                                                                    
ERROR: type Symbol has no field args                                                                                                                 

In my opinion this feature should be supported language wise and not using a macro. Its not intuitive that one has to put Base.@kwdef just to initialize the fields.

Yes, this is a feature that people just intuitively expect to work โ€“ I've witnessed this dozens of times in person over the years. Requiring a macro for it to work is unnecessarily ugly.

I was just suggesting that what Parameters.jl already does could be ported to Base, though with the macro essentially being implicit. I agree it shouldn't need a macro and this should be very standard functionality for type definitions.

The real question is what does it mean for a field to have a default value? At the very minimum, I would argue that calling new() with a field with a default omitted should give that field its default value rather than leaving it uninitialized.

This came up "in the wild" again:

https://discourse.julialang.org/t/default-values-for-composite-types-in-julia/4479

@juliohm has an interesting question because he wants type parameters to work. Since Parameters.jl uses keyword arguments, that won't work out all too well with the current setup, though #16580 would solve that issue. But it won't solve the issue that maybe a user may want to just set the "type", i.e. for

immutable GaussianVariogram{T<:Real}
  sill::T -> one(T)
  range::T -> one(T)
  nugget::T -> zero(T)
end

it feels like GaussianVariogram{Float64}() should work, which is not something Parameters.jl can do IIRC.

At the very minimum, I would argue that calling new() with a field with a default omitted should give that field its default value rather than leaving it uninitialized.

I agree, though would it be an issue that then there is no way to have a constructor with uninitialized fields? To me it sounds like a weird edge case that probably doesn't need special syntax to cover (example: new(has_default = Uninitialized()). Just seems weird to have something for this one case), but it would be a limitation.

it feels like GaussianVariogram{Float64}() should work, which is not something Parameters.jl can do IIRC.

It does:

julia> using Parameters

julia> @with_kw struct Blag{T}
          x::T=one(T)
          y::T=one(T)
       end
Blag

julia> Blag{Float64}()
Blag{Float64}
  x: Float64 1.0
  y: Float64 1.0

Wow cool, never knew that. Though it does have the lack of inferrability:

julia> @with_kw struct Blag{T}
                 x::T=one(T)
                 y::T=one(T)
              end
Blag

julia> @code_warntype Blag{Float64}()
Variables:
  #self#::Type{Blag{Float64}}

Body:
  begin
      return ((Core.getfield)($(QuoteNode(Core.Box(#call#1))), :contents)::Any)((Base.sitofp)(Float64, 1)::Float64, (Base.sitofp)(Float64, 1)::Float64, #self#::Type{Blag{Float64}})::Any
  end::Any

It doesn't for me:

using Parameters

@with_kw immutable Foo{T<:Real}
  a::T = one(T)
end

Foo()

EDIT:
Got it, you have to be explicit about the type in order for it to work.

@StefanKarpinski: I assumed that this issue would be solved by automatically generating a keyword constructor rather than modifying new. But your approach maybe easier.

For below type this would mean

struct A
  a
  b=5
end

What would new's signatures be? You suggest, I think:

new(a, b) = ...
new(a; b=5) = ...
new(;a=undef, b=5) = ...

with the automatically defined inner constructors:

A(a,b) = new(a,b)
A(;kws...) = new(;kws...)

I'm changing the milestone to 1.x since this seems like a feature that we can add post 1.0 as long as we've made the appropriate things errors in advance.

as long as we've made the appropriate things errors in advance.

Is it clear if the appropriate things error in the current state of master, or if something needs to be prepared for this to be simple to do post v1.0?

It's a syntax error now, so we're all good:

julia> struct Foo
           a::Int = 0
       end
ERROR: syntax: "a::Int = 0" inside type definition is reserved

Could default field values be part of Julia v1.3? I wonder if we can also have the @unpack functionality from Parameters.jl as well.

Since the feature freeze is today, no. But if someone implements it in the next four months, it could be in the 1.4 release.

For v1.6?