ziglang/zig

Proposal: default struct field initialization values

Closed this issue · 18 comments

Since there is a proposal for default function arguments, I'm putting forward the idea of default initialization values in structs for the sake of completeness. Consider

const Foo = struct {
    a: u8,
    b: u8 = 42,
};

const foo_default_b = Foo {.a = 0}; // .b = 42
const foo_defined_b = Foo {.a = 0, .b = 0};
const foo_skipped_b = Foo {.a = 0, .b = undefined};

I'll elaborate on some use cases later today.

Yes please.
This looks ugly -

pub const t = struct {
    map: HashMap([]const u8, usize, mem.hash_slice_u8, mem.eql_slice_u8),

    pub fn init() t {
        return t {
            .map = HashMap([]const u8, usize, mem.hash_slice_u8, mem.eql_slice_u8).init(allocator),
        };
    }
};

(There may be better ways to do this, but I couldn't find any)

You can simply assign the type to a new identifier in these cases.

pub const t = struct {
    const StringMap = HashMap([]const u8, usize, mem.hash_slice_u8, mem.eql_slice_u8);

    map: StringMap,

    pub fn init() t {
        return t {
            .map = StringMap.init(allocator),
        };
    }
};

Want this.

I never elaborated on those use cases - I admittedly didn't give this much thought after posting it. It fills some of the roles that optional function arguments were intended to fill, but it has a few caveats.

Default field initialization values could be abused to make structs carry implicit allocators, or even hide conditional declaration of fields.

A lesser version of this feature, which would prevent the above abuse but still be somewhat useful, is to allow only specific default values (like zeroes), or to allow a field to be omitted from initialization only if its default is undefined:

const Foo = struct {
    // must always be specified
    a: u8,

    // numbers can default to zero
    b: u8 = 0,

    // slices can default to an empty slice
    c: []u8 = []u8{},

    // number arrays can default to zeroes
    d: [3]u8 = [3]u8{ 0, 0, 0 },
    // maybe hidden behind a keyword/special syntax:
    // d: [3]u8 = zeroes,

    // nullables can default to null
    e: ?T = null,
    f: ?*T = null,

    // everything can default to undefined.
    // pointers and slices should probably only ever default to undefined
    u: *T = undefined,
    v: []T = undefined,
};

Zig used to have the concept of zeroes; I can't remember why it was scrapped. To keep things simple, default fields could be specified by two keywords: empty and undefined.

const Foo = struct {
    a: u8,
    b: u8 = empty, // 0
    c: []u8 = empty, // []u8{}
    e: ?T = empty, // null
    f: f32 = empty, // 0.0
    u: *T = undefined,
    v: []T = undefined,
};

Even default would be a reasonable keyword here.

The status quo solution is to instantiate structs with a function, which is simple enough, refactor-friendly and probably good practice anyway... I'm actually pretty satisfied without this feature.
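For reference, the status quo approach might look like the minimal sketch below. The init function and the value 42 are illustrative assumptions that mirror the original proposal's example, not code from any library.

```zig
const std = @import("std");

const Foo = struct {
    a: u8,
    b: u8,

    // Hypothetical constructor: the caller supplies `a`, the function supplies `b`.
    pub fn init(a: u8) Foo {
        return Foo{ .a = a, .b = 42 };
    }
};

pub fn main() void {
    const foo = Foo.init(0);
    std.debug.print("a={d} b={d}\n", .{ foo.a, foo.b });
}
```

The trade-off is that a reader has to open init to learn which values are fixed and which come from the caller.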

EDIT: Clarified phrasing

Isn't this a slippery slope to more implicit function calls? I agree init can be tedious, but it is definitely simpler.

Yeah, either the instantiation or the function call would end up more implicit.

I'd expect less implicit behavior when instantiating a struct because it seems like a more primitive operation than calling a function. A function call is basically control flow and you explicitly know something is going on behind the scenes.

Would be very handy for struct kevent, which has an (unused) ext: [4]u64 field on FreeBSD, but not on Darwin.

There should definitely be a limit on what the default values should be, and the most obvious restriction is the same one for global var/const initializers: the value has to be known at comptime. So you can't get clients to call functions by omitting a field or anything.
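A rough sketch of what the comptime restriction permits, assuming a Zig version in which default field values are supported. Config, default_port and readPortFromEnv are hypothetical names chosen for illustration.

```zig
const std = @import("std");

// Hypothetical constant for illustration; the expression is folded at compile time.
const default_port: u16 = 8000 + 80;

const Config = struct {
    host: []const u8, // no default, so every initializer must set it
    port: u16 = default_port,
    retries: u8 = 3,
    // A default such as `port: u16 = readPortFromEnv()` (hypothetical) would be
    // rejected if the call cannot be evaluated at compile time.
};

pub fn main() void {
    const cfg = Config{ .host = "localhost" };
    std.debug.print("{s}:{d} retries={d}\n", .{ cfg.host, cfg.port, cfg.retries });
}
```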

This can be used for optional arguments:

const OpenFileOptions = struct{
    flags: i32 = posix.O_READ | posix.O_EXCL,
    mode: i32 = 0o744,
};
fn openFile(path: []const u8, options: OpenFileOptions) File {
    ...
}

test "asdf" {
    _ = openFile("asdf", OpenFileOptions{});
    _ = openFile("asdf", OpenFileOptions{.flags = 0});
    _ = openFile("asdf", OpenFileOptions{.flags = 0, .mode = 0});
}

The name of the struct is awkward, but that's mitigated with this proposal #208 (comment) where you could do openFile("asdf", .{}); and define the struct type in the function signature.
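A sketch of how such a call site might read, assuming a Zig version that accepts anonymous struct literals (.{}) where the parameter type is known. resolveOptions is a hypothetical stand-in for openFile, and the flag value is a placeholder rather than a real POSIX constant.

```zig
const std = @import("std");

// Hypothetical options struct; the default values are placeholders.
const OpenFileOptions = struct {
    flags: u32 = 0x1,
    mode: u32 = 0o744,
};

// Stand-in for openFile: returns the options it received so they can be inspected.
fn resolveOptions(path: []const u8, options: OpenFileOptions) OpenFileOptions {
    _ = path;
    return options;
}

pub fn main() void {
    const all_defaults = resolveOptions("asdf", .{});
    const custom_flags = resolveOptions("asdf", .{ .flags = 0 });
    std.debug.print("flags={d} mode={d}\n", .{ all_defaults.flags, all_defaults.mode });
    std.debug.print("flags={d} mode={d}\n", .{ custom_flags.flags, custom_flags.mode });
}
```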

An idea from @MasonRemaley is that you should have to opt in to the optional values when constructing a struct. It might look like this:

const OpenFileOptions = struct{
    flags: i32 = posix.O_READ | posix.O_EXCL,
    mode: i32 = 0o744,
};

test "asdf" {
    _ = openFile("asdf", OpenFileOptions{}); // ERROR
    _ = openFile("asdf", OpenFileOptions{...}); // OK
    _ = openFile("asdf", OpenFileOptions{.flags = 0}); // ERROR
    _ = openFile("asdf", OpenFileOptions{.flags = 0, ...}); // OK
    _ = openFile("asdf", OpenFileOptions{.flags = 0, .mode = 0}); // OK
    _ = openFile("asdf", OpenFileOptions{.flags = 0, .mode = 0, ...}); // OK
}

(and a trailing comma after the ... should be allowed.)

This has advantages and disadvantages. When the reader sees the ..., they know to go look for default values, which is good. The disadvantage shows up when a library wants to add fields to a struct without breaking compatibility: clients avoid compile errors only if they were already writing the ... at every initialization site, so this variant doesn't actually work for avoiding compatibility breaks.

My reasoning in favor:

  • ability for libraries to add new fields and only bump minor version
  • ability for functions to provide default arguments as @thejoshwolfe pointed out
  • hot code swapping (#68)
  • it allows zig coders to prefer direct struct initialization over function calls where possible, because it's easier on the reader; one can tell from the initialization site that it is Plain Old Data; one need not inspect the init() function to discover this information.
  • use case: API previously allowed direct struct initialization; now it wants to require an init() be called. Library can resolve this problem by introducing a new dummy field in debug mode only, that is initialized by init. Then all the initialization sites get compile errors.

Against:

  • someone could put multiple defaults that depend on each other, and then at the initialization site, only one is specified, and then the other default doesn't make sense.
  • introduces another way to do things
  • makes the language slightly bigger

Decisions:

  • The values must be comptime known.
  • No ... opt in thing.
  • Best practice is: don't create defaults for multiple values that depend on each other. Otherwise it's possible to override only one of them and get unexpected behavior.
  • Idiomatic zig: if an initialization produces comptime-known Plain Old Data, prefer direct struct initialization. If more sophisticated logic is required, prefer an init() method.

"someone could put multiple defaults that depend on each other"

Doesn't this conflict with "The values must be comptime known."?

I mean a logical dependency, not a literal dependency. Something like this:

const S = struct {
    action: FileAction = .Open,
    flags: u32 = posix.O_READ | posix.O_EXCL,
};

Now if you do S{.action = .Delete}, flags still gets the flags that applied to Open.
Contrived example, but hopefully it illustrates the point.
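The same pitfall as a runnable sketch, together with the init() alternative suggested in the decisions above. The enum members, the flag values and the init function are assumptions made for illustration; the flag values are placeholders, not real POSIX constants.

```zig
const std = @import("std");

const FileAction = enum { open, delete };

// Placeholder flag values for illustration only.
const open_flags: u32 = 0x3;
const delete_flags: u32 = 0x0;

const S = struct {
    action: FileAction = .open,
    flags: u32 = open_flags,

    // Hypothetical constructor that derives flags from action,
    // so the two fields cannot drift apart.
    pub fn init(action: FileAction) S {
        return S{
            .action = action,
            .flags = switch (action) {
                .open => open_flags,
                .delete => delete_flags,
            },
        };
    }
};

pub fn main() void {
    const direct = S{ .action = .delete }; // flags keeps the default meant for .open
    const via_init = S.init(.delete); // flags follows the action
    std.debug.print("direct={d} init={d}\n", .{ direct.flags, via_init.flags });
}
```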

This seems like a reasonable take--I made almost all the same tradeoffs in my language with the exception of the ... to opt in. I'll let you know if after living with this feature for a while I'm forced to reevaluate any of this.

I am sooo excited for this feature :)

Mainly because I want to wrap my C cross-platform headers into zig interfaces, and those have been designed with C99 designated initialization in mind (many calls have desc-struct arguments, which sometimes have dozens of items, but usually only a few of them differ from the default values).

It's surprisingly hard to transfer this idea into other languages (often they work around the problem with builder functions, which is a lot of boilerplate).

Here's an example C99 program, so you know what I'm talking about:

https://github.com/floooh/sokol#sokol_gfxh

\o/

This is great!

What about setting the default values of fields of imported C structs to zero? This way one can initialize a C struct nearly the same way one would do in C:

c:

typedef struct {
   int a;
   int b;
} my_type;

my_type zero = { 0 }; // a = 0, b = 0
my_type partial = { .a = 1 }; // b = 0

zig:

var zero = c.my_type { }; // a = 0, b = 0
var partial = c.my_type { .a = 1 }; // b = 0

As it is now, one must set all field values when initializing a C struct in zig. This quickly becomes cumbersome, especially if the C struct contains arrays.

UPDATE: This was mentioned in #1031, but that was before default struct field initialization was accepted.

+1 On having C structs use 0 as defaults for everything to make interop with C code more seamless.
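Until something like that exists, a similar effect may be available from the standard library. The sketch below assumes a Zig version whose std.mem provides zeroes and zeroInit, and it uses a hand-written extern struct as a stand-in for a type that would normally come from @cImport.

```zig
const std = @import("std");

// Stand-in for an imported C struct; a real one would come from @cImport.
const my_type = extern struct {
    a: c_int,
    b: c_int,
};

pub fn main() void {
    // Every field zeroed, like `my_type zero = { 0 };` in C.
    const zero = std.mem.zeroes(my_type);
    // Listed fields set, the rest zeroed, like `{ .a = 1 }` in C.
    const partial = std.mem.zeroInit(my_type, .{ .a = 1 });
    std.debug.print("zero: a={d} b={d}\n", .{ zero.a, zero.b });
    std.debug.print("partial: a={d} b={d}\n", .{ partial.a, partial.b });
}
```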