ziglang/zig

Field Aliases

marler8997 opened this issue · 17 comments

There are currently discussions around proposal #985 (Support nested anonymous struct and unions) and I had an alternative idea that solves that issue but also solves another issue. I've put together a formal description and a proposal for the idea here so that it can be considered before deciding on #985.

Two Problems

The first problem is that in Zig, types that contain fields that are meant to be used directly must give up "composability". What does it mean for a type to be "composable"? It means you can take that type, wrap it in another type as a field, and the new type can expose the same public interface. When a field is meant to be used directly, types can no longer do this.

The problem is there's no way to forward a field from a nested type to the parent. Unlike functions that can forward calls to other functions, fields have no ability to do this. Some languages solve this by implementing "Properties", which are special functions that look like fields. However, Zig wouldn't support "Properties" because if something looks like a field it should just be a field (not arbitrary code execution). Furthermore, the alternative of using getter/setter functions comes with its own set of problems: naming conventions, the choice to return values or references, handling mutable vs const references to @This(), and requiring the caller to go through a function that can execute arbitrary code when all they want to do is access the field.
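To make the getter/setter drawbacks concrete, here is a minimal sketch of that workaround in Zig as it exists today. The wrapper and the accessor names `items()` and `itemsPtr()` are hypothetical and chosen only for illustration; the point is that callers must write `foo.items()` rather than `foo.items`, so the wrapper no longer exposes the same interface as List.

```zig
const std = @import("std");

const List = struct {
    items: []u8,
};

// Hypothetical wrapper used only for illustration.
const ThreadSafeList = struct {
    list: List,

    // Getter returning the slice by value. Callers must write `foo.items()`,
    // which is not the same syntax as the field access `foo.items` on List.
    pub fn items(self: ThreadSafeList) []u8 {
        return self.list.items;
    }

    // A second accessor (hypothetical name) is needed to assign to the
    // nested field through the wrapper.
    pub fn itemsPtr(self: *ThreadSafeList) *[]u8 {
        return &self.list.items;
    }
};

test "getter workaround changes the call syntax" {
    var buf = [_]u8{ 1, 2, 3 };
    var foo = ThreadSafeList{ .list = .{ .items = buf[0..] } };
    try std.testing.expectEqual(@as(usize, 3), foo.items().len);
    foo.itemsPtr().* = buf[0..1];
    try std.testing.expectEqual(@as(usize, 1), foo.list.items.len);
}
```

Even in this small case the wrapper needs two functions and a naming decision to stand in for one field.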

The second similar problem is that Zig doesn't have a way to represent anonymous C structs/unions that have been translated from C.

  • #985 (Support nested anonymous struct and unions)
  • #6349 (improve unions for c interop)

I've noticed that these two problems are similar, and finding a solution to one likely means a solution for the other.

Potential Solution

The idea is to make it possible to redirect a field access to a subfield. For example, make it possible to declare that foo.x is equivalent to foo.bar.x. So in this case, foo would have a "field alias" x that redirects to the sub field bar.x.

So what are these field aliases? I'll start by saying what they are not:

Field Aliases are not Properties

Using a field alias looks like using a normal field and that's all it is. It's still just accessing data at a fixed offset into a structure.
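The "fixed offset" claim can be checked in today's Zig without any new feature. The sketch below assumes Zig 0.11 or newer (for `@intFromPtr`) and adds a hypothetical `locked` field so that `list` need not sit at offset 0. It computes the offset that an `items` alias would resolve to and compares it against the real address of `foo.list.items`.

```zig
const std = @import("std");

const List = struct {
    items: []u8,
};

const ThreadSafeList = struct {
    locked: bool = false, // hypothetical extra field, for illustration only
    list: List,
};

// The offset an `items` alias would resolve to. It is a compile-time
// constant: the offset of `list` in the parent plus the offset of `items`
// inside List.
const items_offset = @offsetOf(ThreadSafeList, "list") + @offsetOf(List, "items");

test "nested field lives at a fixed offset from the parent" {
    var buf = [_]u8{ 1, 2, 3 };
    const foo = ThreadSafeList{ .list = .{ .items = buf[0..] } };
    const base = @intFromPtr(&foo);
    const nested = @intFromPtr(&foo.list.items);
    try std.testing.expectEqual(base + items_offset, nested);
}
```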

Field Aliases do not introduce any new types

I have not found any benefit to adding an Alias type so this would only serve to complicate the type system.

Field Aliases do not go into TypeInfo.Struct.fields

Aliases only affect the struct's view at the source code level; they have no effect on the runtime structure. Zig structs/unions work, and this proposal is meant to add a feature without messing with the underlying struct/union semantics, including what is in TypeInfo.

Most code that deals with struct fields in TypeInfo will be concerned with the runtime structure and should therefore ignore aliases. So adding them to TypeInfo.Struct.fields would mean unnecessary boilerplate to ignore them. Note that this doesn't preclude them from being available in TypeInfo.Struct, they could still be available if deemed necessary in their own field_aliases array separate from the fields array.
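As a sketch of the kind of reflection code this paragraph has in mind, the hypothetical helper below walks a struct's fields and adds up their sizes. It assumes a recent Zig in which `std.meta.fields` returns entries with a `type` member (older releases called it `field_type`). Code like this relies on every entry being a real, non-overlapping field; an alias listed among the fields would be counted twice unless the loop filtered it out.

```zig
const std = @import("std");

const List = struct {
    items: []u8,
};

const ThreadSafeList = struct {
    list: List,
    locked: bool = false, // hypothetical extra field, for illustration only
};

// Hypothetical helper: sums the sizes of the fields reported by reflection.
// For a regular struct this can never exceed @sizeOf(T), because real
// fields do not overlap.
fn sumFieldSizes(comptime T: type) usize {
    var total: usize = 0;
    inline for (std.meta.fields(T)) |f| {
        total += @sizeOf(f.type);
    }
    return total;
}

test "field sizes never exceed the size of the struct" {
    try std.testing.expect(sumFieldSizes(ThreadSafeList) <= @sizeOf(ThreadSafeList));
}
```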

So what is a Field Alias then?

Assuming this feature is used sparingly, our syntax sensibilities are that "longer is ok" and existing constructs are favorable to new ones. Given this, we can avoid modifying the language grammar by opting for a builtin function:

fn @fieldAlias(comptime field_name: []const u8, comptime sub_field_name: []const u8) (field alias)

This builtin creates a field alias to a subfield one level deep. It's purposely limited to one level of depth because that's all we need to solve the problems described above (NOTE: a field alias can point to another field alias). Also note that this only works for subfields; it doesn't allow a field to be renamed within the same struct. The reason for excluding arbitrary field name remapping is that it doesn't help solve the problems described above.

Example:

const List = struct {
    items: []u8,
};
const ThreadSafeList = struct {
    list: List,
    items: @fieldAlias("list", "items"), // items is a field alias to list.items
};

var foo = ThreadSafeList {...};
foo.items // same as foo.list.items

Note that even though the items alias is not a normal field that has a type and takes up space in the struct, I've opted to use a "field-like" syntax to declare it because code that accesses it will access it like a field. Like other fields, it's accessed through instances of the struct rather than through the struct type. Furthermore, if a programmer sees foo.x;, when they look at the definition of Foo they will expect to see x as a field like x: ... rather than pub const x = ...;.

Also, note that we're using @fieldAlias rather than @FieldAlias because @fieldAlias is not returning a type; it's returning some special internal compiler data structure to represent a field alias.

NOTE: the @field builtin should also work with field aliases, i.e. @field(foo, "x") should work even if x is a field alias.
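For comparison, this is roughly what the same lookup takes in today's Zig, where `@field` only sees real fields. The `subField` helper is hypothetical and not part of the proposal; it chains two `@field` calls, which is the redirection an alias would let `@field(foo, "items")` perform on its own.

```zig
const std = @import("std");

const List = struct {
    items: []u8,
};

const ThreadSafeList = struct {
    list: List,
};

// Hypothetical helper: reads `parent.<field_name>.<sub_field_name>` by
// name. Under the proposal a single @field call on the alias name would
// be enough.
fn subField(
    parent: anytype,
    comptime field_name: []const u8,
    comptime sub_field_name: []const u8,
) @TypeOf(@field(@field(parent, field_name), sub_field_name)) {
    return @field(@field(parent, field_name), sub_field_name);
}

test "nested lookup by name" {
    var buf = [_]u8{ 1, 2, 3 };
    const foo = ThreadSafeList{ .list = .{ .items = buf[0..] } };
    try std.testing.expectEqual(@as(usize, 3), subField(foo, "list", "items").len);
}
```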

Comparison to anonymous structs/unions

The main difference between Field Aliases and supporting anonymous structs and unions (#985) is that field aliases also solve the composability problem described above. Furthermore, field aliases are a smaller feature overall. With anonymous structs/unions, more questions arise about what kinds of fields/decls are forwarded to the parent, and rules need to be created and learned by users to understand them. A field alias, on the other hand, only forwards a single field to a single sub-field, which makes it much less open to interpretation. Also, this proposal is a "source only" feature. These aliases only exist at comptime and once we are running, it's as if they never existed. Field aliases have a specific, focused role and are restricted accordingly to fill that role to minimize enabling misuse of the language.

As an alternative, I mentioned the other day on Discord that this could be accomplished with a fully featured @Type if #6478 (comment) were accepted.

@darnimator I'm not seeing how the semantics in your proposal would allow you to implement the example in the proposal above. I can see it allows you to specify a custom memory layout, but the issue is not that we can't define the memory layout, the issue is forwarding the "field" interface from a subfield to the containing parent type. Maybe if you reproduced the example with your semantics it would help.

I'm not seeing how the semantics in your proposal would allow you to implement the example in the proposal above. I can see it allows you to specify a custom memory layout, but the issue is not that we can't define the memory layout, the issue is forwarding the "field" interface from a subfield to the containing parent type.

If you can specify the specific offsets, then you can specify overlapping offsets:

const List = struct {
    items: []u8,
};

const ThreadSafeList = @Type(.{
    .Struct = .{
        // .layout =
        .fields = &[_]std.builtin.TypeInfo.StructField{
            .{
                .name = "list",
                .field_type = List,
                .offset = 0x00000000,
            },
            .{
                .name = "items",
                .field_type = []u8,
                .offset = 0x00000000+@offsetOf(List, "items"),
            },
        },
        // .decls = .{},
    },
});

This can of course be wrapped up with helper functions for ergonomics.

Oh I didn't realize your proposal would also include semantics to support overlapping fields. The jump it takes to go from being able to "reify" struct types with offsets to supporting field overlap is the same jump this proposal makes on regular fields. In this proposal I was careful to keep existing field semantics separate from aliases to avoid potential issues with overlapping fields (i.e. ensuring the size of a type's fields does not exceed the size of the type itself, or assuming that each field points to a unique part of memory). Of course, reification could be modified to support aliases as well as an alternative to supporting field overlap. In fact I think if we accepted field aliases, we may want to add it to type info. So if we added support for struct reification, it could look like this:

const List = struct {
    items: []u8,
};

const ThreadSafeList = @Type(.{
    .Struct = .{
        // .layout =
        .fields = &[_]std.builtin.TypeInfo.StructField{
            .{
                .name = "list",
                .field_type = List,
                .offset = 0x00000000,
            },
        },
        .field_aliases = &[_]std.builtin.TypeInfo.FieldAlias{
            .{
                .name = "items",
                .field = "list",
                .sub_field = "items",
            },
        },
        // .decls = .{},
    },
});

Update: proposal #985 was just accepted, so this proposal now only solves one problem instead of two. Now this proposal only provides a solution for allowing types to be composable that make use of fields in their public interface.

Update: proposal #985 is now rejected, so this proposal is now a viable path to C interop for anonymous struct/unions and Plan 9 style structs/unions.

This is an interesting idea. I created a slightly different proposal #7698 to show a different, perhaps more flexible, way of thinking about this.

One thing I noted here is that you have the potential problem of mismatched var and const.

const Foo = struct {
    items: []u8,
    const sentinel: u8 = 0x42;
};
const ThreadSafeList = struct {
    list: Foo,
    items: @fieldAlias("list", "items"), // items is a field alias to list.items
    sentinel: @fieldAlias("list", "sentinel"),
};

var foo = ThreadSafeList {...};
foo.items // same as foo.list.items

What happens here? Is ThreadSafeList.sentinel mutable? It shouldn't be, right?

@kyle-github Foo.sentinel is a declaration, not a field.

@xackus, Ugh, right. Perhaps it is not possible to construct. What I was trying to figure out is if it was possible to somehow make a field alias from a const into a var. I feel like there is a way to do it and that would be a problem, if true.

Perhaps this is a slightly stronger case or another one at any rate...

When you go to refactor data, one of the problems you have is that of fixing up all the code that references that data. Normally this is not that hard but there can be cases where it is onerous to do this all at once (perhaps it is part of a heavily used API that has a slower release cycle). In that case, being able to do the refactoring and move data around inside the struct but putting in aliases to make existing usage still work can be a really nice bridge. And, the fact that there is an alias is a good reminder (and easy-to-search key) that there is work to do in the future.

Normally what you end up with (in the case of an API for instance) is three sets of data structures: one that is exposed in the API, one that is used as a shim, and one that is used for the internal implementation. Once it is there, it is really hard to get rid of that shim and especially hard to work your way back down to just one set of data structures. I think that these aliases provide a good scaffolding with which to keep some sanity as you slowly reduce your technical debt. It is not a panacea, but a lot of data refactoring involves more levels of indirection, which is exactly what this can paper over temporarily.

I like this. I also think that problem 1 (from the original proposal) applies to functions as well. So perhaps this is something to consider in parallel:

fn @fnAlias(comptime field_name: []const u8, comptime sub_fn_name: []const u8) (fn alias)

I don't have a good example to offer at the moment, but I think this might be worth discussing in the context of a better mixin/inheritance model than what Zig offers at the moment.

It's true that @fieldAlias could be extended to include functions as well. However, you can emulate function aliases by using inline functions, i.e.

pub const Foo = struct {
    bar: Bar,
    // Forwards the call to the nested field; `inline` avoids an extra call.
    pub inline fn something(self: Foo) void {
        self.bar.something();
    }
};

It's more verbose than your proposed @fnAlias (especially when you have many parameters), but it can still be done. The interesting thing is that there is no way in Zig to create a field alias. Adding a feature like @fnAlias would have a higher standard because it would be considered "syntax sugar". @SpexGuy gave a great description about Zig's standard for ideas like this: #9838 (comment)

@marler8997 as you have mentioned, so far this seems to be only syntactic sugar.
And you've mentioned the comment about high standards, which is basically saying that unless you have code that makes you scream "I'm dying over here", it has no place in the language.

So I'm wondering why the issue is open. Do you have a project where this feature seems like a necessity?

This proposal addresses 2 holes in the language. The first is "perfect forwarding" of types that use fields as a part of their interface. ArrayList is an example of this. So the hole is, there's no possible way to create a wrapper type around ArrayList that exposes the same interface.

pub fn MyCoolArrayList(comptime T: type) type {
    return struct {
        underlying_list: ArrayList(T),
 
        // we need this to be able to expose the same interface
        items: @fieldAlias("underlying_list", "items"),
    };
}

To be clear, by "interface" I mean "the way code interacts with the type". In the case of ArrayList, code is meant to reference the items field to access the current slice of items. In order for something to be "syntax sugar", there would need to be a way to do "the thing" in the language, but Zig has no way to perfectly forward a type that uses fields in its interface, thus, I consider it a "hole" in the language rather than "sugar" for something you can already do.

The second "hole" is #6349, namely, Zig doesn't have a way to represent C/C++ types that contain anonymous struct/unions. Instead Zig must assign "names" to each sub struct/union, and selecting these names creates a new problem: we need a way to create predictable names that don't change across each compilation and don't conflict with any other symbol. I think getting this to work in 100% of cases is actually impossible, so without this feature Zig is in a sticky situation, trying to tackle a difficult problem with no good solution and bad tradeoffs all around. With @fieldAlias we still have to address the symbol conflict problem, but our symbol names no longer need to be the same or predictable because it's only the aliases that the code will be using; the sub struct/union names will be internal to the type only.

@marler8997 when you want a generic interface, you use functions (getters/setters). But that wasn't my question. I was asking specifically whether you have a real project where the feature is necessary. Otherwise I don't see how a feature can be deemed necessary.

Representing C types is a whole other issue.

This is just syntactic sugar. I don't think Zig has much syntactic sugar other than try.

Over IRC Andrew states that this proposal adds more complexity to the language than benefits. I think it's a fair criticism of the feature and would also add the following.

My original argument, which states that you can't achieve "perfect forwarding" in the language, is mitigated by the fact that you can copy the underlying implementation to achieve the same "effect". In my example where I wrapped std.ArrayList and used a field alias to expose the same interface, the alternative is to copy the entirety of the std.ArrayList implementation and make modifications to the copy. This has some obvious downsides, but it's a solution you can achieve without adding anything to the language.

As for the second problem of representing/using C types with anonymous fields, Andrew says he's OK with Zig accessing those fields through a "named" subfield. This is simpler for the language and makes working with C types more consistent with working with Zig types. It seems like anything we add to the Zig language to make this easier would add complexity to Zig only for the sake of making it easier to work with C. To make "translate C" work, we can continue what we're doing today, which is to generate names for the anonymous fields. Maybe this naming scheme can be improved. It's also possible we could improve this situation with a library solution, but that's yet to be determined.
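As a rough illustration of the "named subfield" approach, the sketch below is a hand-written approximation of what a translated C struct with an anonymous union might look like. The C declaration in the comment, the type name `Value`, and the generated name `unnamed_0` are all assumptions made for this example; the actual names produced by translate-c may differ.

```zig
const std = @import("std");

// Assumed C source, for illustration only:
//
//     struct Value {
//         int tag;
//         union { int i; float f; };  // anonymous union
//     };
//
// In C the members are reached as `v.i` and `v.f`. Without field aliases,
// the Zig side has to go through a generated name such as `unnamed_0`.
const Value = extern struct {
    tag: c_int,
    unnamed_0: extern union {
        i: c_int,
        f: f32,
    },
};

test "anonymous C union is reached through a named subfield" {
    var v = Value{ .tag = 0, .unnamed_0 = .{ .i = 42 } };
    v.unnamed_0.i += 1;
    try std.testing.expectEqual(@as(c_int, 43), v.unnamed_0.i);
}
```

The layout matches the C struct; only the spelling of the access differs, which is the tradeoff described above.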