JuliaLang/julia

abstract types with fields

Opened this issue · 158 comments

This would look something like

abstract Foo with
    x::Int
    y::String
end

which will cause every subtype of Foo to begin with those fields.

Some parts of the language internals already anticipate this; it's a matter of hooking up the syntax and filling in a few missing pieces.

👍 to this

+1 (!)

I'm wondering if the with keyword is useful, necessary, and/or deliberate?

If there were no with keyword, every declaration of abstract would need an end. Currently abstract is a one-liner, but type and immutable are multiline, running until an end marker.

Right, thanks @ivarne.

+1 this will be very useful!

I'd actually be ok with changing abstract to always require an end, although that would make this a breaking change, which it currently isn't. The with thing feels pretty clunky to me and our current abstract declarations have always felt a little jarringly open-ended to me.

I kind of agree with Stefan. Making the visual appearance of abstract more like that of type and immutable seems like a gain to me.

Yeah, FWIW I was going to say the same.

Fourth-ed

The big problem with making a syntactic change like that is it's going to become a watershed for all the code out there that declares abstract types, splitting that code into before and after versions. Since half of our community likes to live on the edge while the other half likes to use 0.2 (making up numbers here, but half-and-half seems reasonable), that's kind of a big problem. If there was some way we could deprecate the open-ended abstract type declaration, that would avoid the issue.

Now that 0.2 is out, I actually think we should tell people not to use master for work that's not focused on the direct development of Julia itself. I intend to only work from 0.2 until the 0.3 release while developing packages.

Maybe you can hack a temporary thing to end an abstract block if the next line does not start with a field declaration? Backporting this to 0.2.x would allow moving progressively, then you would introduce a deprecation warning, and make it an error with 0.3.

I think that's very reasonable, although it does cut down on the number of people testing out 0.3, which is unfortunate, but probably unavoidable.

@nalimilan, yes, I was thinking something along those lines, but it does feel kind of awful.

As a transitional solution we might update 0.2.1 to allow an end on the same line after abstract. Then in 0.3 we might issue a warning if it is missing and in 0.4 we can require it. That makes this a rather lengthy process though.

Why don't we enable inheriting from type and immutable instead? It keeps the abstract keyword reserved for grouping types. It will also be cleaner if an immutable can't inherit from a type. Will it cause trouble somewhere if we have an abstract and a concrete type with the same name?

I like the first approach, but it is very, very slow, unfortunately. We definitely cannot allow inheriting from type or immutable. The fact that concrete types are final is crucial. Otherwise when you write Array{Complex{Float64}} you can't store them inline because someone could subtype Complex and add more fields, which means that the things in the array might be bigger than 16 bytes. Game over for all numerical work.
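The inline-storage argument can be checked directly. This is a minimal sketch in current Julia syntax (struct and abstract type ... end, which postdate this thread); the name BiggerComplex is hypothetical and exists only to show that the definition is rejected.

```julia
# Complex{Float64} is concrete, immutable and pointer-free, so its size is fixed.
is_concrete = isconcretetype(Complex{Float64})
is_bits     = isbitstype(Complex{Float64})
elem_bytes  = sizeof(Complex{Float64})        # two Float64 values

# Because no subtype can add fields, a vector stores its elements inline:
# the data of a 4-element vector occupies exactly 4 * elem_bytes bytes.
zs = Vector{Complex{Float64}}(undef, 4)
data_bytes = sizeof(zs)

# Subtyping a concrete type is refused when the definition is evaluated.
subtyping_rejected = try
    @eval struct BiggerComplex <: Complex{Float64}
        extra::Float64
    end
    false
catch
    true
end
```

If BiggerComplex were accepted, an element of the vector could need more than 16 bytes and the inline layout would no longer be possible.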

That is a good point. It will be too hard to know if Complex should be interpreted as an abstract or concrete type when it is used as a type parameter.

What about this?

abstract type Foo
    x::Int
    y::String
end

That does not introduce a new keyword, and it does not make old code break.

Very nice idea. So far that seems perfect.

That also potentially allows abstract immutable, which could require all subtypes to be immutable.

very cool

Yes, I like that idea. We can also allow abstract Foo end (optional for now), eventually require the end, and allow leaving out the type like we do with immutable. Or maybe we just leave it the way it is.

I'm unreasonably excited about this :)

New language features are like Christmas.

more support for this from https://groups.google.com/forum/#!topic/julia-users/6ohvsWpX6u0

(you're doing an amazing job here - i can't believe how far you've got and how good this is...)

There is a small question of how to handle constructors with this feature. The obvious thing is for it to behave as if you simply copy & pasted the parent type's fields into the new subtype declaration. However, this creates extra coupling, since changing the parent type can require changes to all subtype code:

abstract type Parent
    x
    y
end

type Child <: Parent
    z

    Child(q) = new(x, y, z)
end

The Child constructor has to know about the parent fields. Bug or feature?

It's very non-local, which I don't care for. One thought is that the subtype would have to repeat the declaration and match it. I know that's not very DRY but it's an immediate, easy-to-diagnose error when it happens, and it means that the child declaration is completely self-contained. The point of having fields in the abstract type declaration is to allow the compiler to know that anything of that type will have those fields and know what offset they're at so that you can emit efficient generic code for accessing those fields for all things of that type without needing to know the precise subtype. I don't think the feature is really about avoiding typing fields.

Isn't this a natural coupling which you will always have if you change a parent type?

Maybe it would be cleaner to have: Child(q) = new(Parent(x,y),z) i.e. the parent has to be the first value for new (and outer constructors as well).

It's an interesting point that all the value is in making sure the fields are there. Avoiding typing them is much less important.

> Maybe it would be cleaner to have: Child(q) = new(Parent(x,y),z) i.e. the parent has to be the first value for new (and outer constructors as well).

This occurred to me also, but something about it doesn't quite feel right.

I like @tknopp's idea, but the syntax must be improved. There is also the issue that the Parent constructor needs to get a pointer from the child constructor to know where to initialize itself.

I am also not sure which form I like more. Introducing the super constructor for abstract types makes it more complicated. But it provides a better distinction which fields are from the parent and which are from the child.

What does Parent(x,y) return?

Child(q) = Parent(x, y)(z) - Parent() returns a thing like new()?

But what kind of thing? There are no instances of abstract types (or they wouldn't be abstract) so what type of object does it return?

I think all this is syntactic sugar and would have to be rewritten into the initially proposed form by the compiler. The Child(q) = new(Parent(x,y),z) syntax says a little more explicitly that there is a Parent type nested into the Child type. It still reads like the members would be represented in memory.

But again, I am also not sure if this is worth it. The syntax proposed by @JeffBezanson is also fine and quite natural.

A possibility might be to support simple construction of the Child if Parent has default values for its members.

abstract type Countable
    c::Int = 0
end
type Object <: Countable; a; b; end

Now Object can be instantiated in an inner constructor by new(c,a,b) or new(a,b).

Yeah, that's the thing. It looks like a function call but isn't at all, which is why I don't care for it. If you're going to do that, you may as well just write new(x,y,z), which is shorter and isn't just pointless syntax.

We don't allow for default values of fields at this point. Constructors can have defaults but fields don't have default values. The idea of allowing new(a,b,c) or new(b,c) won't work because we also allow new(a,b) to not assign a value to c.
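Both halves of that statement can be illustrated with a small sketch in current Julia syntax (mutable struct replaces the thread's type keyword). The Object type mirrors the earlier example; the two constructors are hypothetical.

```julia
mutable struct Object
    c::Int
    a
    b

    # The default for c lives in the constructor signature, not in the field list.
    Object(a, b, c::Int = 0) = new(c, a, b)

    # new may be called with fewer arguments than fields;
    # the trailing fields a and b are then left unassigned.
    Object(c::Int) = new(c)
end

full    = Object("first", "second")   # c falls back to 0
partial = Object(7)                   # a and b are not yet defined

a_is_set = isdefined(partial, :a)     # false until a is assigned
partial.a = "late"                    # assignment is allowed afterwards
```

Since new(c) already means "initialize only the first field", a call such as new(a, b) cannot also mean "skip the defaulted parent field".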

Maybe this is obvious to everyone but it's good if whatever solution is chosen plays nicely with multiple inheritance in the, possibly distant, future.

multiple inheritance sounds like it would eventually require support for field renaming, which suggests repeating fields. in that case, could there be a macro that copies the values for the simple case? so

abstract Foo
    bar::Int
end
type Bar <: Foo
    @fields_from Foo   # equivalent to bar::Int
    baz::Float64
end

About the value of fields in abstract types: I think that we should aim higher than to just use this to make sure that the fields are there. To me, it would seem that one of the most useful parts of this would be to let the abstract super type support some given abstraction in a way that the subtypes don't have to know about. It should be possible to change the super type's implementation of that abstraction without having to change the sub types (as long as they only rely on the abstraction, and not the actual inherited fields etc.). Or maybe that it is too much to aim for; there's no way to completely avoid name clashes between fields from the super type and sub types if they don't know about each others' implementations. I guess what we really need to figure out is what kind of usage this feature is meant to support.

I don't think that there are many practical examples where parent and child type are so loosely coupled that the parent fields can be constructed without feeding them through the child constructor.

Thinking more about the syntax for base field initialization, I think that it would be very useful to have a way to define a dedicated constructor for the parent type. If base field initialization is non-trivial, this would reduce code duplication. It should, however, be optional to call this parent constructor in the child constructor, in order to be able to override the default behavior.

Stop me if I'm getting off on a whole other topic, but how about Type Factory and Static Members? I was thinking of how to make a type to represent Decibels. The frustrating thing is there are many slightly different, related definitions of decibels. I'd like the datum to carry around which definition of Decibel it is using as part of its type, without having to copy all of the information for each one. For example, one definition of Decibel is dBm:

Units: "MilliWatts"
Base: 10
Scale: 10
Reference: 1 mW

You convert a quantity, Q of MilliWatts to dBm with 10*log(10, Q/1.0).

But then you also have dBV. It works similarly except:

Units: Volts
Base: 10
Scale: 20
Reference: 1 Volt

And so the conversion is 20*log(10.0, Q/1.0).

There are literally dozens of such definitions which could easily share code: http://en.wikipedia.org/wiki/Decibel

A Type Factory is a function that returns types. I don't think Julia has this. A Static Member is a value associated with the type itself that need not be repeated for each instance. I don't think Julia has this either. But you can imagine:

type Decibel{T}
    static units
    static scale
    static base
    static reference
    value
end

function decibelfactory(param, units, scale, base, reference)
    return Decibel{param}(units, scale, base, reference)
end

dBm = decibelfactory(:dBm, "MilliWatts", 10.0, 10.0, 1.0)

linearize(q::Decibel) = q.scale * log(q.base, q.value/q.reference)

x = dBm(6)

julia> typeof(x)
Decibel{:dBm}("MilliWatts", 10.0, 10.0, 1.0)

julia> linearize(x)
7.781512503836435
pao commented

> A Type Factory is a function that returns types. I don't think Julia has this.

Early versions of what is now StrPack.jl dynamically created types using a macro, so this is entirely possible to do.
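Besides generating types with a macro, the Decibel idea can be approximated with a symbol type parameter and methods on the type. This is only a sketch in current Julia syntax, not what StrPack.jl did; every name here (Decibel, db_units, db_scale, db_base, db_reference, to_decibel) is hypothetical.

```julia
# The symbol parameter S selects the definition; only the value is stored per instance.
struct Decibel{S}
    value::Float64
end

# "Static members" become methods on the type, defined once per definition.
db_units(::Type{Decibel{:dBm}}) = "MilliWatts"
db_scale(::Type{Decibel{:dBm}}) = 10.0
db_units(::Type{Decibel{:dBV}}) = "Volts"
db_scale(::Type{Decibel{:dBV}}) = 20.0

# Values shared by both definitions are written once for all Decibel types.
db_base(::Type{<:Decibel}) = 10.0
db_reference(::Type{<:Decibel}) = 1.0

# Convert a linear quantity q into the requested decibel type.
function to_decibel(::Type{D}, q::Real) where {D<:Decibel}
    return D(db_scale(D) * log(db_base(D), q / db_reference(D)))
end

# The "factory" result is just a type alias.
const dBm = Decibel{:dBm}
x = to_decibel(dBm, 6)
```

Here typeof(x) is Decibel{:dBm}, and adding another definition means adding two one-line methods rather than a new struct.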

I guess we can get "tempted" to start doing

abstract type AbstractParent
  x
end
type Parent <: AbstractParent
  # x is inherited
end

abstract type AbstractChild1 <: AbstractParent
  # x is inherited
  y
end
type Child1 <: AbstractChild1
  # x is inherited from parent
  # y is inherited
end

abstract type AbstractChild2 <: AbstractParent
  # x is inherited
  z
end
type Child2 <: AbstractChild2
  # x is inherited from parent
  # z is inherited
end

to achieve some kind of inheritance from concrete types...

I have asked myself what the difference is between an abstract type with fields and the ability to inherit from a concrete type. It seems they are almost equal but the abstract type with fields needs one trivial concretization.

They differ in that you can specify fields and collections that only allow that "trivial concretization", whereas there is no way to express that when the abstract type and the trivial concretization are condensed to the same thing.

If you could inherit from concrete types, then I could (1) construct an array of Float64, (2) define a new subtype of Float64, (3) try to store it into the array. We don't want to allow that. Of course that could also be achieved by having some types be "final", but we felt it was simpler to make all concrete types final rather than have an extra keyword like that.

When I first started learning Julia I felt it badly lacked some features (classes, inheritance) I was used to from Python and C++. However, I soon started to love the simple and clean, yet powerful, ways to code in Julia.

Now I can't even figure out a greater benefit from having abstract fields, apart from constant field offsets for all subtypes, as pointed out by @StefanKarpinski. I really like APIs relying only on methods, thus hiding internals and fields. But it seems that the benefit from constant field offsets is lost if methods are used to access fields.

E.g., imagine we read an API with some abstract class like

abstract type A
  n::Int
end
size(a::A) = a.n

It can lead us to suppose that size() will always return field n, allowing inlining, and benefiting from constant offset efficient code for all subtypes, right?

Of course not, e.g.:

type B <: A
  # n::Int is inherited
end
size(b::B) = b.n^2

If we have a function which takes an array of values of different subtypes of A and iterates over them:

arrayofA = A[B(1), B(2), _and_other_concrete_subtypes_of_A_...]

for a in arrayofA
          # if we do
  size(a)
          # size() can have been specialized, so no inlining and
          #  no constant offset efficient code here

          # if instead we do
  a.n
          # it can benefit from constant offset optimizations
          # however, it can yield different results from
          #  specialized `size()` functions
end

So either we stick to methods, getting no real advantage from efficient offset access code (thus, no advantage from abstract fields), or we start using fields directly, losing the hiding of internals...
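The divergence between the accessor method and the raw field can be reproduced today without the proposed feature. A minimal sketch in current Julia syntax, where each subtype declares n itself and the hypothetical function nitems stands in for size():

```julia
abstract type A end

struct B <: A
    n::Int
end

struct C <: A
    n::Int
end

nitems(a::A) = a.n       # generic method, relies on the field being present
nitems(b::B) = b.n^2     # a subtype is free to specialize it

items = A[B(3), C(3)]

via_method = [nitems(a) for a in items]   # dispatches on the concrete type
via_field  = [a.n for a in items]         # bypasses any specialization
```

For B the two access paths disagree, which is exactly the trade-off described above.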

DISCLAIMER: I may be missing something and therefore saying a lot of BS :)

One could go even further, and point out that overloading . reduces the need for this feature even more, since I can effectively add read-only fields using

getfield(A::MyAbstractType, ::Field{:x}) = 0

So fields in abstract types really only add something in the rare case where being able to store some piece of state in a value is part of the interface.
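The Field{:x} form above was a proposal at the time. In current Julia the same effect is obtained by overloading Base.getproperty, as in this minimal sketch; the Shape and Square types and the computed area property are hypothetical.

```julia
abstract type Shape end

struct Square <: Shape
    side::Float64
end

# Add a read-only, computed "field" without storing anything extra.
function Base.getproperty(s::Square, name::Symbol)
    if name === :area
        return getfield(s, :side)^2
    else
        return getfield(s, name)   # getfield avoids recursing into getproperty
    end
end

sq = Square(3.0)
stored   = sq.side    # real field
computed = sq.area    # computed on access
```

Note that sq.area is not assignable here, since Square is immutable and no setproperty! method is defined.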

@JeffBezanson I totally get that you need to finalize types in order to have unboxed array content. But with this PR on the table it seems to me that this will allow subtyping "with tricks", and I wonder if it will become a common pattern to define all methods on the "almost concrete" type and then put a trivial concretization on top of that.

In several cases that already happens (e.g. AbstractDataFrame), and is fairly common in OO languages generally. If people want to think of concrete types as just a final declaration, that's fine with me. We're happy as long as "final" types are possible, and that people are encouraged to use them.

I am also not totally sure if the lack of inheritance of concrete types is really an issue. The nice thing would be that one inherits all methods of a parent type. But maybe the cleaner solution is to use a "has a" relation instead of a "is a" relation anyway. But this currently means redefining various methods which feels like a drawback compared to inheritance.
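The cost of the "has a" approach is the forwarding boilerplate, which a short loop can reduce. A sketch in current Julia syntax; the History type and record! function are hypothetical, and the list of forwarded functions is an arbitrary choice.

```julia
# A wrapper that "has a" vector instead of inheriting from one.
struct History
    entries::Vector{String}
end

# Forward a chosen set of Base functions to the wrapped field.
for f in (:length, :isempty, :first, :last)
    @eval Base.$f(h::History) = $f(h.entries)
end

# Behaviour specific to the wrapper is written as ordinary methods.
record!(h::History, msg::String) = (push!(h.entries, msg); h)

h = History(String[])
was_empty = isempty(h)
record!(h, "created")
record!(h, "updated")
```

Only the functions listed in the loop are available on History, so the wrapper's interface stays explicit.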

I am starting to wonder if this feature is really that necessary. It's kind of hard to see what crucial problem it's solving. The main benefits seem to be:

  1. Guarantee that obj.field will always work for all subtypes of an abstract type that has .field. Otherwise someone could define a subtype and forget to have this field, leading to errors.
  2. Guarantee that obj.field is stored at a consistent location in all subtypes.

I actually think that 1. might be an argument against this feature: if we allowed overloading of obj.field then a subtype could define a getter and/or setter for .field instead of actually storing the field and still work fine with the abstract behavior of the super-type. I'm not sure if 2. is actually enough of a benefit to warrant the entire feature.

In fact, you can imagine a subtype being forced to have a .foo field but wanting to override the .foo syntax. Then it would be forced to have a vestigial .foo field that's just wasted and forced on it for no good reason.

Following up on point 1, part of the appeal of abstract types with fields is that they define a kind of interface you know all subtypes will support. For example, linear regression, logistic regression, SVMs and other models will all have a specific weight vector that you'd like to be sure is available. Whether it's through a field or a function is much less important than checking that the implementation satisfies the stated protocol.

To that end, I think that having a formalization of protocols/interfaces is much more useful and important than fields for abstract types, which only addresses a tiny portion of this much bigger issue.

I agree that it's much more important. But there is something nice about the minimal typing required when you have a lot of concrete types that are small variants on a parent abstract type that has almost all of the fields that concrete types will need.

I would say I'm on the "probably don't need this" side. If it's really just saving some typing, I don't think it's worth it. I think not having this in addition to the great feature of no sub-typing concrete types makes for very legible code. I think it's been mentioned many times that one of Julia's strengths is the code readability and having a type's fields not be explicit seems like a bummer to me. Any time I see a type that sub-types, I then have to track down that type's parent, moving up the chain until I finally parse all the fields this type happens to inherit. That or use names(), which also feels a little clunky.

Right, John, I see your point about minimal typing but my stance on this was to still require specifying the fields, which would completely undermine that benefit. I'm still not convinced of any design here that actually would reduce typing at all.

ok, so if we don't need this, how would i solve the problem i had that originally led me to this page?

i am writing an api - say, for genetic algorithms - and i have a data structure (a type) that is passed to several functions. this structure will contain information that the "general" genetic algorithm uses (a population, parameters describing how to breed, etc) and also some customizable information that the api user should add.

so, when i write the library, i know some "part" of this data structure, but not others. other parts will be extended by the library user. for example, they might need to store some parameters that are needed to generate new individuals.

at the function level, i can structure this just fine. i write my "general" functions that do the tasks of breeding, etc. the user writes functions (with a name i choose) that i call from my code when i want to create new instances (for example). the user function is passed this same data structure.

but at the type level, i don't understand how i can do this without what was described here. i need to pre-define some fields, which are used by the library code. the user needs to extend that with other fields after the library is written.

if we don't have fields on abstract types, how do we solve the above elegantly?

[what i ended up doing was adding a single field, which the user can extend. so the user then defines their own structure and stores it there. that works, but it seems clunky to me - for example, it makes extension by two parties at once difficult, since they must agree between themselves what this single extra thing is. but maybe that's the best that is possible in julia. i just wanted to set out a clear example in case people are missing a use case...]

[i am also worried about "Whether it's through a field or a function is much less important..." since functions are not context-dependent in the same way as data (and closures cannot be assigned to package functions, as far as i can see). in other words, if you're calling the library twice, how do you make functions specific to a particular call?]

finally, i think a more formal way of saying the above is that this is the "expression problem" (wadler et al). although i haven't looked at that in some time and may be wrong.

I think the point is that one has to repeat the getter/setter methods for all(!) child types of an abstract type. Using this proposal one only need the abstract type definition with fields and is done.

To @StefanKarpinski's concern about the "lost field" when overriding: I think in most situations the overridden field is still in use. I use properties in C# a lot in the following way:

double myProperty;
double MyProperty
{
  get { return myProperty; }
  set 
  {
    // perform a range check to look if value is in a valid range (e.g. > 0)
    myProperty = value;
    // update some dependent properties
  }
}

In the combination with the GUI toolkit WPF these properties can be bound to GUI elements, which makes it very convenient in practice. Without this, the relevance of this feature might not be so high.

> I think the point is that one has to repeat the getter/setter methods for all(!) child types of an abstract type.

The most specific version of a function gets called for each set of arguments. By defining a method for the abstract type, it will be used for all child types, unless there's an even more specific definition.

As discussed earlier today with @JeffBezanson:

A concrete example for which this could be useful would be to simplify defining methods for Diagonal, Bidiagonal, SymTridiagonal, Tridiagonal, and a future hypothetical Banded(N) matrix types. Each of these matrix types would require a field for the diagonal elements. Most of these would need also at least one sub/superdiagonal field, and quite possibly more.

A hypothetical supertype of these particular matrix types would simplify the implementation of basic linear algebra functions. For example, diag(A,n) should retrieve the appropriate super/sub/diagonal field, or otherwise generate a zero vector of the correct length.

I'll mention that a definition like this is valid, since we don't strictly check things:

diag(x::AbstractDiagonal) = x.d

Then each subtype just has to have a field of that name.

I looked through this entire thread. I am still not convinced that abstract types with fields are necessary. I have worked on a dozen Julia packages and haven't come across a single case where I wanted a super-type to enforce that all subtypes share some common fields.

There are plenty of cases that one would want to enforce that all subtypes can provide information of some sort. However, all these can be done through (multi-dispatch) methods instead of fields. Requiring methods to be implemented is far more flexible than requiring the presence of a particular set of fields.

The following example should illustrate this point. A common piece of information that all kinds of matrices should provide is the number of rows and columns. Then, should we do the following?

# This forces all subtypes of AbstractMat to have fields nrows & ncols
abstract AbstractMat
    nrows::Int   
    ncols::Int 
end

Of course not. There are numerous ways to represent the shape, and using two integers is just one of them. For example, I can use a tuple or a vector, etc, or if I want to implement a SquareMatrix type, I can just use one integer to represent the shape. I think the current Julian way does this right -- it requires the size(a, d) method to be implemented instead of requiring what fields should be present.
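The method-based alternative can be sketched as follows, in current Julia syntax. The three concrete types are hypothetical and deliberately store their shape differently; generic code only ever calls size.

```julia
abstract type AbstractMat end

struct DenseMat <: AbstractMat     # shape stored as two integers
    nrows::Int
    ncols::Int
end

struct SquareMat <: AbstractMat    # one integer is enough
    n::Int
end

struct TupleMat <: AbstractMat     # shape stored as a tuple
    dims::Tuple{Int,Int}
end

# Each subtype implements the required method in its own way.
Base.size(m::DenseMat)  = (m.nrows, m.ncols)
Base.size(m::SquareMat) = (m.n, m.n)
Base.size(m::TupleMat)  = m.dims

# Generic code is written once, against the method rather than the fields.
Base.size(m::AbstractMat, d::Integer) = size(m)[d]
nelements(m::AbstractMat) = prod(size(m))

counts = [nelements(m) for m in AbstractMat[DenseMat(2, 3), SquareMat(4), TupleMat((2, 5))]]
```

None of the subtypes is forced to carry nrows and ncols fields it does not need.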

To me, fields are almost always about implementation details. Interface should be expressed using methods. Abstract types with fields are kind of making the fields part of the programming interface (API). I am yet to be convinced that this is a good idea.

Using fields in abstract types also makes things unnecessarily complicated. What if the subtype wants the fields to be of different types than those declared in the abstract type?

People mentioned the usefulness to allow properties to be inherited. I agree with this.

However, properties are more like methods than fields.

In terms of saving typing, one can always use macros. If you find yourself writing a lot of types that share a subset of fields, you can write a macro to generate those shared parts so that you don't have to repeat them many times.
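One possible shape for such a macro, as a sketch in current Julia syntax. The names FOO_FIELDS and @with_foo_fields are hypothetical, and the macro simply splices a fixed list of field declarations at the top of a struct definition.

```julia
abstract type Foo end

# The fields every subtype of Foo is meant to start with.
const FOO_FIELDS = [:(x::Int), :(y::String)]

macro with_foo_fields(typedef)
    (typedef isa Expr && typedef.head === :struct) ||
        error("@with_foo_fields expects a struct definition")
    body = typedef.args[3]                        # block holding the field declarations
    prepend!(body.args, map(copy, FOO_FIELDS))    # shared fields come first
    return esc(typedef)
end

@with_foo_fields struct Bar <: Foo
    baz::Float64
end

bar = Bar(1, "one", 2.0)
```

The expanded definition is an ordinary struct, so fieldnames(Bar) lists the shared fields followed by baz, and the default constructor takes them in that order.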

I now agree this is a kind of marginal feature. It's easy to misuse; as you point out it's undesirable for read-only properties.

@timholy I wanted to come up with an example like the one that @JeffBezanson provided, which I thought would not be possible. Is that pattern used anywhere in the Julia source code? In combination with field overloads this could be very interesting. One could provide default implementations on abstract types that make certain assumptions about the fields available. A concrete type either has to provide the field or provide an equivalent field overload.

I kind of agree with @lindahua that currently methods are used as public interfaces while fields are implementation details. Making fields overloadable can break this view. Then the fields/properties can become part of the interface. In C# the usual convention is that properties start with an upper case letter to make clear that they are part of the interface.

I am actually not totally sure if we need properties in Julia. The nice things about them are a) the dot syntax and b) that getter and setter have the same name. b) might not be that important in Julia, as we have this nice ! notation. So one could define properties as

wheel( car ) # gets the wheel
wheel!( car, anotherWheel ) #sets the wheel in car

The dot syntax seems to be a really huge deal for many people. It is arguably one of the most popular bits of syntax among all modern languages. Modern languages need to support dot-oriented programming :)

Well, from my point of view it is a plus that Julia does not support the dot syntax for member functions. But fields and properties are a different thing: these are things that definitely belong to an object. On the other hand, it would be consistent to not allow field overloads and do all getters/setters with methods like I outlined above. Then one has a cleaner separation between what is an interface and what is an implementation detail.

I agree that many properties should be methods, like size(x). I would like to add dot overloading, but I don't want to see a profusion of things like x.size as a replacement for these.

If something like

getfield(A::MyAbstractType, ::Field{:x}) = 0

can be done, then nothing stops one from putting this at the beginning of the code

getfield{S}(o::Any, ::Field{S}) = @eval $S($o)

and do

a = [1 2; 3 4]
a.size

everywhere.

How I tested it:

abstract Field{S}
getfield{S}(o::Any, ::Type{Field{S}}) = @eval $S($o)
a = [1 2; 3 4]
getfield(a, Field{:size})

(When I first arrived at Julia, I tried to do something like that, because I thought Julia's dot syntax was broken :D)

That is the danger. The C# developer in me wants it but it will break the view that fields are implementation details. Maybe it needs a more well defined use case. I think @stevengj wanted this for pycall.

That hack fills me with dread. Not to mention that getfield(::Any, ::Field) = 0 would probably just break the whole system.

> @timholy I wanted to come up with an example like the one that @JeffBezanson provided, which I thought would not be possible. Is that pattern used anywhere in the Julia source code? In combination with field overloads this could be very interesting. One could provide default implementations on abstract types that make certain assumptions about the fields available. A concrete type either has to provide the field or provide an equivalent field overload.

If you give it a try, you'll see it works. (Images uses this technique.) There is no compile-time guarantee that the fields are there, but if they are not you'll get a clear run-time error, and to me that seems adequate.
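A small sketch of that behaviour in current Julia syntax. The types MyDiagonal and Forgetful and the function maindiag are hypothetical; the exact exception type and message for a missing field depend on the Julia version, so the sketch only records that an error was raised.

```julia
abstract type AbstractDiagonal end

struct MyDiagonal <: AbstractDiagonal
    d::Vector{Float64}
end

struct Forgetful <: AbstractDiagonal   # has no field named d
    entries::Vector{Float64}
end

# Written once against the abstract type; assumes a field d exists.
maindiag(x::AbstractDiagonal) = x.d

ok = maindiag(MyDiagonal([1.0, 2.0]))

# A subtype without the field fails at the first access, not at definition time.
failed = try
    maindiag(Forgetful([1.0, 2.0]))
    false
catch
    true
end
```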

> I think @stevengj wanted this for pycall.

Yes, and I want it for the same reason in JavaCall.

I had a few different implementations of an OrderedDict which were only slight modifications of Base.Dict. One version (#2548) of this added an AbstractDict class above Dict and OrderedDict, which basically assumed that most of the current fields of Dict existed, and added two or three more for OrderedDicts. Jeff didn't like that at the time (see his comment in #2548), although without allowing fields in abstract types, that would probably be the most efficient way forward.

I have another data point for a use case where this would be very handy. I think that the general concept is when the operations defined on the abstract type require some state to be attached to the object (which is what @JeffBezanson mentioned above).

In AudioIO.jl the audio processing is implemented by creating a graph of AudioNode subtypes that each implement their own render function to generate audio (e.g. SinOsc <: AudioNode renders a sinusoid, AudioMixer <: AudioNode calls the render function on all of its inputs and mixes them together). I wanted to enable waiting on AudioNodes, so I implemented Base.wait(node::AudioNode), which waits on a Condition stored with the object. In order to do this I had to track down all the concrete subtypes and add the condition field to all of them. That's manageable if they're all implemented in this module, but as the number of AudioNode types grows and possibly becomes split across different libraries it's infeasible to have to go in and add a field to all of them.

Allowing fields on abstract types seems like a win in this case, but there are alternatives:

  1. Do what I'm doing now, which is to manually define all required fields in each subtype. This is error-prone, it's easy to forget one, and it is especially problematic across libraries.
  2. Create an AudioNodeState type, and require all subtypes to have a field node_state::AudioNodeState. That way, if I add behavior to AudioNode that requires some state, I can add it to the AudioNodeState definition. This actually seems like a pretty good solution that's conceptually simple and explicit. There's only one thing for subtype implementers to remember, and if they forget to add the field it will get found out the first time any state is accessed.
  3. Add a macro that defines the proper fields. This feels more magical than option 2 and doesn't seem to gain much.
  4. Make AudioNode a concrete parametric type with the specific renderer contained as a field within, like
type AudioNode{T <: AudioRenderer}
    cond::Condition
    active::Bool
    renderer::T
end

This feels a little heavy/complicated, but probably worth trying on for size.
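To try option 4 on for size, here is a sketch in current Julia syntax (mutable struct replaces type). It is not AudioIO.jl's actual code: SinOsc is reduced to a single frequency field, and the render signature, the samplerate default and stop! are hypothetical.

```julia
abstract type AudioRenderer end

# Shared state lives in the one concrete node type; the renderer is a type parameter.
mutable struct AudioNode{T<:AudioRenderer}
    cond::Condition
    active::Bool
    renderer::T
end

# The shared state is created here, once, for every renderer type.
AudioNode(renderer::AudioRenderer) = AudioNode(Condition(), true, renderer)

struct SinOsc <: AudioRenderer
    freq::Float64
end

# Renderer authors only implement render for their own type.
render(r::SinOsc, nsamples::Int; samplerate::Float64 = 44100.0) =
    [sin(2π * r.freq * i / samplerate) for i in 0:nsamples-1]

# Behaviour that needs the shared state is written once against AudioNode.
render(node::AudioNode, nsamples::Int) =
    node.active ? render(node.renderer, nsamples) : zeros(nsamples)
stop!(node::AudioNode) = (node.active = false; node)

node = AudioNode(SinOsc(440.0))
playing = render(node, 8)
stop!(node)
silent = render(node, 8)
```

Adding a new field to AudioNode later does not require touching any renderer type, which is the maintenance problem described above.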

Given that we have a couple of seemingly pretty-good options, I'm actually less convinced than I was before that fields on abstract types are the right fix for this problem. It seems like having fewer patterns to choose from is a good thing, and I definitely agree with @karbarcca that the locality and explicitness of Julia type declarations makes the code a lot easier to read.

Thanks @StefanKarpinski and @JeffBezanson for the discussion and ideas today, which helped to crystalize a lot of this.

I think the 2nd and the 4th alternatives are the most Julian ones. And for this particular case, the 4th option seems to be the most meaningful.

Somehow I hadn't seen this thread.

The dot syntax seems to be a really huge deal for many people. It is arguably one of the most popular bits of syntax among all modern languages. Modern languages need to support dot-oriented programming :)

I somewhat feel like most of the requests for it are from people writing inter-op code for these other languages. (although this is off-topic for this thread)

I now agree this is a kind of marginal feature. It's easy to misuse; as you point out it's undesirable for read-only properties.

Let's close this issue then. I've never felt it would be a significant savings in any of my code. And it further confuses the difference between the "thing" -- a type -- and the behavior -- the abstract. If anything, I would propose trying to make those more distinct (but that's a different topic for later).

Also late to the party :)

One of the common use-cases put forward for fields on abstract types is to provide some base data and implementation that user code extends by derivation.

But instead of derivation, isn't the Julian way to do this to make a generic type, with the part the user adds being a type parameter?

Then you separate the basic functionality and the extension parts cleanly, you properly express that the basic functionality can't be used without the extended functionality, you create an appropriate concrete type when it's extended, and you even save typing :)
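A small sketch of that idea, in current Julia syntax, might look like the following. All names here (`Node`, `SinRenderer`, `SilenceRenderer`, `render`, `deactivate!`) are hypothetical and chosen only to echo the audio example earlier in the thread.

```julia
# Hypothetical renderer types: the "extension" part supplied by the user.
abstract type AudioRenderer end

struct SinRenderer <: AudioRenderer
    freq::Float64
end

struct SilenceRenderer <: AudioRenderer end

# The base part is written once; the user part is the type parameter R.
mutable struct Node{R<:AudioRenderer}
    active::Bool
    renderer::R
end
Node(r::AudioRenderer) = Node(true, r)

# Shared behavior works for every Node, whatever the renderer.
deactivate!(n::Node) = (n.active = false; n)

# Specialized behavior dispatches on the type parameter.
render(n::Node{SinRenderer}, t::Real) =
    n.active ? sin(2pi * n.renderer.freq * t) : 0.0
render(n::Node{SilenceRenderer}, t::Real) = 0.0
```

`Node` cannot be constructed without a renderer, which expresses that the base functionality is unusable on its own, and each `Node{R}` is a distinct concrete type.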

instead of abstract types with fields, what if this were reversed: concrete types with inheritance? same underlying machinery, but the user can choose whether to inherit the fields or replace them (extending the fields is not allowed. abstract immutable is not allowed). this avoids the two pitfalls of: forcing the user to have fields they don't actually need and constructor dependencies.

abstract type A
  field1
  field2
end

type B <: A
  field1
  field3
  field4
end

type C <: =A

A(1,2)
B(1,3,4)
C(1,2)

inner constructors would be inherited also

i think this would make wrapping Gtk.jl much nicer and more user-friendly, since Gtk has many of these inheritable concrete types. On the Julia side, most of these simply have a handle::Ptr{GObject} field (and an identical constructor), but a few have something else.

An example of how this will be useful is in creating an abstract AbstractTimeSeries.

abstract type AbstractTimeSeries{T,N}
  timestamp
  values::Array{T,N}
  colnames
  # inner constructor enforcing invariants
end

This makes creating custom time series types much simpler.

immutable FinancialTimeSeries{T<:Float64,N} <: AbstractTimeSeries
  # 3 fields plus inner constructor for free
  instrument::Stock
end

type OrderBook{T<:ASCIIString,2} <: AbstractTimeSeries
  # 3 fields plus inner constructor for free
  instrument::Stock
end

type Blotter{T<:Float64,2} <: AbstractTimeSeries
  # 3 fields plus inner constructor for free
  instrument::Stock
end

type FinancialPortfolio{T<:Float64,2} <: AbstractTimeSeries
  # 3 fields plus inner constructor for free
  blotters::Vector{Blotter}
end

type FinancialAccount{T<:Float64,2} <: AbstractTimeSeries
  # 3 fields plus inner constructor for free
  portfolios::Vector{FinancialPortfolio}
end

Though I likely mucked up the syntax, the basic idea is that new custom time series types are easy to construct, and new fields can be added.

Would we allow new inner constructors in the derived type?
Would new in the derived type invoke an inner constructor of the base type?

Yes, I would think it useful to add invariants, but every derived type would at least have the abstract invariants: the length of the time array matches size(values,1), the length of colnames matches size(values,2), and dates must be sequential and in descending order.
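For comparison, the invariants listed above can already be enforced today with an inner constructor on a concrete core type that other types hold as a field. The sketch below uses current Julia syntax; `TimeSeriesCore`, the `Date` timestamps, and the `String` instrument are assumptions made for illustration and do not come from any existing package.

```julia
using Dates

# Hypothetical core type holding the three shared fields.
struct TimeSeriesCore{T,N}
    timestamp::Vector{Date}
    values::Array{T,N}
    colnames::Vector{String}

    # Inner constructor: every instance must satisfy the shared invariants.
    function TimeSeriesCore(timestamp::Vector{Date}, values::Array{T,N},
                            colnames::Vector{String}) where {T,N}
        length(timestamp) == size(values, 1) ||
            throw(ArgumentError("timestamp length must match size(values, 1)"))
        length(colnames) == size(values, 2) ||
            throw(ArgumentError("colnames length must match size(values, 2)"))
        all(timestamp[i] > timestamp[i+1] for i in 1:length(timestamp)-1) ||
            throw(ArgumentError("timestamps must be strictly descending"))
        new{T,N}(timestamp, values, colnames)
    end
end

abstract type AbstractTimeSeries end

# A custom series wraps the validated core and adds its own field.
struct FinancialTimeSeries{T,N} <: AbstractTimeSeries
    core::TimeSeriesCore{T,N}
    instrument::String
end
```

Because a `TimeSeriesCore` cannot exist without passing the checks, any type that stores one gets the shared invariants, and it remains free to add further checks in its own constructor.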

just to be clear, my proposal explicitly disallows partial inheritance of fields. it is strictly limited to selectively allowing concrete types to be used as abstract type names. (with special syntax to indicate that the derived class has exactly the same fields and constructors as the original type)

I think that we have considered abstract types with fields in the context of trying to solve a number of different problems, but looking at this discussion it seems to me that if they should be used for anything, it should be to address the cases where you would currently create either

  • a parametric type with a specialization field (the subtype's data is a field in the supertype), or
  • a common type that would be used for the same named field in each of the subtypes (the supertype's data is a field in each subtype)

(alternatives 4 and 2 above from @ssfrr respectively), because you want to create a family of types with some common storage and behavior.

Once upon a time, there was an idea to disallow field access such as obj.x in all cases except when the concrete type of obj was statically known. If you didn't know the type of obj when you wrote the code, how could you know what its fields signify?

The way I understand it, this restriction was not implemented because it was deemed too useful to be able to have a family of types containing some same named fields. But this seems to be exactly what abstract types with fields would address!

How about restricting field access like obj.x to cases where it is statically known that obj.x exists? The statically known type of obj need not be concrete, as long as the fields in question would exist in it.

To ensure the separation between abstract type and subtypes, access to the fields of the abstract type could require that that type be used, not just a subtype of it. This would give a complete namespace separation between the fields of the supertype and subtype. It could go a long way to avoiding the fragile base class problem, by forcing a proper interface between supertype and subtype.

Of course, it remains to be defined what the statically known type of an expression would mean.

dpo commented

+1000

Once we make a.b syntax overloadable then it will be just like writing f(a,Field{:b}). We never statically disallow generic function application, but this would be the only place in the language where we do that? That doesn't really make any sense.

I will readily admit that I'm not at all convinced that this is the way forward. What troubles me most is introducing and defining an entirely new mechanism in the language. Still, I find the idea interesting enough to investigate it a bit further:

I agree that we shouldn't statically disallow generic function application. Another way would be that, instead of having a.b as syntactic sugar for getfield(a, Field(:b)), to make it stand for

getfield(a, Field(:b), static_type_of_a)

where the last argument gives the static type that the field is looked up in. Actual fields would correspond to signatures like

getfield(a::A, ::Field{:b}, ::Type{A}) = ...  # implicitly created for field b in type A

so that they could only be accessed using the same type as they were defined in.

Properties, on the other hand, could and probably would be made to apply for a range of static types, i.e.

getfield(a::A, ::Field{:myproperty}, ::Type) = ...
getfield{T<:MySuperType}(a::A, ::Field{:myproperty}, ::Type{T}) = ...

to make them available regardless of static type, or given any static type <:MySuperType respectively.
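The dispatch pattern described here can be tried out in ordinary Julia with an explicit third argument standing in for the static type. In the sketch below, `Field`, `lookup`, `Shape`, and `Circle` are hypothetical stand-ins (nothing here is Base API or part of the proposal's actual design), and the static type is passed by hand rather than supplied by the compiler.

```julia
# Hypothetical singleton tag type naming a field or property.
struct Field{name} end
Field(name::Symbol) = Field{name}()

abstract type Shape end
struct Circle <: Shape
    r::Float64
end

# An actual field: reachable only when the static type given is exactly Circle.
lookup(c::Circle, ::Field{:r}, ::Type{Circle}) = getfield(c, :r)

# A property: reachable through any static type that is a subtype of Shape.
lookup(c::Circle, ::Field{:area}, ::Type{T}) where {T<:Shape} =
    pi * getfield(c, :r)^2
```

Calling `lookup(c, Field(:r), Shape)` has no matching method, which mirrors the idea that a field is an implementation detail hidden behind the supertype, while `lookup(c, Field(:area), Shape)` succeeds because the property was declared for the whole range of static types.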

This is still different from the mechanisms that exist now. The reason to make it different would be that access to actual fields (not properties) is different - it is an implementation detail and not an interface. But I am still not sure whether it would be worth it.

That's why this whole issue gives me pause. Inheritance in Julia is about behavior, not structure. And that's a good thing. Conflating inheritance of behavior and inheritance of structure is precisely the mistake that C++, Java, et al. have made and it causes all sorts of problems. It may make sense to introduce some mechanism for structural inheritance in Julia, but I don't think it should be confused with, or tied to, behavioral inheritance.

Yes, I think you are right. If it shouldn't be tied to behavioral inheritance, then I guess it shouldn't be tied to subtyping at all. It would be interesting if we could come up with a way to address the family-of-similar-types problems that have been discussed here that is not tied to subtyping, but I suppose that this issue is not the forum to discuss it.

The distinction between inheriting behavior and inheriting structure is useful in navigating how to think about this problem. I was certainly getting lost in the thread before you spelled it out that way.

From my point of view this PR is closely related to #1974 and #5. If we decide field overloading should not be done, this PR makes a lot more sense.
While I agree about the behavior vs. structure thing, there are cases like Gtk.jl, as Jameson mentioned, where abstract types with fields are handy. But #4935, #1974 and #5 should be seen in a shared context. Abstract multiple inheritance and a way to define (and check) interfaces in a formal way are IMHO the most important of these issues.

I'd like to point out that the decision to disallow concrete type inheritance may also be supported by the fact that in the C++ world inheriting from concrete types is considered conceptually problematic, too:
"Item 33: Make non-leaf classes abstract"
http://ptgmedia.pearsoncmg.com/images/020163371x/items/item33.html

@pbazant Coming from the C++ world I disagree. The design the author chooses in that article doesn't look like a good design to me, but I don't see how it argues against inheriting from concrete types.

There are plenty of places in the standard libraries, semi-official libraries (e.g. Boost), and others which use inheritance.

In fact if you don't allow inheritance from concrete types you essentially have Java's interfaces. There are many complaints you can find for why this can make for bad design.

I understand the arguments against multiple inheritance with concrete types. That's why some languages like Ruby, Scala, and Swift have some concept of mix-ins.