allow overloading of a.b field access syntax

Question

StefanKarpinski opened this issue 11 years ago · 249 comments

Brought up here: #1263.

+1

Answer 1 · 2013-01-10T07:15:36.000Z

The ability to use dots as syntactic sugar for mutator/accessor methods would be nice for lots of things. I've always appreciated this in languages that provide it, so that you can turn structure fields into more complicated abstractions without breaking the API.

Answer 2 · 2013-05-06T01:42:43.000Z

I have an absolutely awesome way to implement this.

Answer 3 · 2013-05-06T01:56:28.000Z

Interested in talking about it? I know that Tom Short is really interested in having this for DataFrames, although I've come to be increasingly skeptical about the wisdom of using this feature.

Answer 4 · 2013-10-22T22:58:53.000Z

This would make calling Python code (via PyCall) significantly nicer, since currently I'm forced to do a[:b] instead of a.b.

Answer 5 · 2014-01-13T19:27:17.000Z

@JeffBezanson, any chance of having this for 0.3? Would be great for inter-language interop, both for PyCall and for JavaCall (cc @aviks).

Answer 6 · 2014-01-13T21:29:44.000Z

@JeffBezanson if not, is there any chance you could give some direction on how you want this implemented? (I have an absolutely awesome way to implement this.)

Answer 7 · 2014-01-13T22:02:39.000Z

In my experience, there is no faster nor surer way to get Jeff to implement something than to implement a version of it that he doesn't like ;-)

Answer 8 · 2014-01-14T00:01:41.000Z

The basic idea is that you implement

getfield(x::MyType, ::Field{:name}) = ...

so that you can overload it per-field. That allows access to "real" fields to keep working transparently. With suitable fallbacks getfield(::MyType, ::Symbol) also works.

The biggest issue is that modules have special behavior with respect to the dot operator. In theory, this would just be another method of getfield, but the problem is that we need to resolve module references earlier since they basically behave like global variables. I think we will have to keep this a special case in the behavior of the dot operator. There is also a bit of a compiler efficiency concern, due to analyzing (# types) * (# fields) extra function definitions. But for that we will just see what happens.
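A minimal sketch of the per-field dispatch idea described above. The names Field and fieldaccess are hypothetical stand-ins chosen so the snippet runs without touching the built-in getfield; this is not an existing Julia API, only an illustration of how per-field methods and a Symbol fallback could fit together.

```julia
# Hypothetical names: `Field` and `fieldaccess` are for illustration only.
struct Field{name} end

struct Celsius
    degrees::Float64
end

# Generic fallback: a Field{name} lookup reads the real field of that name.
fieldaccess(x, ::Field{name}) where {name} = Core.getfield(x, name)

# Symbol-based entry point that forwards to the per-field methods.
fieldaccess(x, name::Symbol) = fieldaccess(x, Field{name}())

# Per-field overload: a computed "field" that is not stored in the struct.
fieldaccess(x::Celsius, ::Field{:fahrenheit}) = Core.getfield(x, :degrees) * 9 / 5 + 32

t = Celsius(100.0)
stored = fieldaccess(t, :degrees)       # goes through the generic fallback
computed = fieldaccess(t, :fahrenheit)  # goes through the per-field overload
```

Real fields keep working through the fallback, while one method per field name handles the overloaded cases.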

Answer 9 · 2014-01-17T14:42:32.000Z

@JeffBezanson Do you also refer to const behavior in modules? It would be useful to have a user type emulating a module and be able to tell the compiler when the result of a dynamic field lookup is in fact constant. (Another approach would be to start with an actual module and be able to "trap" a failed jl_get_global and inject new bindings on demand.)

I would find that to be very useful in combination with #5395. Then one would be able to intercept a call to an undefined function or method MyMod.newfunction(new signature) and generate bindings to a (possibly large) API on demand. These would then be cached as the usual const bindings, I guess.

Answer 10 · 2014-01-30T03:40:26.000Z

Let me, a simple Julia newbie, present a little concern: I think the possibility to overload the dot operator might imply that field access "purity" is somehow lost.

The user would generally lose the knowledge if doing a.b is just an access to a reference/value or if there can be a huge function machinery being called behind. I'm not sure how that could be bad though, it is just a feeling...

On the other hand, I see that indeed this is a big wish for syntax sugar for many cases (PyCall, Dataframes...), which is perfectly understandable.
Maybe it is time for .. #2614?

Answer 11 · 2014-01-30T03:47:05.000Z

I support doing this.

But the purity argument does have something to be said for it, even if one can use names(Foo) to figure out what the real components of Foo are.

The purity argument is closely related to the main practical concern I have, which is how one handles name conflicts when the fields of the type interfere with names you might hope to use. In DataFrames, I think we'd resolve this by banning the use of columns and colindex as column names, but wanted to know what people's plan was for this.

Answer 12 · 2014-01-30T04:23:19.000Z

I guess getfield(x::MyType, ::Field{:foo}) = ... would have to be forbidden when MyType has a field foo, otherwise the access to the real field would be lost (or a way to force access to the field would have to be available).
But then getfield could only be defined for concrete types, since abstract ones know nothing about fields.

(Meanwhile, I stumbled upon this about C++.)

Answer 13 · 2014-01-30T04:34:13.000Z

It's not a major problem. We can provide something like Core.getfield(x, :f) to force access to the real fields.
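A minimal sketch of the escape hatch mentioned above, together with the name-conflict concern raised for DataFrames. It assumes a later Julia version (0.7 or newer) in which a.b lowers to Base.getproperty(a, :b), which is not the Field{:name} design proposed in this thread. The Table type is hypothetical.

```julia
# Hypothetical table type whose dot access looks up columns by name.
struct Table
    columns::Dict{Symbol,Vector{Int}}
end

# Overload dot access: t.name returns the column called `name`.
function Base.getproperty(t::Table, name::Symbol)
    cols = getfield(t, :columns)   # bypass the overload to reach the real field
    return cols[name]
end

t = Table(Dict(:a => [1, 2, 3], :columns => [7, 8]))

col_a = t.a                      # column lookup
shadowed = t.columns             # also a column lookup: the field name is shadowed
real = getfield(t, :columns)     # the real field is still reachable
```

Once every name is routed to the column lookup, the implementation itself has to use getfield for its own fields, which is the crowding concern voiced in the next comment.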

Answer 14 · 2014-01-30T05:52:03.000Z

Ok, maybe I'm sold. But then defining a shortcut to Core.getfield(x, :f) (e.g., x..f) would be nice; otherwise the internal code of types that overload . for all symbols (DataFrames, probably dictionaries) would have to be crowded with Core.getfield calls.

Answer 15 · 2014-01-30T06:18:29.000Z

I'm not worried about the purity aspect - until we have this, the only code that should be using field access at all is code that belongs to the implementation of a given type. When field access is part of an API, you have to document it, as with any API. I agree that it might be handy to have some shortcut syntax for Core.getfield though, when writing those implementations.

Answer 16 · 2014-01-30T07:21:46.000Z

It had already been pointed out in #4935, but let's pull it to here: dot overloading can overlap a little with classical Julian multiple dispatch if not properly used, since we can start doing

getfield(x::MyType, ::Field{:size}) = .........
for i=1:y.size .....

instead of

size(x::MyType) = ..........
for i=1:size(y) ....

While the dot would be great to access items in collections (Dataframes, Dicts, PyObjects), it can somehow change the way object properties (not fields) are accessed.

Answer 17 · 2014-01-30T09:43:19.000Z

I think one thing to consider is that if you can overload accessing a field, you should also be able to overload setting a field. Otherwise this will be inconsistent and frustrating. Are you OK to go that far?

Answer 18 · 2014-01-30T14:56:13.000Z

@nalimilan, one absolutely needs a setfield! in addition to getfield. (Similar to setindex! vs. getindex for []). I don't think this is controversial.
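A minimal sketch of a paired getter and setter, assuming a later Julia version (0.7 or newer) where a.b reads through Base.getproperty and a.b = v writes through Base.setproperty!. Those hooks are not the getfield/setfield! overloads under discussion here, and the Circle type is hypothetical.

```julia
# Hypothetical type with one stored field and one synthetic property.
mutable struct Circle
    radius::Float64
end

# Reading: `diameter` is computed, everything else is a real field.
function Base.getproperty(c::Circle, name::Symbol)
    name === :diameter && return 2 * getfield(c, :radius)
    return getfield(c, name)
end

# Writing: assigning `diameter` updates the stored radius.
function Base.setproperty!(c::Circle, name::Symbol, value)
    if name === :diameter
        setfield!(c, :radius, convert(Float64, value) / 2)
    else
        setfield!(c, name, convert(fieldtype(Circle, name), value))
    end
    return value
end

c = Circle(1.0)
c.diameter = 10   # writes through the overloaded setter
```

After the assignment, c.radius reads back as half the assigned diameter, so reads and writes stay consistent.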

Answer 19 · 2014-01-30T15:40:17.000Z

Agree with @stevengj: DataFrames will definitely be implementing setfield! for columns.

Answer 20 · 2014-01-30T17:19:36.000Z

I support this.

Experience with other languages (e.g. C# and Python) does show that the dot syntax does have a lot of practical value. The way that it is implemented through specialized methods largely addresses the concern of performance regression.

It is, however, important to ensure that the inlineability of a method won't be seriously affected by this change. For example, something like f(x) = g(x.a) + h(x.b) won't become suddenly un-inlineable after this lands.

If we decide to make this happen, it would be useful to also provide macros that make defining properties easier, which might look like:

# let A be a type, and foo a property name
@property (a::A).foo = begin
    # compute and return the property value
end

# for simpler cases, this can be simplified to
@property (a::A).foo2 = (2 * a.foo)

# set property 
@setproperty (a::A).foo v::V begin
    # code for setting the property a.foo to the value v
end

Behind the scenes, all of these can be translated to method definitions.

Answer 21 · 2014-01-30T17:55:29.000Z

I'm not convinced that @property (a::A).foo = is all that much easier than getproperty(a::A, ::Field{:foo}) = ...

In any case, better syntactic sugar is something that can be added after the basic functionality lands.

Regarding inlining, as long as the field access is inlined before the decision is made whether to inline the surrounding function, I don't see why it would be impacted. But maybe this is not the order in which inlining is currently done?

Answer 22 · 2014-01-30T18:00:02.000Z

getproperty(a::A, ::Field{:foo}) = strikes me as having too many colons :-) I agree that this is a minor thing, and probably we don't need to worry about that right now.

My concern is whether this would cause performance regression. I am not very clear about the internal code generation mechanism. @JeffBezanson may probably say something about this?

Answer 23 · 2014-01-30T18:13:35.000Z

Field access is very low-level, so I won't do this without making sure performance is preserved.

Answer 24 · 2014-03-26T13:02:56.000Z

After all I'm not convinced overloading fields is a good idea. With this proposal, there would always be two ways of setting a property: x.property = value and property!(x, value). If field overloading is implemented, we'll need a very strong style guide to avoid ending in a total mess where you never know in advance which solution the author has chosen for a given type.

And then there would be the question of whether fields are public or private. Not allowing field overloading would make the type system clearer: fields would always be private. Methods would be public, and types would be able to declare they implement interfaces/protocol/traits, i.e. that they provide a given set of methods. This would go against @stevengj's #1974 (comment) about overloading fields with methods to avoid breaking an API: only offer methods as part of the API, and never fields.

The only place where I would regret the lack of field overloading is DataFrames, since df[:a] is not as nice as df.a. But that alone doesn't sound like it should require such a major change. The other use case seems to be PyCall, which may indicate that field overloading should be allowed, but only for highly specific, non-Julian use cases. But how to prevent people from misusing a feature once it's available? Hide it in a special module?

Answer 25 · 2014-03-26T15:07:22.000Z

@nalimilan, I would say that the preference should be to use x.property syntax as much as possible. The thing is that people really like this syntax – it is very pleasant. Taking such a nice syntax and specifically saying that it should only ever be used for internal access to objects seems downright perverse – "hah, this nice syntax exists; don't use it!" It seems much more reasonable to make the syntax to access private things less convenient and pretty instead of forcing APIs to use the uglier syntax. Perhaps this is a good use case for the .. operator: the private real field access operator.

I actually think that this change can make things clearer and more consistent rather than less so. Consider ranges – currently there's a sort of hideous mix of step(r) versus r.step styles out there right now. Especially since I introduced FloatRange this is dangerous because only code that uses step(r) is correct. The reason for the mix is that some properties of ranges are stored and some are computed – but those have changed over time and are in fact different for different types of ranges. It would be better style if every access was of the step(r) style except the definition of step(r) itself. But there are some steep psychological barriers against that. If we make r.step a method call that defaults to r..step, then people can just do what they're naturally inclined to do.
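A small sketch of why step(r) is the portable spelling, using integer ranges in a later Julia version (hasfield needs Julia 1.2 or newer). The helper total_span is hypothetical.

```julia
r1 = 1:10     # UnitRange: the step is implied by the type, not stored
r2 = 1:2:9    # StepRange: the step is stored as a field

# Only one of the two range types actually has a `step` field.
unit_has_step = hasfield(typeof(r1), :step)
step_has_step = hasfield(typeof(r2), :step)

# Generic code written against the accessor works for both kinds of range.
total_span(r) = step(r) * (length(r) - 1)

span1 = total_span(r1)
span2 = total_span(r2)
```

Code that reads the field directly would work for r2 only, which is the inconsistency described above.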

To play devil's advocate (with myself), should we write r.length or length(r)? Inconsistency between generic functions and methods is a problem that has afflicted Python, while Ruby committed fully to the r.length style.

Answer 26 · 2014-03-26T15:11:42.000Z

+1 for .. as Core.getfield!

Answer 27 · 2014-03-26T15:23:25.000Z

@StefanKarpinski Makes sense, but then you'll need to add syntax for private fields, and interfaces will have to specify both methods and public fields. And indeed you need a style guide to ensure some consistency; the case of length is a difficult one, but then there is also e.g. size, which is very similar but needs a dimension index. This decision opens a can of worms...

In that case, I also support .. to access actual fields, and . to access fields, be they methods or real values.

Answer 28 · 2014-03-26T15:28:43.000Z

"To play devil's advocate (with myself), should we write r.length or length(r)? Inconsistency between generic functions and methods is a problem that has afflicted Python, while Ruby committed fully to the r.length style."

The key factor that may be disambiguating for this issue is whether you want to be able to use something as a higher order function or not. I.e. the f in f(x) is something you can map over a collection, whereas the f in x.f is not (short of writing x -> x.f) – which is the same situation for all methods in single-dispatch languages.
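A minimal sketch of the higher-order-function distinction. The Span type and the width function are hypothetical.

```julia
# Hypothetical type with two stored fields.
struct Span
    start::Int
    stop::Int
end

# A generic function is a first-class value and can be passed around directly.
width(s::Span) = s.stop - s.start

spans = [Span(1, 4), Span(2, 10)]

widths = map(width, spans)          # the function itself is the argument
starts = map(s -> s.start, spans)   # field access needs an anonymous wrapper
```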

Answer 29 · 2014-03-26T15:47:21.000Z

Why stop at field access? What about having x.foo(args...) equivalent to getfield(x::MyType, ::Field{:foo}, args...) = ... ? Then we could have x.size(1) for size along first dimension. (not sure whether I'm fond of my suggestion, but maybe something to consider. Or probably not, as people will just write OO look-alike code?)

Answer 30 · 2014-03-26T15:50:17.000Z

That would be possible with this functionality. Which is one of the things that gives me pause. I don't have a problem with o.o. style code like that – as I said, it's fairly pleasant and people really like it – but it does introduce enough choice in ways to write things that we really need a strong policy about what you should do since you'll be very free with what you can do.

Answer 31 · 2014-03-26T17:41:02.000Z

When I started to learn Julia, the no-dot syntax helped me a lot to mentally let go of OO-programming style. So for that reason alone, I think that my suggestion is bad.

Also, for simple overloading (i.e. just a.b sans (args...)), I agree with @nalimilan's comment above. In issue #4935 the consensus seems to be that fields should not be part of the API but only methods; consequently it seems that that issue will be closed. Having the .-overloading syntax will make it much less clear that normal fields should not be part of the API, and will probably encourage making fields part of the API.

Answer 32 · 2014-03-26T17:52:06.000Z

But yes, the . syntax is convenient...

How about: the single . should only be syntactic sugar for getfield(x::MyType, ::Field{:name}) = ... and field access is only through .. (i.e. what . is now).

This would allow a clear distinction:

the . is for public API to access value-like things of type-instances
the .. is for field access and should generally not be used in the public API

Of course, this would be a breaking change.

Answer 33 · 2014-03-26T17:53:36.000Z

That's basically what I was proposing, except that . defaults to .. so it's not breaking.

Answer 34 · 2014-03-26T18:24:57.000Z

Sorry, I should have re-read!

But I think the . not defaulting to .. might actually be nice (apart from being breaking), as it would force the developer to decide what is public API and what is not. Also, if the user uses .. then they can expect that their code may break, whereas with . it should not.

Answer 35 · 2014-03-26T18:31:25.000Z

That's a good point. We can go that route by having a.b default to a..b with a deprecation warning.

Answer 36 · 2014-03-26T18:47:35.000Z

From a style perspective, I think I'd much prefer to see

a = [1:10]
a.length()
a.size()

than

a.length
a.size

I think it helps preserve the idea that a function is being called instead of just a property being retrieved that is somehow stored in the type (back to the "purity" concern above). I wonder if there's a way to help ensure this kind of style so things don't get as messy as it is in some other languages.

Answer 37 · 2014-03-26T18:50:47.000Z

I don't really like

a.length()

since then I can't tell if there was a function field in the original type. If . never accesses fields, that's obviously not an issue. Otherwise, it seems confusing to me.

Answer 38 · 2014-03-26T19:16:34.000Z

A priori, I feel that we shouldn't do either a.length() or a.length. But the question is why? What makes r.step different from r.length? Is it different? If they're not different, should we use step(r) and length(r) or r.step and r.length?

Answer 39 · 2014-03-26T19:18:46.000Z

With the semantics suggested by Stefan and the addition by me it would be clear that . always is a function call (just like + too), whereas .. is always a field access.

On the issue of whether a.length, etc. is a good idea: how about . access should only be used to access actual data in the type, more or less as one would use the entries of a dict, whereas we stick with functions for the non-data properties like size, length, step etc.? Some of them will need extra parameters and, I think, the a.size(1) type of syntax is bad.

Answer 40 · 2014-03-26T19:51:15.000Z

Here is my take on this topic:

The dot syntax should only be used for attributes of a type/class. Please keep in mind that this is not only about getters but also setters and something like a.property() = ... feels completely wrong.
While I kind of like the current situation where functions define the public API and fields are private, I share Stefan's opinion that the dot syntax is too nice to be forbidden for public APIs. But please let's restrict this to simple attributes. a.length is a good example; a.size(1) is not, because it requires an additional argument.
Please let . default to the .. operator. Julia is not known to be a boilerplate language. Let's keep it that way.

Answer 41 · 2014-03-26T20:02:16.000Z

"Please let . default to the .. operator. Julia is not known to be a boilerplate language. Let's keep it that way."

I do tend to agree with this. The syntax for setting even a synthetic property would just be a.property = b, not a.property() = b.

Answer 42 · 2014-03-26T20:04:30.000Z

Sure, I just wanted to make clear why a.property() as a syntax is IMHO not nice

Answer 43 · 2014-03-26T20:11:11.000Z

Or more clearly: the important thing about the dot syntax is not that one can associate functions with types/classes, but that it's the ability to write getters/setters in a nice way. And getters/setters are important for data encapsulation (keep the interface stable but change the implementation).

Answer 44 · 2014-03-26T20:11:54.000Z

This change would be great from an API designer's perspective, but I agree that it should come with some sort of style guide to limit any future inconsistency.

This would enable Ruby-like DSLs...

amt = 1.dollar + 2.dollars + 3.dollars.20.cents

But be prepared for Java-like madness:

object.property1.property2.property3 ....

Answer 45 · 2014-03-26T20:13:53.000Z

Just a few thoughts:

I most want the . syntax for Dicts with Symbols as keys. It's just nicer to use d.key than d[:key]. But in the end it's not critical.
I think that a->property reads better than a..property. But again it is not that big a deal and I don't know if it would work with julia syntax.

Answer 46 · 2014-03-26T20:20:42.000Z

@BobPortmann I disagree. A dictionary is a container object, the API for container objects is obj[index] or obj[key]. Right now because we don't have properties in Julia, the container API is overloaded to provide this functionality in libraries like PyCall and in OpenCL. This change helps to strengthen the distinction of the container API as it will not be overloaded to provide additional functionality.

Answer 47 · 2014-03-26T20:27:20.000Z

Using a->property for private fields would be a good way to keep C hackers away from Julia ;-)

I kind of like the .. syntax.

Answer 48 · 2014-03-26T20:28:12.000Z

The a->property syntax is already spoken for – that's an anonymous function. The a..b operator has been up for grabs for a while, however. There are some cases where you want something that's dict-like but has lots of optional fields. Using getter/setter syntax for that would be nicer than dict indexing syntax.

Answer 49 · 2014-03-26T20:36:34.000Z

"The a->property syntax is already spoken for – that's an anonymous function."

Yes, of course. It didn't look like it without spaces around the ->.

Answer 50 · 2014-03-27T01:29:57.000Z

As a style guideline, how about recommending that property(x) be used for read-only properties and that x.property be used for read/write properties?

For writable properties, x.foo = bar is really much nicer than set_foo!(x, bar).

Answer 51 · 2014-03-27T08:45:08.000Z

Having foo(x) for reading and x.foo for writing is quite confusing. Actually, this is what makes properties so appealing: having the same syntax for read and write access, i.e. the simplest syntax one can get (for getters and setters).

Regarding style, there is the big question of whether we want to have both x.length and length(x) if this feature gets implemented, or whether the latter form should be deprecated and removed.

My opinion is that we should only have one way of doing it and only use x.length in the future. And regarding style I think it's quite simple. Everything that is a simple property of a type should be implemented using the field syntax. Everything else with functions. I have used properties in C# a lot and rarely found a case where I was unsure whether something should be a property or not.

Answer 52 · 2014-03-27T14:27:34.000Z

I'm against changing a randomly-chosen set of 1-argument functions to x.f syntax. I think @mauro3 made a good point that doing this obscures the nature of the language.

a.b is, at least visually, kind of a scoping construct. The b need not be a globally-visible identifier. This is a crucial difference. For example, matrix factorizations with an upper part have a .U property, but this is not really a generic thing --- we don't want a global function U. Of course this is a bit subjective, especially since you can easily define U(x) = x.U. But length is a different kind of thing. It is more useful for it to be first class (e.g. map(length, lst)).

Answer 53 · 2014-03-27T14:49:03.000Z

Here are the guidelines I would suggest. The foo.bar notation is appropriate when:

1. foo actually has a field named bar. Example: (1:10).start.
2. foo is an instance of a group of related types, some of which actually have a field named bar; even if foo doesn't actually have a bar field, the value of that field is implied by its type. Examples: (1:10).step, (0.1:0.1:0.3).step.
3. Although foo doesn't explicitly store bar, it stores equivalent information in a more compact or efficient form that is less convenient to use. Example: lufact(rand(5,5)).U.
4. You are emulating an API from another language like Python or Java.

It may make sense for the bar property to be assignable in cases 1 and 3 but not 2. In case 2, since you cannot change the type of a value, you cannot mutate the bar property that is implied by that type. In such cases, you probably want to disallow mutation of the bar property of the other related types, either by making them immutable or by explicitly making foo.bar = baz an error.
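A minimal sketch of the third guideline (equivalent information stored in a compact form), assuming a later Julia version (0.7 or newer) where a.b lowers to Base.getproperty. The PackedLU type is hypothetical and the numbers are arbitrary: no factorization is computed, the snippet only shows two triangular views being derived from one stored matrix.

```julia
# Hypothetical container that packs both triangles into a single matrix.
struct PackedLU
    factors::Matrix{Float64}
end

function Base.getproperty(F::PackedLU, name::Symbol)
    A = getfield(F, :factors)
    n = size(A, 1)
    if name === :U
        # upper triangle, including the diagonal
        return [i <= j ? A[i, j] : 0.0 for i in 1:n, j in 1:n]
    elseif name === :L
        # strict lower triangle with an implied unit diagonal
        return [i > j ? A[i, j] : (i == j ? 1.0 : 0.0) for i in 1:n, j in 1:n]
    end
    return getfield(F, name)
end

F = PackedLU([4.0 3.0; 0.5 2.0])
upper = F.U
lower = F.L
```

Because PackedLU is immutable, an assignment such as F.U = ... raises an error, so the derived properties stay read-only.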

Answer 54 · 2014-03-27T15:55:19.000Z

@tknopp, I wasn't suggesting using x.foo for writing and foo(x) for reading. My suggestion was that if a property is both readable and writable, then probably you want to both read and write it with x.foo.

Answer 55 · 2014-03-27T15:58:41.000Z

@StefanKarpinski: But isn't length a case of 3, where the sizes are what's usually stored and length is the product of the sizes?

I see Jeff's point though that this change would make these functions not first class anymore.

@stevengj: I see. Sorry for confusing that.

Answer 56 · 2014-03-27T16:14:06.000Z

@tknopp – the length is derived from the sizes, but not equivalent to them. If you know the sizes you can compute the length but not vice versa. Of course, this is a bit of a blurry line. The main reason this is acceptable for lufact is that we haven't figured out a better API than that. Another approach would be to define upper and lower generic functions that give the upper-triangular and lower-triangular parts of general matrices. However, this approach doesn't generalize to QR factorizations, for example.

Answer 57 · 2014-03-27T16:22:50.000Z

It's telling that there are only a few cases that really seem to ask for this syntax: pycall, factorizations, and maybe dataframes.
I'm quite worried about ending up with a random jumble of f(x) vs. x.f; it would make the system much harder to learn.

Answer 58 · 2014-03-27T16:28:46.000Z

Doesn't point 1 of @StefanKarpinski's list mean that any field of a type automatically belongs to public API?

At the moment I can tell what is the public API of a module: all exported functions and types (but not their fields). After this change, it would not be possible to tell which fields are supposed to belong to the public API and which are not. We could start naming private fields a._foo or so, like in Python, but that seems not so nice.

Answer 59 · 2014-03-27T16:33:24.000Z

Personally I think the DataFrames case is a little superfluous. If we do this, I'll add the functionality to DataFrames, but I find the loss of consistency much more troubling than saving a few characters.

Answer 60 · 2014-03-27T16:37:33.000Z

I would also not make the decision dependent on DataFrames, PyCall (and Gtk wants it also). Either we want it because we think that fields should be part of a public interface (because it "looks nice") or we don't want it.

Answer 61 · 2014-03-27T16:41:54.000Z

... pycall ...

and JavaCall

Answer 62 · 2014-03-27T16:46:38.000Z

Since the main use case for this seems to be interactions with non-Julia systems, what about using the proposed .. operator instead of overloading .?

Answer 63 · 2014-03-27T16:54:49.000Z

I wonder if a simpler solution here is a more general hat-tip to OO:

#we already do
A[b] => getindex(A,b)
#we could have
A.b(args...) => b(A, args...)
# while
A..b => getfield(A,::Field{:b})
# with default
getfield(A, ::Field{:b}) = getfield(A, :b)

It seems like this would allow JavaCall/PyCall to do method definitions "in" classes, while also allowing a general style if people want to have some OO type code, though it's very transparent A.b() is just a rewrite. I think this would be very natural for people coming from OO.
There would also be the new getfield with A..b to allow overloading there, though overloading here is strongly discouraged and only to be used for field-like properties (I suspect it wouldn't be used very widely due to the slight scariness of overloading getfield(A, ::Field{:field})).

Answer 64 · 2014-03-27T17:20:31.000Z

@mauro3:

"Doesn't point 1 of @StefanKarpinski's list mean that any field of a type automatically belongs to public API?"

That was a list of when it's ok to use foo.bar notation, not when it's necessary. You can disable the foo.bar notation for "private" fields, which would then only be accessible via foo..bar.

@karbarcca: I'm not super clear on what you're proposing here.

Answer 65 · 2014-03-27T17:25:56.000Z

fwiw, I'm a fan of taking the consenting-adults-by-convention approach and making . fully overloadable. I think the double-dot proposal would lead to more confusion rather than less.

Answer 66 · 2014-03-27T17:31:12.000Z

@ihnorton – as in you're against using a..b as the (unoverloadable) core syntax for field access or against using a..b for the overloadable syntax?

Answer 67 · 2014-03-27T17:39:16.000Z

One of julia's best features is its simplicity. Overloading x.y feels like the first step on the road to C++.

Answer 68 · 2014-03-27T17:44:26.000Z

@StefanKarpinski but then this would mean quite a shift in paradigm from default private fields to default public fields.

A realization I just had, probably this was clear to others all along. Full OO-style programming can be done with the basic .-overloading (albeit it's ugly). Defining

getfield(x::MyType, ::Field{:foo}) = (args...) -> foofun(x, args...) # a method, i.e. returns a function
getfield(x::MyType, ::Field{:bar}) = x..bar+2                  # field access, i.e. returns a value

then x.foo(a,b) and x.bar work. So the discussion on whether x.size(1) should be implemented or only x.size is moot.
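A runnable sketch of the closure trick above, assuming a later Julia version (0.7 or newer) where a.b lowers to Base.getproperty rather than the Field{:name} methods written in this comment. The Account type and the deposit function are hypothetical.

```julia
# Hypothetical immutable type with one stored field.
struct Account
    balance::Int
end

# An ordinary generic function taking the object as its first argument.
deposit(a::Account, amount) = Account(getfield(a, :balance) + amount)

function Base.getproperty(a::Account, name::Symbol)
    # "Method" access: return a closure that captures the object.
    name === :deposit && return (args...) -> deposit(a, args...)
    # Plain field access for everything else.
    return getfield(a, name)
end

acct = Account(10)
updated = acct.deposit(5)    # parsed as (acct.deposit)(5)
bound = acct.deposit         # a callable with `acct` already bound
```

Here acct.deposit on its own is a callable object, so the call form and the property form stay consistent with each other.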

Answer 69 · 2014-03-27T17:48:16.000Z

@StefanKarpinski against generally overloadable a..b and lukewarm about a..b -> Core.getfield(a,b).

Answer 70 · 2014-03-27T17:51:05.000Z

I do start to see the need for another operator here, but a..b is not quite convincing. Needing two characters feels very... second class. Maybe a@b, a$b, or a|b (bitwise operators are just not used that often). An outside possibility is also a`b, which the parser could probably distinguish from commands.

I'd be ok with using the "ugly" operator for primitive field access. I think experience has shown that since it is a concrete operation it is rarely used, and indeed somewhat dangerous to use.

Answer 71 · 2014-03-27T17:51:39.000Z

I'm suggesting allowing simulating OO single dispatch by the convention/rewriting:

type Type end
# I can define methods with my Type as 1st argument
method(T, args...) = # method body
t = Type()
# then I can call that method, exactly like Java/Python methods, via:
t.method(args...)
# so
t.method(args...) 
# is just a rewrite to
method(t, args...)

The justification here is we already do similar syntax rewrites for getindex/setindex!, so let's allow full OO syntax with this. That way, PyCall and JavaCall don't have to do

my_dna[:find]("ACT")
# they can do
my_dna.find("ACT")
# by defining the appropriate find( ::PyObject, args...) method when importing modules from Python/Java

I like this because it's a fairly clear transformation, just like getindex/setindex, but allows simulating a single dispatch OO system if desired, particularly for OO language packages.

I was then suggesting the use of the .. operator for field access, with the option to overload. The use here would be allowing PyCall/JavaCall to simulate field access by overloading calls to .., allowing DataFrames to overload .. for column access, etc. This would also be the new default field access in general for any type.

Answer 72 · 2014-03-27T17:56:29.000Z

I do have a soft spot for pure syntax rewrites. It's arguably a bad thing that you can write a.f(x) right now and have it work but mean something confusingly different than most OO languages.

Of course the other side of that coin is horrible style fragmentation, and the fact that a.f has nothing in common with a.f(), causing the illusion to break down quickly.

Answer 73 · 2014-03-27T17:58:24.000Z

"One of julia's best features is its simplicity. Overloading x.y feels like the first step on the road to C++."

Same feeling here. I was considering, if the actual need for this is really for a limited number of interop types, what about only making it valid if explicitly asked in the type declaration? E.g. an additional keyword besides type and immutable could be ootype or something.

Answer 74 · 2014-03-27T18:01:13.000Z

"and the fact that a.f has nothing in common with a.f(), causing the illusion to break down quickly."

Can you clarify what this means @JeffBezanson?

Answer 75 · 2014-03-27T18:02:02.000Z

I'd expect that a.f is some kind of method object if a.f() works.

Answer 76 · 2014-03-27T18:03:07.000Z

Ah, got it. Yeah, you definitely wouldn't be able to do something like map(t.method,collection).

Answer 77 · 2014-03-27T18:04:28.000Z

I'm going to agree with @mauro3 that by allowing obj.method(...), there is a risk that new users may just see julia as another object-oriented language trying to compete with python, ruby etc., and not fully appreciate the awesomeness that is multiple-dispatch. The other risk is that standard oo style then becomes predominant, as this is what users are more familiar with, as opposed to the more julian style developed so far.

Since the use case, other than DataFrames, is restricted to inter-op with oo languages, could this just all be handled by macros? i.e. @oo obj.method(a) becomes method(obj,a)?
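A rough sketch of what such a macro could look like. The @oo name comes from the comment above; the implementation is an assumption and only handles the single pattern obj.method(args...) at the top of the expression.

```julia
# Sketch of a rewriting macro: obj.method(args...) becomes method(obj, args...).
macro oo(ex)
    is_call = ex isa Expr && ex.head === :call
    if is_call && ex.args[1] isa Expr && ex.args[1].head === Symbol(".")
        obj = ex.args[1].args[1]            # the receiver expression
        meth = ex.args[1].args[2].value     # QuoteNode -> Symbol naming the function
        return esc(Expr(:call, meth, obj, ex.args[2:end]...))
    end
    error("@oo expects an expression of the form obj.method(args...)")
end

x = [3, 1, 2]
n = @oo x.length()    # rewritten to length(x)
@oo x.push!(4)        # rewritten to push!(x, 4)
```

The rewrite is purely syntactic, so the method name has to be a function visible at the call site.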

Answer 78 · 2014-03-27T18:15:32.000Z

@karbarcca this would mean that automatically everything could be written in two ways:

x = 3
x.sin()
sin(x)
x + 2
x.+(2) # ?!

Answer 79 · 2014-03-27T18:23:31.000Z

@karbarcca #1974 (comment)

t.method(args...)

is just a rewrite to

method(t, args...)

That would not be necessary for PyCall, since the overloadable dot could simply map pyobj.func to pyobj[:func]. Then pyobj.func() would in fact be (pyobj.func)().

Answer 80 · 2014-03-27T18:26:08.000Z

Rewriting a.foo(x) as foo(a, x) would not solve the problem for PyCall, because foo isn't and cannot be a Julia method, it is something I need to look up dynamically at runtime. I need to rewrite a.foo(x) as getfield(a, Field{:foo})(x) or similar [or possibly as getfield(a, Field{:foo}, x)] so that my getfield{S}(::PyObject, ::Type{Field{S}}) can do the right thing.
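The runtime lookup described here can be sketched as follows. This is not PyCall code: Field, DynObject and dotlookup are hypothetical stand-ins, and the where syntax is the present-day spelling of getfield{S}(...):

```julia
# Hypothetical tag type, mirroring the Field{:name} idea from this thread.
struct Field{S} end

# Hypothetical stand-in for a foreign object whose members are only known at runtime.
struct DynObject
    members::Dict{Symbol,Any}
end

# Hypothetical stand-in for the proposed overloadable getfield:
# the member name arrives as a type parameter and is looked up dynamically.
dotlookup(o::DynObject, ::Type{Field{S}}) where {S} = o.members[S]

a = DynObject(Dict{Symbol,Any}(:foo => x -> 2x))

# What a.foo(3) would lower to under the proposal: look up first, then call.
result = dotlookup(a, Field{:foo})(3)
```

The key difference from the foo(a, x) rewrite is that foo never has to exist as a Julia function; only the lookup is a Julia method.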

Answer 81 · 2014-03-27T18:32:15.000Z

@JeffBezanson #1974 (comment)

I do start to see the need for another operator here, but a..b is not quite convincing. Needing two characters feels very... second class

I would say that, on the other hand, .. is typed much more quickly than $, @ or | as no shift key needs to be pressed, and while being two characters the finger stays on the same key 😄

Answer 82 · 2014-03-27T18:36:37.000Z

@stevengj Ah, I see. But my point still stands, that the rewriting could be done with a macro.

Answer 83 · 2014-03-27T19:07:20.000Z

For JavaCall, I actually only need essentially an unknownProperty handler. I don't actually need to rewrite or intercept existing property reads or writes. So would a rule that "a.x gets re-written to getfield(a, :x) only when x is not an existing property" help keep things sane?
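A minimal sketch of that rule, assuming a hypothetical JProxy wrapper, a hypothetical unknownproperty handler, and a plain function dot standing in for what a.x would lower to (none of these are JavaCall APIs):

```julia
# Hypothetical proxy type standing in for a wrapped foreign object.
struct JProxy
    classname::String
end

# Hypothetical handler, called only for names that are not real fields.
unknownproperty(p::JProxy, name::Symbol) = string(p.classname, "#", name)

# Sketch of the proposed rule: real fields win, everything else goes to the handler.
function dot(a, name::Symbol)
    if name in fieldnames(typeof(a))
        return getfield(a, name)
    end
    return unknownproperty(a, name)
end

p = JProxy("java.util.ArrayList")
existing = dot(p, :classname)   # real field, read directly
missing_name = dot(p, :size)    # not a field, routed to the handler
```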

Answer 84 · 2014-03-27T19:08:50.000Z

@simonbyrne, requiring a macro would defeat the desire for clean and transparent interlanguage calling. Also, it would be hard to make it work reliably. For example, suppose that you have a type Foo; p::PyObject; end, and for an object foo::Foo you want to do foo.p.bar where bar is a Python property lookup. It's hard to imagine a macro that could reliably distinguish the meanings of the two dots in foo.p.bar.
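The difficulty can be seen by inspecting what a macro would actually receive. A small sketch in present-day Julia (foo and p are just names inside a quoted expression here and need not exist):

```julia
# Hypothetical expression; foo and p need not exist for this inspection.
ex = :(foo.p.bar)

outer_is_dot = Meta.isexpr(ex, :.)      # the .bar access
inner = ex.args[1]                      # the expression foo.p
inner_is_dot = Meta.isexpr(inner, :.)   # the .p access

# Both accesses have the same syntactic shape; nothing in the expression
# records that foo is a Julia object while foo.p is a Python object.
```

Since a macro sees only this syntax and no types, it would have to guess which dot is a real field access and which is a foreign lookup.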

Honestly, I don't see the big deal with style. High-quality packages will imitate the style of Base and other packages where possible, and some people will write weird code no matter what we do. If we put dot overloading in a later section of the manual, and recommend its use only in a few carefully selected cases (e.g. inter-language interoperability, read/write properties, maybe for avoiding namespace pollution for things like factor.U, and in general as a cleaner alternative to foo[:bar]), then I don't think we'll be overrun with packages using dot for everything. The main thing is to decide what we will use and recommend this for, and probably we should keep the list of recommended uses very short and only extend it as real-world needs arise.

We're not adding super-easy OO-like syntax like type Foo; bar(...) = ....; end for foo.bar(...), so that will limit temptation for newbies too.

Answer 85 · 2014-03-27T19:55:08.000Z

I'm basically in full agreement with @stevengj here. I like a..b for real field access because it

looks similar to a.b
is less convenient, as it should be
is only slightly less convenient
has no existing meaning and we haven't found any compelling use for it in over a year
isn't horrifically weird like ab`

Answer 86 · 2014-03-27T20:03:33.000Z

With this change and possibly (#2403), will nearly all of Julia's syntax be overloadable? (The ternary operator is the only exception I can think of.) That almost all syntax is lowered to overloadable method dispatch seems to be a strongly unifying feature to me.

Answer 87 · 2014-03-27T20:08:26.000Z

I agree that it's actually kind of a simplification. The ternary operator and && and || are really control flow, so that's kind of different. Of course that kind of argues against making a..b the real field access since then that would be the only non-overloadable syntax. But I still think it's a good idea. Consistency is good but not paramount for its own sake.

Answer 88 · 2014-03-27T20:09:45.000Z

Oh, there's also function call which is not overloadable. So basic I forgot about it.

Answer 89 · 2014-03-27T20:10:34.000Z

That is what issue #2403 addresses.

Answer 90 · 2014-03-27T20:25:16.000Z

Yep. But this is a lot closer to happening than that is.

Answer 91 · 2014-03-27T20:38:14.000Z

The only fly in the ointment for me here is that it would be really nice to use the real field access operator for modules, but that probably won't happen since nobody wants to write Package..foo.

Tab-completing after dots gets a bit ugly; technically you have to check what method x. might call to see if it's appropriate to list object field names or module names. And I hope nobody tries to define getfield(::Module, ...).

Answer 92 · 2014-03-27T20:55:12.000Z

I think that tab completing can be done like this: foo.<tab> lists the "public fields" and foo..<tab> lists the "private fields". For modules, would it be ok to just allow the default implementation of Mod.foo be Mod..foo and just tell people not to add getfield methods to Module? I mean, you can already redefine integer addition in the language – all hell breaks loose and you get a segfault but we don't try to prevent it. This can't be worse than that, can it?

Answer 93 · 2014-03-27T21:09:03.000Z

It is in fact slightly worse than that, because a programming language really only cares about naming. Resolving names is much more important than adding integers.

We don't have much choice but to have Mod.foo default to Mod..foo, but we'll probably have to use Mod..foo for bootstrapping in some places. The .. operator is extremely helpful here, since without it you can't even call Core.getfield in order to define the fallback. With it, we'd probably just remove Core.getfield and only have the .. operator.

Answer 94 · 2014-03-27T22:25:34.000Z

That's a fair point – naming is kind of a big deal in programming :-). Seems like a good way to go – only .. and no Core.getfield.

Answer 95 · 2014-03-28T06:15:48.000Z

These two ideas,

[...] put dot overloading in a later section of the manual, and recommend its use only in a few carefully selected cases @stevengj #1974 (comment)

and

[...] the preference should be to use x.property syntax as much as possible @StefanKarpinski #1974 (comment)

are clearly opposed.

I think that if the first idea is to be chosen then just creating a new .. operator for those "carefully selected cases" makes more sense.
As an advantage, using ..name for cases where currently [:name] is used (DataFrames, Dict{Symbol, ...}) would be more typing/syntax friendly while clearly stating that something different from field access was happening. Moreover, the double dot in ..name could be seen as a rotated colon, a hint to the symbol syntax :name, and also there would be no problem with tab completions.
As a disadvantage, the uses in PyCall et al. would not be so close to the original syntaxes (and could even be confusing for the cases when the . really must be used). But let's be honest, Julia will never be fully Python syntax compatible, and there will always be cases where one has to type a lot in Julia with PyCall to perform otherwise simple instructions in Python. The .. to emulate . could give a good balance here. (Please don't get me wrong, I really like PyCall and think it is a critical feature which deserves special care.)

The second idea, which I currently prefer, raises the big question of when property(x) or x.property must be used, which requires an elegant, well-thought-out, and clear definition, if such a thing exists...
It seems that if people want an overloadable . that's because they prefer the x.property API style in the first place though.
Anyway, I would prefer to see . not as an overloadable field access operator but as an overloadable "property" access operator (getprop(a, Field{:foo}) maybe?) which defaults to a non-overloadable field operator (..).
Other decisions would also have to be taken, e.g., which will be used in concrete implementation code for field access, .. or .? For example, for the Ranges step example, which will be idiomatic? step(r::Range1) = one(r..start) or step(r::Range1) = one(r.start)? (not to mention the question whether step must be a method or a property).
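The "property operator that defaults to field access" idea can be sketched like this. getprop and Field follow the names floated above but are hypothetical, Temperature is an invented example type, and Core's getfield stands in for the non-overloadable .. operator:

```julia
# Hypothetical tag type carrying the property name as a type parameter.
struct Field{S} end

# Fallback: a property is just the real field (what a..name would mean).
getprop(x, ::Type{Field{S}}) where {S} = getfield(x, S)

# Invented example type with one stored field.
struct Temperature
    kelvin::Float64
end

# A computed property layered on top, without touching real field access.
getprop(t::Temperature, ::Type{Field{:celsius}}) = getfield(t, :kelvin) - 273.15

t = Temperature(300.0)
stored = getprop(t, Field{:kelvin})      # falls through to the real field
computed = getprop(t, Field{:celsius})   # served by the overload
```

Because the overload itself uses the non-overloadable access, it cannot recurse into getprop, which is the role .. would play in implementation code.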

Answer 96 · 2014-03-28T15:10:19.000Z

That's why I backed off of that angle and proposed these criteria: #1974 (comment).

Answer 97 · 2014-05-19T09:27:58.000Z

Just one thought that popped into my head while reading this interesting thread. Export could be used to declare public fields, while all fields are visible inside the defining module, e.g.:

module Foo
   type Person
     name
     age
   end
   export Person, Person.name
   @property Person :age(person) = person..age + 1
end

In this situation the exported Person still appears to have the fields name and age, except that age is read-only and served by a function that adds one. Exporting all of Person might be done as export Person.* or similar.

[pao: quotes]

Answer 98 · 2014-05-19T13:30:21.000Z

@emeseles Please be careful to use backticks to quote things that are like Julia code--this ensures formatting is maintained, and prevents Julia's macros from creating GitHub notifications for similarly-named users.

Answer 99 · 2014-08-09T05:22:45.000Z

. and .. are confusing: a clear, easy-to-remember syntax is a good thing