Handling Class Data?

Question

Closed this issue 3 years ago · 12 comments

I've been chatting back and forth with Damian Conway about class data. I've been concerned about a few people insisting upon overloading the meaning of my for class data, but Damian's finally helped me to nail down the issue. (To be fair, those people are insisting that they're not overloading my's meaning.)

When I teach OO, I stress that objects are experts. You trust them. You must always be able to trust them. When you consult the expert in their area of expertise, you need to know you can rely on them. Of course, it doesn't always work that way in the real world, but in software, unreliable software is taboo.

So when I load a class and I call a method, I damn well expect that method will return the correct results. And so long as that class is loaded, it shouldn't matter in which phase of the program I ask it for results.

Using my (or state) for class data breaks those expectations. Here's an example using Object::Pad (a tip 'o the keyboard to .

#!/usr/bin/env perl

use v5.26.0;
use Object::Pad;

class Universe {
    my $classdata = 3.1415927;
    sub pi { return $classdata // '<undefined>'; }
}

BEGIN { say +Universe->pi(); }    # <undefined> (the big bang)
CHECK { say +Universe->pi(); }    # <undefined> (inflation)
INIT  { say +Universe->pi(); }    # <undefined> (Earth is formed)
say         +Universe->pi();      # π (today)
END   { say +Universe->pi(); }    # π (the big crunch)

How do we fix that? Well, we could hack Perl to say "if we're in a Corinna class, we're going to change the rules about initialization time for my variables", or we could always detect my variables in classes and try to rewrite them to my $classdata; BEGIN { $classdata = ... }, but these ideas are terrible. If your choices are:

Let's go ahead and throw away the guarantees that classes should provide us with
Or we hack the Perl language to sometimes change the internal behavior of a well-known and core built-in ...

Then you've messed up your choices somewhere.

Worse, using my or state variables for class data either means:

You cannot use any standard slot attributes on them
Or you must teach every developer when they can and cannot use slot attributes on them

So we're forever in the case of writing extra methods (as above) if we find we need to expose these to the outside world. Corinna was supposed to avoid this.

So ...

Classes must absolutely be trusted to be correct
We don't want to override the meaning of existing code
We'd rather not add another keyword just for this behavior

I know I irritated a few people with this post about the topic, but I don't see any way of properly fixing this issue and sticking with my or state. I shouldn't have to know that Class->foo relies on class data that may not be there depending on the phase of the Perl interpreter. I should just be able to trust it.

Answer 1 · 2021-11-15T09:55:32.000Z

Damian also points out that when and if we get anonymous classes, we would easily want my variables inside of them, but if we repurpose them for class data, we're backed into a corner.

Answer 2 · 2021-11-15T11:32:03.000Z

This is really a perl-generic problem which extends entirely beyond object classes. It's just as visible in regular packages:

package EarlyVar;
my $pi = 4 * atan2(1, 1);
sub message { say "Pi is $pi" }

BEGIN { message() }

The solution to it is not to create something Cor-specific but to fix the actual problem. For example, a common suggestion is something like phaser expressions:

package EarlyVar;
BEGIN my $pi = 4 * atan2(1, 1);   # this entire expression, the assignment, now happens at BEGIN time
sub message { say "Pi is $pi" }

This is really just a tiny bit of syntax sugar for:

package EarlyVar;
my $pi; BEGIN { $pi = 4 * atan2(1, 1); }

but is now far more useful and can be applied in many other places - both within Cor and without.
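For anyone who wants to see the difference today, here is a minimal sketch in plain Perl, with no Corinna or Object::Pad involved. The names `$late`, `$early`, `late_defined`, `early_defined` and `@seen_at_begin` are made up for illustration, and `atan2(1, 1)` is used because that is the arctangent Perl provides as a built-in.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

package EarlyVar;

# Plain lexical: declared at compile time, assigned only at run time.
my $late = 4 * atan2(1, 1);

# Hand-written equivalent of the proposed "BEGIN my ...":
# the assignment itself runs at compile time.
my $early;
BEGIN { $early = 4 * atan2(1, 1); }

sub late_defined  { defined $late  ? 1 : 0 }
sub early_defined { defined $early ? 1 : 0 }

# Record what each sub can see while the file is still compiling.
our @seen_at_begin;
BEGIN { @seen_at_begin = (late_defined(), early_defined()); }

package main;

say "at BEGIN:    late=$EarlyVar::seen_at_begin[0] early=$EarlyVar::seen_at_begin[1]";
say "at run time: late=", EarlyVar::late_defined(), " early=", EarlyVar::early_defined();
# at BEGIN:    late=0 early=1
# at run time: late=1 early=1
```

The run-time `my $early;` statement does not wipe the value stored during BEGIN, which is why the hand-written form works at all.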

Answer 3 · 2021-11-15T12:07:32.000Z

I've begun a pre-RFC discussion on p5p@:

https://www.nntp.perl.org/group/perl.perl5.porters/2021/11/msg261890.html

Answer 4 · 2021-11-15T14:25:30.000Z

I would additionally consider this to fall among the other issues with class data where the use case is exceedingly rare. In practice the only way to encounter this would be a BEGIN block within a class that references its own class data, mirroring the current failure mode described above.

Answer 5 · 2021-11-15T16:55:35.000Z

In fact, the real problem here is that 'sub' has weird behaviour, and that's leading to people making bad assumptions about how things will work. Having a 'method' keyword that does its installation work at runtime, like all the Devel::Declare based stuff, was intentional: it avoids this confusing inconsistency.

Perl classes that are meant to be used in-line in a file should already go in a BEGIN block, that's how you emulate 'inlining a module like thing' for quick script examples, and if you're already using phasers in your code - which is required for the example to show any sort of issue in the first place - you should, well, be aware of the use of phasers in your code, surely.

Fixing the fact that perl named sub declarations are weirdly inconsistent with much of the rest of the language by adding more Corinna-specific inconsistencies seems like the wrong path to be taking here.

EDIT: Basically, I appreciate the 'different bits of the inside of a Corinna class are inconsistent' being an issue, but making 'method' happen at runtime (especially since we don't know we have all the information to generate a constructor until the end of the class block) would seem like a better way to bring consistency here.
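To make the compile-time versus run-time installation point concrete, here is a rough sketch in plain Perl. It is not how Corinna or Object::Pad install methods; the glob assignment is only a stand-in for "a keyword that installs at run time", and the `Demo` package and its sub names are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

package Demo;

# Named sub: installed in the symbol table while the file is compiled.
sub compiled_early { 'named sub' }

# Run-time installation: the glob assignment only happens when
# execution reaches this statement, in file order.
{
    no strict 'refs';
    *{'Demo::installed_late'} = sub { 'run-time sub' };
}

package main;

# Snapshot of which methods exist during compilation.
my @at_begin;
BEGIN {
    @at_begin = map { Demo->can($_) ? 1 : 0 } qw(compiled_early installed_late);
}

# Snapshot of which methods exist once the main program is running.
my @at_run = map { Demo->can($_) ? 1 : 0 } qw(compiled_early installed_late);

say "BEGIN time: @at_begin";    # BEGIN time: 1 0
say "run time:   @at_run";      # run time:   1 1
```

Even though the glob assignment appears earlier in the file than the BEGIN block, the BEGIN block cannot see it, while the named sub is already there.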

Answer 6 · 2021-11-15T18:47:22.000Z

Thinking about this more, I don't see why this, for example, wouldn't seem natural to a user:

my $resource_bundle = ResourceBundle->new;

class Foo {
    my $resources = $resource_bundle->get_resources_for(__PACKAGE__);
    ...
}

which will work fine if we simply stop trying to wrap everything inside 'class' in an implicit BEGIN and instead focus on things consistently happening in the order they exist in the file.

Since 'method' is a new keyword, making that not-special-cased would seem fine - and then that also means that anonymous classes can have their own per-instantiation and per-template attributes simply by using state vs. my -

my $class = class {
    state $common = ...;
    my $specific = ...;
    ...
};

(the keyword/identifier/modifier thing overall seems nice to me, but if anything this seems like it goes better with that goal than adding extra special case syntax)
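Anonymous classes don't exist yet, so the state vs. my split can only be shown by analogy. The sketch below uses an ordinary sub as a stand-in for "evaluating the class expression"; `make_template` and `$evaluations` are hypothetical names, and whether a real anonymous class would behave this way is an assumption, not something settled here.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw(say state);

my $evaluations = 0;

# Stand-in for evaluating an anonymous class expression more than once.
sub make_template {
    state $common = ++$evaluations;   # initialised on the first call only
    my $specific  = ++$evaluations;   # initialised on every call
    return { common => $common, specific => $specific };
}

my $first  = make_template();
my $second = make_template();

say "first:  common=$first->{common} specific=$first->{specific}";
say "second: common=$second->{common} specific=$second->{specific}";
# first:  common=1 specific=2
# second: common=1 specific=3
```

The state variable keeps the value from the first evaluation, while the my variable is fresh each time, which is the per-template versus per-instantiation distinction described above.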

Answer 7 · 2021-11-16T00:30:21.000Z

Arguments about consistency have to keep in mind that this is meant to be part of Perl, and thus has to be consistent with the rest of the language, not only itself. Class data is just a global with better PR. my variables already exist and serve this purpose. Adding a new concept that is almost identical to an existing concept in the language does not aid consistency.

If the desire was specifically to initialize data at compile time, that applies equally to any use of my as a global. A solution to that should not only apply to class data. LeoNerd's BEGIN expr proposal is a better way to address this.

method applying at runtime would make the order of everything much more clear, especially if anonymous classes were added. sub has to happen at compile time, because it has an impact on parsing. Methods shouldn't have any impact on parsing, so there isn't any need to lift them to compile time.
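The "sub has an impact on parsing" part can be checked with a small sketch. It compiles the same two statements in both orders using string eval; the package names and `shout` are invented for the example, and the failing case may print a diagnostic to STDERR depending on the Perl version.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

# Case 1: the call is compiled before Perl has seen the sub, so
# `shout "hi"` (no parentheses) cannot be parsed as a function call.
my $before = eval q{
    package ParseDemo::Before;
    my $r = shout "hi";
    sub shout { uc shift }
    $r;
};
my $before_error = $@;

# Case 2: the sub is already known when the call is compiled,
# so the same statement parses as shout("hi").
my $after = eval q{
    package ParseDemo::After;
    sub shout { uc shift }
    my $r = shout "hi";
    $r;
};

say "call before declaration: ", (defined $before ? $before : "failed to compile");
say "call after declaration:  $after";
```

A method call such as `$obj->shout("hi")` is resolved when it executes and parses the same way regardless of what has been declared, which is why methods don't need the same compile-time treatment.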

Answer 8 · 2021-11-16T08:19:01.000Z

MST comments on Damian's blog post -

While normally i'd be for anything Damian is for and MST is against 😁 (½ JK), I'm going to have to agree with MST at least here -

This child of two Eng. Lit. graduates dearly loves your proposed alternatives for him but fears how much fun they might not be for people for whom english isn't a first language.

Speaking as son of an Eng.Lit instructor (History grad with Ed. Minor and Guidance Masters+30, go figure), i concur with both clauses:
:preface and :epilog are delightful as literary jokes to me and thee, but alas impractical.
(Shouldn't it be :prolog and :epilog anyway for internal consistency? Anyway, please do use these literary allusions in the necessary explanatory narratives explaining what these things are good for, but not in the syntax.)

The key advantage of :before :after :around is Principle of Least Surprise for those exposed to any other Post-Modern OO dialects that already have largely standardized on exactly these modifiers (in whatever syntax) for their enwraplement of methods (read, structured-monkey-patching, or why didn't i (they) leave an injection hook here?).

In this day and age, we no longer have the luxury of assuming folks using our OO system aren't fully aware of how other OO frameworks work. (Heck, our users have 12+ just on CPAN, nearly all are using at least JavaScript if not others, and very few are going to get Corinna in their baccalaureate courses.)

If wilfully calling :before something else other than what everyone else calls it (before), such as :preface or :prolog,
<reductio ad=absurdum> we might as well demonstrate our embrace of classless society by rejecting the class keyword and democratically renaming that keyword atom too, or perhaps class becomes element and instance atom, in which case slot common or class slots and methods are :platonic ? </reductio>

(<pedant>Actually there is something to classless OO programming. Ungar's original thesis called it Self. Wikipedia. Supposedly JavaScript and Lua support Prototypes in lieu of Class. </pedant> But that's not what Corinna is supposed to be, and CPAN already has at least one such. The point of the reductio joke is that unless we're rejecting OO Class paradigm as Self did, we should keep to the common lexicon as much as we can.)

KISS: Consistency with other OO is a virtue when they don't have it wrong everywhere else.
If everyone else calls them class method before after around, we should too.

(Slot vs Attribute isn't quite fully standardized though Attribute is predominant; since P5 has already used the word :attribute for what KIM calls :modifier we're constrained, and we'll be excused for being forced to take the LISPish variant (arguably original!) nomenclature "slot" there. As Perl is in many ways LISPish but for syntax, and always has been, thank you St Larry, this is arguably natural?)

Overall I find Damian's and Ovid's discussion of KIM mostly persuasive, but remembering "A Foolish Consistency is the Hobgoblin of Little Minds" (Emerson) I'm not quite ready to embrace it just because KIM likes KIMsistency, and must ask, is this a Wise Consistency or a Foolish One? Is KIM a Hobgoblin in disguise?

In general i'm ok with exceptions to a grammar's meta-model for exceptional situations, as

i don't write the parsers, so not my flying monkeys, not my circus (P5P parser wizards might get a vote...);
making odd things look different is fair warning, in which case the louder the better;
the left margin is more likely to be aligned than anything else; i can't rely on other peoples code having Damian's elegant tabular alignment of KIM declarators (unless I feed their code through PerlTidy::Corinna and PerlCritic::Policy::Corinna::PBP before reading), so it will alas often be easier to see the rather important before if it's before the method keyword, etc.

One could adapt the KIM meta-model to allow for K column to include compound keywords (before method, common slot, common method) and think of these keywords as compound nouns as we do in Natural Language (itself a compound noun). Humans are pretty good at chunking compound nouns when reading, and i think even Parsers can too now.

I do agree with Damian that my reuse as implicit context sensitive syntax for common slots is out-dated Perl trying to have OO look more Perlish, not the mature OO that Corinna intends to be, must be to succeed. That common slots and common methods work together is convincing to me that they should be similarly handled in syntax. So either common slot ... or slot $name :common and likewise method it is, and same for both.

Which hurts less?
Is there more value in the rigid predictable consistency of KIMsistency of a minimal set of single-word keywords, all modification in :modifiers, or in the "Why is this slot unlike all other slots? Aha!" making odd things odd again?

Either is good enough for me.

If Corinna could be an effective carrot for effectively proselytizing Damian's delectable tabular style of coding (as exemplified in PBP and his KIMxamples in cited post), that would be enough to convince me of :modifiers KIMsistency and abandon easy to find left margin shouting before red flags.
To do so requires PerlTidy::Corinna and PerlCritic::Policy::Corinna::PBP enforce tabular alignment of KIM Kolumns, not just on adjacent lines but across small spans of vertical spacing and comment ... but not going crazy carrying tab alignment 3 pages forward where the names are different lengths so different local alignment is needed.

(I'm blithely assuming the limp compromise of allowing both prefix keywords and :modifier forms interchangeably at the coders' whim or per local PerlCritic::Policy::Local::Corinna policy is probably too much freedom when our meta-goal is to trim down to one core OO framework and style for the next decade of Perl5.)

Re choice of :common or some other word: Since we can't call "class methods" and "class attributes" by their traditional OO/CS theory names (that ship has sailed, bad reuse of keywords, is class inside class a lexical private class, a class method or slot, or just an error? nope, not happening), then common slots, common methods (whether prefix keyword or :modifier suffix) work for me better than Damian's other suggested infix-ish suffix alternatives (:joint :mutual :classwide). Joint and mutual have nuance that make me expect threads or at least promise-like asynchrony or quantum junctions or something else weird or wonderful. If it weren't just so many more letters, the last :classwide would be closer to how my SmallTalk brain thinks about OO even after all these decades; but even with my beloved 34" monitor (=86cm 🇪🇺 🇺🇳 ), i would begrudge those extra columns of :classwide , it's just too wide.
There is even CS heritage in :common, as COMMON was the shared memory keyword in aulde FORTRAN, and so it would be here as well; memory shared between a different sort of computational units than in old Fortran, but it still denotes sharing, just as the Town Common was a shared asset for all the community's cattle and children.

Answer 9 · 2021-11-16T12:31:49.000Z

On the off chance people are not familiar with Damian Conway: unlike most of us self-taught hackers, he is a computer scientist with a PhD from Monash University and taught honors courses in advanced object-oriented programming there. He's been doing OO programming for 35 years across a variety of languages. He also has an extensive background in language design.

Damian supports a better syntax for class data (not using my variables) and his last reply on the topic is illuminating.

mst (shadowcat-mst above) had written:

...whether a particular my is a class data slot, or just an ordinary lexical variable (perhaps, for example, acting as private shared memory for two closures within the class).

To me, that would be just as much class data as anything else, but if there's a principled distinction...that I've missed I'd be delighted if you could find the time to elaborate on what it is...

To which Damian replied:

Happy to...

Class data slots are part of the representation of the class.
Internal my variables that act as communication channels for closures are part of the implementation of the class.

Class data slots are meaningful elements of the problem space and hence often also part of the class API (assuming they have at least a read-accessor).
Internal my variables are incidental elements in the solution space and are never part of the API; they are simply a private implementation detail.

Class data slots are “things”, in the sense that they represent some attribute or property of the objects that the class models.
Internal my variables are not “things”; they are merely “means” or “mechanisms” or “tedious and slightly obscure necessities because the language has no other way to implement this particular behaviour”.

Because class data slots represent some essential aspect of the class, they are never trivially replaceable in a design.
Because incidental my variables don’t represent any essential component of the class itself, but merely facilitate a particular implementation of its behaviour, such variables are always intrinsically replaceable.

Class data slots, being part of the representation and API of a class, are constructs of considerable interest to automated OO code analysers, summarizers, linters, translators, and refactorers...and therefore need to be easily identifiable by these tools.
Incidental my variables, being a mere internal implementation detail, are of marginal or no interest to such tools.

In other words, while it’s perfectly true that class data slots and incidental my variables do both work the same, they each mean something entirely different. They have different roles and purposes, they inhabit different levels of abstraction in a design, they have different degrees of significance and permanence within the code, they are of primary interest to different individuals (the designer vs the implementer), and they need to be easily detectable and distinguishable by the disparate tools that those different individuals might wish to create and employ.

Answer 10 · 2021-11-16T20:33:03.000Z

From here: http://blogs.perl.org/users/damian_conway/2021/11/a-dream-resyntaxed.html#comment-1810991

I find myself actually quite liking

class Foo :isa(Something) {
  my $internal = 0;
  common $state; # MOP-visible but otherwise private
  slot $data :accessor { 0 }; # instance variable
  
  common method foo (...) { ... } # class method
  method bar (...) { ... } # instance method
  
  # modifier applied to superclass method
  modify method baz :before (...) { ... }
}

or if we're going to go strict KIM

class Foo :isa(Something) {
  my $internal = 0;
  common $state; # MOP-visible but otherwise private
  slot $data :accessor { 0 }; # instance variable
  
  method foo :common (...) { ... } # class method
  method bar (...) { ... } # instance method
  
  # modifier applied to superclass method
  modify baz :before (...) { ... }
}

but I don't mind 'common method' and 'modify method' since they seem nicely symmetrical with 'my sub' to me.

Answer 11 · 2022-02-20T16:25:10.000Z

Closing this as we've settled on :common to declare both class data and methods.

Answer 12 · 2022-02-22T19:57:10.000Z

Works for me. ❤👍 <cheekyAside> As an old FORTRAN hand/hack, I'd spell it :COMMON 😉😄 (Common wasn't for class data in F.4️⃣, it was for what we now call Globals, named data memory shared with other compilation units. Fortran was progressive in having default file scope for data. But before OO Fortran, that was the closest analogue, up one scope, so I like it for the obscure slant reference as well as for the OO usage.)