Handling Class Data?
Closed this issue · 12 comments
I've been chatting back and forth with Damian Conway about class data. I've been concerned about a few people insisting upon overloading the meaning of `my` for class data, but Damian's finally helped me to nail down the issue. (To be fair, those people are insisting that they're not overloading `my`'s meaning.)
When I teach OO, I stress that objects are experts. You trust them. You must always be able to trust them. When you consult the expert in their area of expertise, you need to know you can rely on them. Of course, it doesn't always work that way in the real world, but in software, unreliable software is taboo.
So when I load a class and I call a method, I damn well expect that method will return the correct results. And so long as that class is loaded, it shouldn't matter in which phase of the program I ask it for results.
Using `my` (or `state`) for class data breaks those expectations. Here's an example using Object::Pad (a tip 'o the keyboard to .
    #!/usr/bin/env perl
    use v5.26.0;
    use Object::Pad;
    class Universe {
        my $classdata = 3.1415927;
        sub pi { return $classdata // '<undefined>'; }
    }
    BEGIN { say +Universe->pi(); }    # <undefined> (the big bang)
    CHECK { say +Universe->pi(); }    # <undefined> (inflation)
    INIT  { say +Universe->pi(); }    # <undefined> (Earth is formed)
    say +Universe->pi();              # π (today)
    END   { say +Universe->pi(); }    # π (the big crunch)
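The phase dependence doesn't need Object::Pad to reproduce. Here's a minimal sketch in core Perl, assuming a plain package can stand in for the class. It uses `${^GLOBAL_PHASE}` (core since 5.14) to label each phase; `@seen` and `probe` are names made up for the illustration.

```perl
#!/usr/bin/env perl
use v5.26;
use strict;
use warnings;

package Universe {
    my $classdata = 3.1415927;    # assigned only when this line runs at runtime
    sub pi { return $classdata // '<undefined>' }
}

our @seen;    # hypothetical log of "phase: value" entries
sub probe { push @seen, "${^GLOBAL_PHASE}: " . Universe->pi }

BEGIN { probe() }    # compile time; the phase is reported as START
CHECK { probe() }
INIT  { probe() }
probe();             # ordinary runtime
END   { probe(); say for @seen }
```

Only the `RUN` and `END` entries should carry the number; the three earlier phases see the sub but not the value.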
How do we fix that? Well, we could hack Perl to say "if we're in a Corinna class, we're going to change the rules about initialization time for `my` variables", or we could always look for `my` variables in classes and try to rewrite them to `my $classdata; BEGIN { $classdata = ... }`, but these ideas are terrible. If your choices are:
- Let's go ahead and throw away the guarantees that classes should provide us with
- Or we hack the Perl language to sometimes change the internal behavior of a well-known and core built-in ...
Then you've messed up your choices somewhere.
Worse, using `my` or `state` variables for class data either means:
- You cannot use any standard slot attributes on them
- Or you must teach every developer when they can and cannot use slot attributes on them
So we're forever in the case of writing extra methods (as above) if we find we need to expose these to the outside world. Corinna was supposed to avoid this.
So ...
- Classes must absolutely be trusted to be correct
- We don't want to override the meaning of existing code
- We'd rather not add another keyword just for this behavior
I know I irritated a few people with this post about the topic, but I don't see any way of properly fixing this issue and sticking with `my` or `state`. I shouldn't have to know that `Class->foo` relies on class data that may not be there depending on the phase of the Perl interpreter. I should just be able to trust it.
Damian also points out that when and if we get anonymous classes, we would easily want `my` variables inside of them, but if we repurpose them for class data, we're backed into a corner.
This is really a perl-generic problem which extends entirely beyond object classes. It's just as visible in regular packages:
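    package EarlyVar;
    use v5.10;                         # enables say
    my $pi = 4 * atan2(1, 1);          # core Perl has atan2, not atan; assigned at runtime
    sub message { say "Pi is $pi" }
    BEGIN { message() }                # $pi is still undef at this point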
The solution to it is not to create something Cor-specific but to fix the actual problem. For example, a common suggestion is something like phaser expressions:
    package EarlyVar;
    BEGIN my $pi = 4 * atan2(1, 1);    # this entire expression, the assignment, now happens at BEGIN time
    sub message { say "Pi is $pi" }
This is really just a tiny bit of syntax sugar for:
    package EarlyVar;
    my $pi; BEGIN { $pi = 4 * atan2(1, 1); }
but is now far more useful and can be applied in many other places - both within Cor and without.
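To check that the de-sugared form really closes the gap, here's a small runnable sketch in core Perl. The `BEGIN my ...` syntax is only a proposal and isn't used here. `$late`, `$early`, the two accessor subs and `%at_begin` are hypothetical names for the illustration.

```perl
use v5.14;
use strict;
use warnings;

package EarlyVar {
    my $late = 4 * atan2(1, 1);                      # assigned at runtime
    my $early; BEGIN { $early = 4 * atan2(1, 1) }    # assigned at compile time

    sub late  { return defined $late  ? 'set' : 'unset' }
    sub early { return defined $early ? 'set' : 'unset' }
}

# Snapshot taken at compile time, before any runtime statement has run.
our %at_begin;
BEGIN { %at_begin = (late => EarlyVar::late(), early => EarlyVar::early()) }

say "at BEGIN: late=$at_begin{late} early=$at_begin{early}";
say 'at run:   late=', EarlyVar::late(), ' early=', EarlyVar::early();
```

At BEGIN time only the hoisted variable is set. By runtime both are, which is why the problem goes unnoticed in most code.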
I've begun a pre-RFC discussion on p5p@:
https://www.nntp.perl.org/group/perl.perl5.porters/2021/11/msg261890.html
I would additionally consider this to fall among the other issues with class data where the use case is exceedingly rare. In practice the only way to encounter this would be a BEGIN block within a class that references its own class data, mirroring the current failure mode described above.
In fact, the real problem here is that 'sub' has weird behaviour, and that's leading to people making bad assumptions about how things will work. Having a 'method' keyword that does its installation work at runtime, as all the Devel::Declare based stuff did, was an intentional choice to avoid this confusing inconsistency.
Perl classes that are meant to be used in-line in a file should already go in a BEGIN block, that's how you emulate 'inlining a module-like thing' for quick script examples, and if you're already using phasers in your code - which is required for the example to show any sort of issue in the first place - you should, well, be aware of the use of phasers in your code, surely.
Fixing the fact that perl named sub declarations are weirdly inconsistent with much of the rest of the language by adding more Corinna-specific inconsistencies seems like the wrong path to be taking here.
EDIT: Basically, I appreciate the 'different bits of the inside of a Corinna class are inconsistent' being an issue, but making 'method' happen at runtime (especially since we don't know we have all the information to generate a constructor until the end of the class block) would seem like a better way to bring consistency here.
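The inconsistency can be seen in core Perl today. Here's a rough sketch in which a runtime glob assignment stands in for a runtime-installing `method` keyword. That stand-in is an assumption for illustration, not how Corinna or Object::Pad implement anything. `Demo`, `compiled`, `installed` and `@seen` are made-up names.

```perl
use v5.14;
use strict;
use warnings;

our @seen;

# These two statements run before the package body below has executed.
push @seen, 'compiled, called early: ' . Demo->compiled;
push @seen, 'installed exists early: ' . (Demo->can('installed') ? 'yes' : 'no');

package Demo {
    my $greeting = 'hello';

    # Named sub: entered into the symbol table at compile time,
    # so it is callable before $greeting has been assigned.
    sub compiled { return $greeting // '<undefined>' }

    # Glob assignment: happens only when this statement runs,
    # which is after $greeting has been assigned.
    no strict 'refs';
    *{'Demo::installed'} = sub { return $greeting };
}

push @seen, 'installed, called late: ' . Demo->installed;
say for @seen;
```

The runtime-installed sub can't be called too early, so it can never observe the uninitialized lexical. The cost is that it doesn't exist until its line has run.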
Thinking about this more, I don't see why this, for example, wouldn't seem natural to a user:
    my $resource_bundle = ResourceBundle->new;
    class Foo {
        my $resources = $resource_bundle->get_resources_for(__PACKAGE__);
        ...
    }
which will work fine if we simply stop trying to wrap everything inside 'class' in an implicit BEGIN and instead focus on things consistently happening in the order they exist in the file.
Since 'method' is a new keyword, making that not-special-cased would seem fine - and then that also means that anonymous classes can have their own per-instantiation and per-template attributes simply by using state vs. my -
    my $class = class {
        state $common = ...;
        my $specific = ...;
        ...
    };
(the keyword/identifier/modifier thing overall seems nice to me, but if anything this seems like it goes better with that goal than adding extra special case syntax)
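Anonymous classes don't exist yet, so the `state` vs. `my` split can only be approximated. Here's a loose analogy using a closure factory in core Perl, where each call to the hypothetical `make_counter` plays the role of one evaluation of the `class { ... }` expression.

```perl
use v5.10;
use strict;
use warnings;

sub make_counter {
    state $common = 0;    # one variable shared by every closure ever returned
    my $specific  = 0;    # a fresh variable for each call to make_counter
    return sub {
        $common++;
        $specific++;
        return "common=$common specific=$specific";
    };
}

my $first  = make_counter();
my $second = make_counter();

our @seen = ($first->(), $first->(), $second->());
say for @seen;
```

The third line of output shows the split: the shared counter has reached 3 while the second closure's own counter is still at 1.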
Arguments about consistency have to keep in mind that this is meant to be part of Perl, and thus has to be consistent with the rest of the language, not only itself. Class data is just a global with better PR. `my` variables already exist and serve this purpose. Adding a new concept that is almost identical to an existing concept in the language does not aid consistency.
If the desire was specifically to initialize data at compile time, that applies equally to any use of `my` as a global. A solution to that should not only apply to class data. LeoNerd's `BEGIN expr` proposal is a better way to address this.
`method` applying at runtime would make the order of everything much more clear, especially if anonymous classes were added. `sub` has to happen at compile time, because it has an impact on parsing. Methods shouldn't have any impact on parsing, so there isn't any need to lift them to compile time.
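The parsing point can be shown with two ordinary subs. In this small sketch (the sub and variable names are invented), a prototype seen at compile time changes how a later paren-less call is parsed, which is why `sub` can't wait until runtime.

```perl
use v5.10;
use strict;
use warnings;

sub one_arg($) { return $_[0] * 2 }    # prototype: behaves like a unary operator
sub any_args   { return scalar @_ }    # no prototype: slurps the whole list

# Parsed as (one_arg(2), 3) because of the ($) prototype.
our @unary = (one_arg 2, 3);

# Parsed as any_args(2, 3): the sub swallows both values.
our @listy = (any_args 2, 3);

say "unary: @unary";
say "listy: @listy";
```

A method call such as `$obj->name` is always looked up at runtime, so nothing about it can change how the surrounding code is parsed.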
MST comments on Damian's blog post -
While normally i'd be for anything Damian is for and MST is against 😁 (½ JK), I'm going to have to agree with MST at least here -
- This child of two Eng. Lit. graduates dearly loves your proposed alternatives for him but fears how much fun they might not be for people for whom english isn't a first language.
Speaking as son of an Eng.Lit instructor (History grad with Ed. Minor and Guidance Masters+30, go figure), i concur with both clauses:
`:preface` and `:epilog` are delightful as literary jokes to me and thee, but alas impractical.
(Shouldn't it be `:prolog` and `:epilog` anyway for internal consistency? Anyway, please do use these literary allusions in the necessary explanatory narratives explaining what these things are good for, but not in the syntax.)
The key advantage of `:before :after :around` is Principle of Least Surprise for those exposed to any other Post-Modern OO dialects that already have largely standardized on exactly these modifiers (in whatever syntax) for their enwraplement of methods (read, structured-monkey-patching, or why didn't i (they) leave an injection hook here?).
In this day and age, we no longer have the luxury of assuming folks using our OO system aren't fully aware of how other OO frameworks work. (Heck, our users have 12+ just on CPAN, nearly all are using at least JavaScript if not others, and very few are going to get Corinna in their baccalaureate courses.)
If we're wilfully calling `:before` something other than what everyone else calls it (`before`), such as `:preface` or `:prolog`, <reductio ad=absurdum> we might as well demonstrate our embrace of classless society by rejecting the `class` keyword and democratically renaming that keyword `atom` too, or perhaps class becomes `element` and instance `atom`, in which case common (or class) slots and methods are `:platonic`? </reductio>
(<pedant> Actually there is something to classless OO programming. Ungar's original thesis called it `Self` (see Wikipedia). Supposedly JavaScript and Lua support Prototypes in lieu of Class. </pedant> But that's not what Corinna is supposed to be, and CPAN already has at least one such. The point of the reductio joke is that unless we're rejecting the OO Class paradigm as Self did, we should keep to the common lexicon as much as we can.)
KISS: Consistency with other OO is a virtue when they don't have it wrong everywhere else.
If everyone else calls them `class method before after around`, we should too.
(Slot vs Attribute isn't quite fully standardized though Attribute is predominant; since P5 has already used the word `:attribute` for what KIM calls `:modifier` we're constrained, and we'll be excused for being forced to take the LISPish variant (arguably original!) nomenclature "slot" there. As Perl is in many ways LISPish but for syntax, and always has been, thank you St Larry, this is arguably natural?)
Overall I find Damian's and Ovid's discussion of KIM mostly persuasive, but remembering "A Foolish Consistency is the Hobgoblin of Little Minds" (Emerson) I'm not quite ready to embrace it just because KIM likes KIMsistency, and must ask, is this a Wise Consistency or a Foolish One? Is KIM a Hobgoblin in disguise?
In general i'm ok with exceptions to a grammar's meta-model for exceptional situations, as
- i don't write the parsers, so not my flying monkeys, not my circus (P5P parser wizards might get a vote...);
- making odd things look different is fair warning, in which case the louder the better;
- the left margin is more likely to be aligned than anything else; i can't rely on other people's code having Damian's elegant tabular alignment of KIM declarators (unless I feed their code through PerlTidy::Corinna and PerlCritic::Policy::Corinna::PBP before reading), so it will alas often be easier to see the rather important `before` if it's before the `method` keyword, etc.
One could adapt the KIM meta-model to allow for K column to include compound keywords (`before method`, `common slot`, `common method`) and think of these keywords as compound nouns as we do in Natural Language (itself a compound noun). Humans are pretty good at chunking compound nouns when reading, and i think even Parsers can too now.
I do agree with Damian that `my` reuse as implicit context sensitive syntax for common slots is out-dated Perl trying to have OO look more Perlish, not the mature OO that Corinna intends to be, must be to succeed. That common slots and common methods work together is convincing to me that they should be similarly handled in syntax. So either `common slot ...` or `slot $name :common` and likewise method it is, and same for both.
Which hurts less?
Is there more value in the rigid, predictable KIMsistency of a minimal set of single-word keywords, with all modification in :modifiers, or in the "Why is this slot unlike all other slots? Aha!" of making odd things odd again?
Either is good enough for me.
If Corinna could be an effective carrot for effectively proselytizing Damian's delectable tabular style of coding (as exemplified in PBP and his KIMxamples in cited post), that would be enough to convince me of `:modifiers` KIMsistency and abandon easy to find left margin shouting `before` red flags.
To do so requires that `PerlTidy::Corinna` and `PerlCritic::Policy::Corinna::PBP` enforce tabular alignment of KIM Kolumns, not just on adjacent lines but across small spans of vertical spacing and comment ... but not going crazy carrying tab alignment 3 pages forward where the names are different lengths so different local alignment is needed.
(I'm blithely assuming the limp compromise of allowing both prefix keywords and :modifier forms interchangeably at the coders' whim or per local `PerlCritic::Policy::Local::Corinna` policy is probably too much freedom when our meta-goal is to trim down to one core OO framework and style for the next decade of Perl5.)
Re choice of `:common` or some other word: Since we can't call "class methods" and "class attributes" by their traditional OO/CS theory names (that ship has sailed, bad reuse of keywords, is class inside class a lexical private class, a class method or slot, or just an error? nope, not happening), then common slots, common methods (whether prefix keyword or :modifier suffix) work for me better than Damian's other suggested infix-ish suffix alternatives (`:joint` `:mutual` `:classwide`). Joint and mutual have nuances that make me expect threads or at least promise-like asynchrony or quantum junctions or something else weird or wonderful. If it weren't just so many more letters, the last, `:classwide`, would be closer to how my SmallTalk brain thinks about OO even after all these decades; but even with my beloved 34" monitor (=86cm 🇪🇺 🇺🇳 ), i would begrudge those extra columns of `:classwide`, it's just too wide.
There is even CS heritage in `:common`, as `COMMON` was the shared memory keyword in aulde `FORTRAN`, and so it would be here as well; memory shared between a different sort of computational units than in old Fortran, but it still denotes sharing, just as the Town Common was a shared asset for all the community's cattle and children.
On the off chance people are not familiar with Damian Conway: unlike most of us self-taught hackers, he is a computer scientist with a PhD from Monash University, and he taught honors courses in advanced object-oriented programming there. He's been doing OO programming for 35 years across a variety of languages. He also has an extensive background in language design.
Damian supports a better syntax for class data (not using `my` variables) and his last reply on the topic is illuminating.
mst (shadowcat-mst above) had written:
...whether a particular `my` is a class data slot, or just an ordinary lexical variable (perhaps, for example, acting as private shared memory for two closures within the class).
To me, that would be just as much class data as anything else, but if there's a principled distinction...that I've missed I'd be delighted if you could find the time to elaborate on what it is...
To which Damian replied:
Happy to...
- Class data slots are part of the representation of the class.
- Internal `my` variables that act as communication channels for closures are part of the implementation of the class.
- Class data slots are meaningful elements of the problem space and hence often also part of the class API (assuming they have at least a read-accessor).
- Internal `my` variables are incidental elements in the solution space and are never part of the API; they are simply a private implementation detail.
- Class data slots are “things”, in the sense that they represent some attribute or property of the objects that the class models.
- Internal `my` variables are not “things”; they are merely “means” or “mechanisms” or “tedious and slightly obscure necessities because the language has no other way to implement this particular behaviour”.
- Because class data slots represent some essential aspect of the class, they are never trivially replaceable in a design.
- Because incidental `my` variables don’t represent any essential component of the class itself, but merely facilitate a particular implementation of its behaviour, such variables are always intrinsically replaceable.
- Class data slots, being part of the representation and API of a class, are constructs of considerable interest to automated OO code analysers, summarizers, linters, translators, and refactorers...and therefore need to be easily identifiable by these tools.
- Incidental `my` variables, being a mere internal implementation detail, are of marginal or no interest to such tools.
In other words, while it’s perfectly true that class data slots and incidental `my` variables do both work the same, they each mean something entirely different. They have different roles and purposes, they inhabit different levels of abstraction in a design, they have different degrees of significance and permanence within the code, they are of primary interest to different individuals (the designer vs the implementer), and they need to be easily detectable and distinguishable by the disparate tools that those different individuals might wish to create and employ.
From here: http://blogs.perl.org/users/damian_conway/2021/11/a-dream-resyntaxed.html#comment-1810991
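One way to picture the distinction in today's Perl, as a rough emulation only: the package hash `%CLASS_DATA` below is a made-up convention standing in for declared class data, not Corinna syntax, and `Shape` with its methods is a hypothetical class.

```perl
use v5.14;
use strict;
use warnings;

package Shape {
    # Class data (emulated): part of the representation, exposed through
    # the API, and discoverable by tools via the symbol table.
    our %CLASS_DATA = (count => 0);

    # Incidental lexical: private plumbing used by new(); invisible from
    # outside and replaceable without changing the API.
    my $next_id = 1;

    sub new {
        my $class = shift;
        $CLASS_DATA{count}++;
        return bless { id => $next_id++ }, $class;
    }
    sub count { return $CLASS_DATA{count} }
    sub id    { return $_[0]{id} }
}

my @shapes = (Shape->new, Shape->new, Shape->new);

our $count   = Shape->count;
our $last_id = $shapes[-1]->id;
our $visible = exists $Shape::{CLASS_DATA} ? 'yes' : 'no';    # in the stash
our $hidden  = exists $Shape::{next_id}    ? 'yes' : 'no';    # lexicals are not

say "count=$count last_id=$last_id class data visible=$visible lexical visible=$hidden";
```

Both variables "work the same" at runtime, but only one of them can be found by a tool that walks the package. That's the detectability point in the list above.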
I find myself actually quite liking
    class Foo :isa(Something) {
        my $internal = 0;
        common $state;                      # MOP-visible but otherwise private
        slot $data :accessor { 0 };         # instance variable
        common method foo (...) { ... }     # class method
        method bar (...) { ... }            # instance method
        # modifier applied to superclass method
        modify method baz :before (...) { ... }
    }
or if we're going to go strict KIM
    class Foo :isa(Something) {
        my $internal = 0;
        common $state;                      # MOP-visible but otherwise private
        slot $data :accessor { 0 };         # instance variable
        method foo :common (...) { ... }    # class method
        method bar (...) { ... }            # instance method
        # modifier applied to superclass method
        modify baz :before (...) { ... }
    }
but I don't mind 'common method' and 'modify method' since they seem nicely symmetrical with 'my sub' to me.
Closing this as we've settled on `:common` to declare both class data and methods.