morrisonlevi/Ardent

Define some basic naming conventions.

morrisonlevi opened this issue ยท 17 comments

Currently we have names like first and last but also have names like getLeft and getHeight. I'm not sure which convention I'd like to use just yet, but having both rubs me the wrong way.

I agree. This is something I've struggled with as well and I seem to change my mind constantly. I like both the explicitness of a get verb prefix, but I also like the brevity of names like first and last. I think lately I've been leaning towards the get version because while the one-word names are nice, I invariably run out of them in a class with any more than a small number of methods. What you definitely don't want (IMO) is to mix the approaches. Also, the classic naming rule of "methods should be verbs" argues against the one-word versions because the single words are usually more like adjectives than verbs.

+1 for getters :)

total agreement that the only really wrong answer is mixing the two.

fwiw, i'm more inlined in the direction of not using get. i associate getters with domain objects and that type of software architecture. but collection, traversal and algorithmic behaviors are defined by math, or at least something vaguely math-ish. these structures kinda intrinsically are the thing you're doing to them, in a way that isn't quite so true with more typical domain objects.

totally just my 2c.

I agree that mixing them is a bad idea; semantically I would say that getFirst() conveys "no side effects" better than first() which may reset internal pointers to get there.

+1 on getX() imho it makes the intent more obvious.

๐Ÿ‘ for get*. Another reason in favor of get* would be reserved word conflicts; because you can getList, but you can't list. While this comes up in-often, it can sure be a piss-off when it bites.

I'm leaning more towards omitting the get, but you guys do bring up some good points. I will continue to think about this.

also, perhaps methods could be normalized for better integration with other interfaces, for those who want to implement multiple interfaces (as php arrays) in the same class. For example...

interface Queue
{
function addLast($item);
function getAndRemoveFirst($throwIfEmpty = true);
function getFirst($throwIfEmpty = true);
}

interface Stack
{
function addLast($item);
function getAndRemoveLast($throwIfEmpty = true);
function getLast($throwIfEmpty = true);
}

interface Deque extends Stack, Queue
{
function addFirst($item);
}

maybe this is conceptually wrong?

Queue and Stack are already distinct and you could implement both.

not that is impossible now. talking about using a different naming. addLast() and getFirst() rather than enqueue() and dequeue()

Why not follow the convention of the SPL here? Once it's not working, you can start with prefixes (get, is, has), most likely for reserved keywords (IIRC that's the first case of "not working"). Because once you start with prefixes you can't stop. But getters is more for field-access and here are mostly interfaces or abstract classes. Whether or not it will represent a field is an implementation detail and most likely not necessary to know for the interface.

And the counter-argument:

getters signal state. data-structures normally represent state. sure those are bulky and we do the mistakes of Java however in SPL we have current(), key(), count() etc. .

I know that w/o get prefix you run into problems with reserved names more quickly in PHP. So more often in the last times when I needed to decide on that and if I'm lazy I decide for the getters. It's nothing I'm proud of but it's working quite well. It's also with is and has prefixes.

Perhaps keep the "core" methods from Iterator and Countable and for the own use getter-style.

Conclusion:

I personally prefer the non-getter style, but I know I run into problems with that time-to-time and getters is often accepted but also the lazy decision. Type less, do more, so if you can keep getters out (as long as possible), the interface is less bulky and more precise. It could also allow concrete classes implementing an interface to add getters as they wish. So yes, I'm leaning towards to not introduce get prefixes II think.

In the spirit of good code, keep waste out as much as possible.

I'd keep SPL interfaces' methods aliases of the properly-named ones. SPL is not very consistent.

Also, "has" and "is" aren't good prefix either.

getIsEmpty()
setIsEmpty($bool)

i know this doesn't exist, but explains why one should have get/set also for booleans

you can't take java as example because java supports method overloading, while php doesn't - and really hope it will never do

prefixes should be get(getHas*, getIs*) add, set, remove, replace(or swap, it's shorter)

getHasItem($item)
getHasItemsAll(Set $items)
getHasItemsNone(Set $items)
getIsEmpty()

suffixes should be All, None, or maybe

getHasAllOfItems(Set $items)
getHasNoneOfItems(Set $items)

no prefix at all would be confusing. what does ->empty() do? does it clear the collection or tells if the collection is empty?

HTH

It's also worth mentioning that the problem with reserved words in classes is being limited in PHP7 (not removed entirely, as some words like public, protected and private have to be reserved, but fewer words will cause conflicts). See this RFC.

I am still not sure what to do here. Leaving open.

I'd like to brieflyThis post has a tl;dr enumerate some of my thoughts on this topic...:

  1. Other languages tend to use single-word methods for things such as height, size, peeking, pushing, popping...:
  2. Method names should be as succinct & intuitive as possible. Ambiguous names should be changed.
  3. Getters (get*) should just retrieve the corresponding instance variable (getX should return $this->x). If the operation is more complex, or has nuances not-sufficiently described by just "get*", then the name could probably be better. Additionally, as @hakre said, we're dealing with interfaces/abstractions -- whether or not a method will represent an instance variable is an implementation detail.

For what it's worth, I would suggest:

  • Queue::first โ†’ Queue::peek: peek is used across multiple languages and is clearer than both first and getFirst.
  • Stack::last โ†’ Stack::peek: again, peek is a widely-used term for safely viewing the next item out of a one-way structure; it is clearer than both last and getLast.
  • BinaryTree::left โ†’ BinaryTree::getLeft: whilst both are very clear, I would opt for getLeft because...:
    • It is a valid getter (it retrieves the instance variable $this->left).
    • Object methods should read as queries or commands, and it should be clear which are which:
      • "Does this BTree have a left node?" โ†’ $tree->hasLeft().
      • "Add the number 5 to this AVLTree!" โ†’ $tree->add(5).
      • "Give me the left node of this tree!" โ†’ $tree->getLeft().
    • I believe it implies O(1) complexity (which is the case).
  • BinaryTree::getHeight โ†’ BinaryTree::height: whilst the intent of both is fairly clear, getHeight suggests to me a simple operation, when, really, height calculation can be expensive.

With regards to some of the other issues addressed:

  • Reserved words: As mentioned by @darkphoenixff4, this will become less of an issue.
  • Mixing get* with one-worders: Curiously, no reasons have been provided for as to why this is bad. I believe the main goal in choosing an identifier is to improve readability as much as possible, and -- while consistency helps (f.e., using mostly get* or one-worders) -- it does not always guarantee the best readability.

Hope these ramblings help.

TL;DR: I suggest not to use get* unless:

  • It's an expected O(1) retrieval operation and it refers to an instance variable, or...
  • You can't think of a better name.

I can't do Queue::peek and Stack::peek because then you can't implement Queue and Stack in the same structure (like a Deque).

I don't think queues and stacks should be implemented within a single structure because they have kinda conflicting definitions: Queue::enqueue & Stack::push (or Queue::dequeue & Stack::pop, depending on which way you look at it) do exactly the same thing (try merging these two diagrams: 1 2); unless you're thinking that enqueue and dequeue both function on the same end of the structure, but then you get stuff like this:

$struct->enqueue("First in");
$struct->enqueue("Last in");
echo $struct->dequeue(); // outputs: "Last in"

If you wish to merge them "nicely", you'd be forced to consider @WesNetmo's solution; however, I'm not too fond of this because the Stack, as a concept, doesn't really have a front/back/first/last.

With regards again to peek, if you should adopt this, I would suggest refactoring it to its own interface (since the behaviour applies to multiple structures).