derive4j/derive4j

Question: please clarify equals/hashCode/toString philosophy as stated in README

dminkovsky opened this issue · 4 comments

The README says:

Derive4J philosophy is to be as safe and consistent as possible. That is why Object.{equals, hashCode, toString} are not implemented by generated classes by default. Nonetheless, as a concession to legacy.

If you have a minute, can you please clarify what this means? Specifically, why is not having these methods "safe and consistent"?

Thank you!

jbgi commented

Essentially, it is because those methods (all Object methods) break parametricity, which is a very important property for reasoning about types and code. To quote @tel:

If I have a type BiFunction<A,A,A> then with parametricity it’s either an infinite loop or one of the following

(x, y) -> x

or

(x, y) -> y

that’s a trivial example, but it scales very well and can eliminate huge swaths of potential implementations and vastly improve type-driven reasoning.

On the other hand, if you account for Object#hashCode/equals, BiFunction<A,A,A> has more inhabitants, even without considering impurity. For instance:

 (x, y) -> x.hashCode() > 1000 ? x : y
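To make the extra inhabitant concrete, here is a minimal sketch. The class and method names (`ParametricityDemo`, `first`, `hashBased`) are hypothetical and chosen only for illustration. It shows that a function generic in `A` can still branch on its arguments, because `hashCode()` is available on every reference type.

```java
import java.util.function.BiFunction;

class ParametricityDemo {
    // One of the two parametric choices: the result depends only on argument position.
    static <A> BiFunction<A, A, A> first() {
        return (x, y) -> x;
    }

    // Compiles for every A because hashCode() is inherited from Object,
    // so the result depends on the values passed in, not just their positions.
    static <A> BiFunction<A, A, A> hashBased() {
        return (x, y) -> x.hashCode() > 1000 ? x : y;
    }

    public static void main(String[] args) {
        BiFunction<Integer, Integer, Integer> f = hashBased();
        // Integer.hashCode() returns the wrapped int value.
        System.out.println(f.apply(5, 7));    // 7, since 5 is not greater than 1000
        System.out.println(f.apply(2000, 7)); // 2000, since 2000 is greater than 1000
    }
}
```

The signature of `hashBased` gives no hint that it behaves differently from `first`, which is the loss of type-driven reasoning described above.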

Other pitfalls of Object#equals include:

  • it lets you compare apples and oranges.
  • there is usually no guarantee that it is not the default implementation inherited from Object, which is never what you want.
  • implementing it correctly in the presence of subtyping is almost impossible.
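The first two pitfalls can be sketched as follows. The `Apple` and `Orange` classes are hypothetical and deliberately do not override `equals`, so both inherit the reference-identity implementation from `Object`.

```java
class EqualsPitfalls {
    // Hypothetical types; neither overrides equals.
    static final class Apple {
        final int weight;
        Apple(int weight) { this.weight = weight; }
    }

    static final class Orange {
        final int weight;
        Orange(int weight) { this.weight = weight; }
    }

    public static void main(String[] args) {
        Apple a1 = new Apple(100);
        Apple a2 = new Apple(100);
        Orange o = new Orange(100);

        // Compiles: equals takes Object, so the type checker cannot reject this comparison.
        System.out.println(a1.equals(o));  // false
        // The inherited default compares references, not fields.
        System.out.println(a1.equals(a2)); // false
        System.out.println(a1.equals(a1)); // true
    }
}
```

Nothing in the types of `Apple` or `Orange` tells a caller whether `equals` is structural or the inherited default.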

For further reading/viewing on why you would want parametricity, I recommend Tony Morris' talks on the topic.

Hope it helps!

Edit (thanks @TomasMikula): BiFunction<A, A, A> is to be read in a context where A is a universally quantified (unrestricted) type variable. E.g.:

<A> BiFunction<A, A, A> foo() {
  // either:
  return (x, y) -> x;
  // or:
  // return (x, y) -> y;
}

Thank you! I kept glossing over this part of the README and finally just had to ask :). I don't think I have the necessary background to actually understand your answer, but this gives me plenty of direction for further exploration. <3

@jbgi I get your point, but BiFunction<A,A,A> is a somewhat confusing type, because it doesn't make sense without an enclosing scope where A is introduced. In that scope, A might be an unrestricted type parameter (which is what you intended) as in

class Foo<A> {
    BiFunction<A, A, A> f = ??? // this has to be one of
                                //   (x, y) -> x
                                //   (x, y) -> y
                                //   infinite recursion
}

but there might also be more information available about A in the enclosing scope, as in

class Foo<A extends Bar> {
    BiFunction<A, A, A> f = ??? // many more options
}

abstract class Bar {
    public int x;
}
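As a sketch of one of the "many more options", the bounded version can inspect the field `x`. The `BoundedDemo` and `Baz` names are hypothetical; `Foo` and `Bar` mirror the snippet above, with the `???` placeholder filled in by one arbitrary choice.

```java
import java.util.function.BiFunction;

class BoundedDemo {
    // Hypothetical concrete subclass, needed only to create instances.
    static final class Baz extends Bar {
        Baz(int x) { this.x = x; }
    }

    public static void main(String[] args) {
        Foo<Baz> foo = new Foo<>();
        Baz small = new Baz(1);
        Baz big = new Baz(2);
        // The result depends on the field values, not on argument position.
        System.out.println(foo.f.apply(small, big) == big); // true
        System.out.println(foo.f.apply(big, small) == big); // true
    }
}

abstract class Bar {
    public int x;
}

class Foo<A extends Bar> {
    // Because A extends Bar, the lambda may read x and pick the argument with the larger value.
    BiFunction<A, A, A> f = (a, b) -> a.x >= b.x ? a : b;
}
```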

So maybe instead of BiFunction, it is better to use a generic method to illustrate the point:

static <A> A foo(A a1, A a2) {
    // now my only options are
    return a1;
    //   or: return a2;
    //   infinite recursion, e.g. return foo(a1, foo(a2, a1));
}

jbgi commented

@TomasMikula totally, thanks for the clarification; I failed to mention that A in my example needs to be a universally quantified type variable for it to make sense.