haskell/error-messages

Type variables versus unification variables

Ericson2314 opened this issue · 7 comments

I've always gotten the sense that GHC is a little bit fast-and-loose about this, especially as polymorphic types can be inferred, whereby unification variables become type variables due to an implicit quantifier.

I am not sure exactly what I want different, especially implementation wise, but this just makes me feel like GHC is gaslighting me by virtue of me being unable to describe what the alternative looks like!

@xplat one remarked the same thing, after which a wave of relief and validation swept over me. Maybe he can describe the problem and potential solutions more eloquently.

xplat commented

GHC has "flexible variables", "rigid variables", and "Skolems", the first of which is generally a unification variable, and some "flexible" variables can still be "untouchable" inside an implication. The classification of "Skolem" vs ordinary "rigid" variables has to do with the algorithmic issues below.

More fundamentally, GHC uses unification algorithms that represent unification variables as bare identifiers. Because of this, there is no right thing for instantiation of a type variable to do when it hits a unification variable, and so the type checker goes well out of its way to avoid doing that. Really a unification variable should come with a suspended substitution of type variables, which allows doing instantiation and unification in any order and getting sensible results. If this is not done for efficiency or historical reasons, it is still likely to be better to phrase error messages in terms that assume this.

Action items from this:

  • users should never see terms like flexible / rigid / Skolem / untouchable
  • error messages should use more conceptual terms like unify, instantiate, and perhaps substitute, and perhaps refer to (existentially/universally) quantified type and unknown type or if more technical then at least type variable and metavariable
  • never talk about instantiating a unification variable

Currently you can see messages like:

Cannot instantiate unification variable "a" with a type involving polytypes:
forall b. b

I'd rather see messages like:

Cannot instantiate type "forall a. a -> a" with:
a = forall b. b
("a" is only quantified over monomorphic types)

although I'm not wedded to that phrasing either ...

xplat commented

https://hal.inria.fr/inria-00077197/document and https://www.cs.cmu.edu/~fp/papers/INRIA-RR-3591.pdf explain this perspective and how it relates to the older literature.

Thanks for that explanation @xplat! The only small delta I'd like to apply is that, as far as I know, GHC treats the terms "rigid" and "skolem" completely interchangeably. If they have different meanings, I've gotten this far without noticing!

Coming up with consistent phrasing around all of this would be a big step forward. Even better would be having e.g. a blog post that puts it all in context, and then somehow linking to that in the messages.

xplat commented

Aha! Skolems were the only rigid variables I remembered running into in GHC, but the (rigid, skolem) error message has always always put me in doubt whether there are non-skolem rigids in use somewhere.

Even better would be having e.g. a blog post that puts it all in context, and then somehow linking to that in the messages.

Yes a blog post would be quite fun. I can only hope it inspires someone to revisit this literature can take a stab at a different sort of type checker.

Perhaps "placeholder" is a more intuitive name for "unification variable"? I think it captures the intuition that the placeholder can still be assigned any other type.

Yes! I really like "placeholder".