Type variables versus unification variables

Question

Type variables versus unification variables

Ericson2314 opened this issue 3 years ago · 7 comments

I've always gotten the sense that GHC is a little bit fast-and-loose about this, especially as polymorphic types can be inferred, whereby unification variables become type variables due to an implicit quantifier.

I am not sure exactly what I want different, especially implementation wise, but this just makes me feel like GHC is gaslighting me by virtue of me being unable to describe what the alternative looks like!

@xplat one remarked the same thing, after which a wave of relief and validation swept over me. Maybe he can describe the problem and potential solutions more eloquently.

Answer 1 · 2022-02-18T16:06:11.000Z

GHC has "flexible variables", "rigid variables", and "Skolems", the first of which is generally a unification variable, and some "flexible" variables can still be "untouchable" inside an implication. The classification of "Skolem" vs ordinary "rigid" variables has to do with the algorithmic issues below.

More fundamentally, GHC uses unification algorithms that represent unification variables as bare identifiers. Because of this, there is no right thing for instantiation of a type variable to do when it hits a unification variable, and so the type checker goes well out of its way to avoid doing that. Really a unification variable should come with a suspended substitution of type variables, which allows doing instantiation and unification in any order and getting sensible results. If this is not done for efficiency or historical reasons, it is still likely to be better to phrase error messages in terms that assume this.

Action items from this:

users should never see terms like flexible / rigid / Skolem / untouchable
error messages should use more conceptual terms like unify, instantiate, and perhaps substitute, and perhaps refer to (existentially/universally) quantified type and unknown type or if more technical then at least type variable and metavariable
never talk about instantiating a unification variable

Currently you can see messages like:

Cannot instantiate unification variable "a" with a type involving polytypes:
forall b. b

I'd rather see messages like:

Cannot instantiate type "forall a. a -> a" with:
a = forall b. b
("a" is only quantified over monomorphic types)

although I'm not wedded to that phrasing either ...

Answer 2 · 2022-02-18T16:10:57.000Z

https://hal.inria.fr/inria-00077197/document and https://www.cs.cmu.edu/~fp/papers/INRIA-RR-3591.pdf explain this perspective and how it relates to the older literature.

Answer 3 · 2022-02-18T19:09:43.000Z

Thanks for that explanation @xplat! The only small delta I'd like to apply is that, as far as I know, GHC treats the terms "rigid" and "skolem" completely interchangeably. If they have different meanings, I've gotten this far without noticing!

Coming up with consistent phrasing around all of this would be a big step forward. Even better would be having e.g. a blog post that puts it all in context, and then somehow linking to that in the messages.

Answer 4 · 2022-02-18T20:55:48.000Z

Aha! Skolems were the only rigid variables I remembered running into in GHC, but the (rigid, skolem) error message has always always put me in doubt whether there are non-skolem rigids in use somewhere.

Answer 5 · 2022-02-20T05:38:46.000Z

Even better would be having e.g. a blog post that puts it all in context, and then somehow linking to that in the messages.

Yes a blog post would be quite fun. I can only hope it inspires someone to revisit this literature can take a stab at a different sort of type checker.

Answer 6 · 2022-02-23T18:08:37.000Z

Perhaps "placeholder" is a more intuitive name for "unification variable"? I think it captures the intuition that the placeholder can still be assigned any other type.

Answer 7 · 2022-02-23T19:57:08.000Z

Yes! I really like "placeholder".