Repeated terms in unions

Question

Repeated terms in unions

marcandre opened this issue 4 years ago · 3 comments

Not a huge issue, but typeprof sometimes repeats terms in unions.

Here's an example with Array[Vector] | Array[Vector]

Probably a beginner's question, but since Array is covariant, shouldn't Array[Vector] | Array[untyped] be simplified to Array[untyped]?

Answer 1 · 2020-11-11T18:58:05.000Z

Updated @ f881fbe, much less repetitions, but still some

Answer 2 · 2020-11-11T19:00:40.000Z

The Array[Integer] should clearly not be repeated.

Agreed, it is a bug depending on very subtle internal. I'll try fixing it tomorrow.

Detail: In general, TypeProf deals with Foo[X] (a class with a type variable) as if its instance has only one cell that contains X. However, arrays and hashes are specially handled to track their multiple elements (i.e., not only one cell) on a best effort basis because an array is often used as a tuple and a hash is often used as a JSON record and keywords.
However, Array.new and Hash.new creates general instances, not specially handled (just because I have not implemented it yet). A general instance of Array class (which should not occur) and a specially-handled array are not merged appropriately, which leads to Array[Integer] | Array[Integer] (one is for a general instance of Array, and the other is for a specially-handled array.)

Probably a beginner's question, but since Array is covariant, shouldn't Array[Integer] | Array[untyped] be simplified to Array[untyped]?

Good question. In short, TypeProf merges Array[X] | Array[Y] to Array[X | Y]. I guess that the reason why they are not merged is what I said above.

But there are some more esoteric points:

Currently, TypeProf does not handle variance because TypeProf's approach is very different from traditional generics system. Notably, TypeProf allows implicit type expansion: a = [1]; p a shows [Integer] (an one-length array that contains one Integer), and a = [1]; a << "str"; p a shows Array[Integer | String] (a length-unknown array that contains Integer | String). So, the element types of an array is not fixed in TypeProf. To support variance in TypeProf, we need a new theory (TM).
TypeProf merges Array[X] | Array[Y] to Array[X | Y] blindly, regardless of variance, mainly because of the analysis performance. I know it is less precise. (Array[X] | Array[Y] is an array containing only Xs, or an array containing only Ys. On the other hand, Array[X | Y] is an array that may mix X and Y.) But the distinguishment of Array[X] | Array[Y] led to state explosion in some experiments, so I compromised.
In theory, X | untyped is equal to untyped. But unfortunately, TypeProf easily produces untyped for many reasons (just unimplemented feature, the limitation of analysis precision, etc.). The current goal of TypeProf is a type inference tool, and in my experience, when TypeProf says X | untyped, my expected result is X. So, I think that simplifying X | untyped to untyped makes TypeProf less useful. Rather, TypeProf implicitly removes | untyped by default. (This removal can be suppressed by -fpedantic-output option).

Anyway, thank you very much for your interest to TypeProf!

Answer 3 · 2020-11-11T19:08:07.000Z

All very interesting, thank you! I'll mentally delete | Array[untyped] then.