/frags

Plugin gonna getcha, row types

Primary LanguageHaskell

Documentation for frag and motley

This file always contains the latest official documentation for the frag and motley libraries.

Table of Contents

Meta

Context for this work

I've developed this typechecker plugin as a hobby passion project over the last few years. The aim has always been twofold: to learn about GHC's type checker and to make a proof-of-concept compelling enough that an active GHC developer/researcher could not resist the urge to distill a proper GHC patch out of it for /row polymorphism/. I've learned a lot, and there's thankfully no end in sight on that front. And of the several from-scratch iterations I've done in private, I'm delighted to finally share one that feels promising!

This repository was, is, and will remain a hobby project. My work on it will continue to be bursty. I'm excited to cooperate with you! But my bandwidth is limited :(

About this documentation

We must maximize how accurate, available, and approachable they feel to readers and authors alike.

  • I'm tracking docs alongside the source in a single repository, prioritizing accuracy over Git precision.

  • I'm using a single monolithic file instead of adding dependencies on frameworks like Sphinx or MkDocs, prioritizing availablility over aesthetics and Git precision. (TODO Can I get some kind of linting for intra-document refs? Otherwise separate files docs might be better.)

  • I'm using https://github.github.com/gfm/ because of its general prevalence, prioritizing approachability over other formats' technical advantages.

Contributing to this project

Please find or create a GitHub Issue on which to discuss your idea; I'd like to keep the higher-level discussions there and dig into the details on the PRs.

Acknowledgements

The GHC developer community has been very helpful and encouraging over the few years I've iterated on this. There are too many to list them all, but Simon Peyton Jones, Adam Gundry, Richard Eisenberg, Iavor Diatchki, E.Z. Yang, Jan van Brügge, and AntC (?) have given several answers, conversation, and encouragement that helped a lot.

I deeply appreciate that my employer https://www.navican.com made it easy for me to release this work as open source.

Bibliography

frag API

The frag library declares the Frag b data kind, its interface, and a typechecker plugin implements its equational theory.

Frags and their equivalence

The library declares the following introduction forms. Because of the plugin, the type families are not as much of a hindrance as one usually expects -- they behave like something in between type constructors and type families similar to injective type families (https://gitlab.haskell.org/ghc/ghc/issues/6018). (See the Frag polymorphism section below for more information.)

data Frag (b :: k) = 'Nil
infixl 6 :+,:-
type family (fr :: Frag b) :+ (e :: b) :: Frag b where {}
type family (fr :: Frag b) :- (e :: b) :: Frag b where {}

The kind Frag b denotes all finite signed multisets of bs. The element multiplicities in a signed multiset can be negative. There are only finitely many non-zero multiplicities in a finite multiset. Example frags:

'Nil :+ Int :+ Int :- Char :: Frag Type   -- {Int: 2,Char: -1}

'Nil :+ 1 :+ 2 :- 13 :: Frag Nat   -- {1: 1,2: 1,13: -1}

'Nil :+ "docker" :+ "sudo" :: Frag Symbol   -- {"docker": 1,"sudo": 1}

(Until https://gitlab.haskell.org/ghc/ghc/issues/13637 is resolved, GHC messages will unfortunately use unnecessary parens as in (('Nil :+ 1) :+ 2) :- 13.)

Although a frag denotes a multiset, the current frag interface currently omits union, intersection, and all other binary operators. Multisets can only be built by incrementing or decrementing multiplicities one at a time. (See nfrisby#25.) We call a right partial application :+ e or :- e a signed tally.

The typechecker plugin lets GHC simplify equivalences at kind Frag b. In particular, it ensures that the order of tallies does not matter. Example frag equivalences that the plugin can simplify:

(('Nil :+ Int :- Int)   ~   'Nil)

(('Nil :+ Int :+ Char)   ~   ('Nil :+ Char :+ Int))

(('Nil :+ x :+ y :- x)   ~   ('Nil :+ y))

(('Nil :+ x :+ Int)   ~   ('Nil :+ Int :+ y))   -- simplified to (x ~ y)

(I think we'll need those inner parentheses until https://gitlab.haskell.org/ghc/ghc/issues/10056 is resolved.)

Those last examples involve type variables, foreshadowing frag polymorphism.

Frag integers

The kind Frag () denotes the integers, Z. Example integer frags:

'Nil   -- 0

'Nil :+ '()   -- 1

'Nil :+ '() :+ '()  -- 2

'Nil :- '() :- '()  -- negative 2

This approach lets the library conveniently declare the FragEQ elimination form, which reduces a frag to the (integer) multiplicity of the specific element.

type family FragEQ (e :: b) (fr :: Frag b) :: Frag () where {}

Example multiplicity constraints that the plugin can simplify:

FragEQ e 'Nil ~ 'Nil

FragEQ Int (fr :+ Char) ~ FragEQ Int fr

FragEQ e (fr :- e) ~ FragEQ e fr :- '()

FragEQ Int ('Nil :+ x) ~ ('Nil :+ '())   -- simplified to (x ~ Int)

Sets

Sets are multisets with binary multiplicities, 0 or 1. We introduce a methodless type class for that predicate.

class SetFrag (fr :: Frag b) where {}

Example set constraints that the plugin can simplify:

SetFrag ('Nil :+ Int :+ Char)

SetFrag ('Nil :+ x :- y)   -- simplified to (x ~ y)

Workaround

For technical reasons, we currently need set constraints to be ~ constraints. We therefore write SetFrag fr ~ '() instead of SetFrag fr. The essential semantics are the same, but GHC now handles unification in the scope of local set constraints arising from GADT pattern matches. (FYI, understanding https://gitlab.haskell.org/ghc/ghc/issues/15009 really claried this aspect of OutsideIn(X) for me.)

type family SetFrag (fr :: Frag b) :: () where {}

Frag polymorphism

The plugin interprets and simplifies non-trivial relationships between type variables at kind Frag b, thereby implementing frag polymorphism.

Frag polymorphism allows the user to write signatures that GHC without the plugin would report as ambiguous.

frpo_fun :: Proxy (fr :+ x) -> Proxy fr

In that signature, the type variable x only occurs as an argument to a type family application. And that family was not declared injective in that argument, so GHC without the plugin would not be able to infer x at occurrences of frpo_fun. Hence it rightly rejects the signature (without -XAllowAmbiguousTypes). However, with the plugin, GHC can infer x (at well-typed use sites). Examples:

frpo_fun (Proxy :: Proxy ('Nil :+ Int))
  -- infers (fr ~ 'Nil,x ~ Int)

frpo_fun (Proxy :: Proxy 'Nil) :: Proxy ('Nil :- Int)
  -- infers (x ~ Int,fr ~ ('Nil :- Int))

TODO forward reference to motley products.

TODO sentence transitioning to next section

Universes

Informally, universe is type theory terminology for "set of types". Following the Sets as frags section, we can use a frag as a universe. Universes tend to support an additional interface, specifically a "code" data type that is one-to-one with types in the universe. (Find more details in any resource discussing universes a la Tarski.)

The GHC ecosystem already includes support for universes.

  • The kind Type is a "set of types", but it doesn't have codes. The Typeable class does and thereby determines a universe (subuniverse of Type). In particular, it is an open universe: any data declaration adds correspoding codes and types to it. (See the GHC User's Guide for details.)

  • This ExampleU GADT defines codes for the (sub)universe containing Char and Int.

    data ExampleU :: Type -> Type where
      ExampleU_Char :: ExampleU Char
      ExampleU_Int :: ExampleU Int

    Unlike the Typeable universe, this universe is closed: no declaration can create more codes.

  • Universe are a core concept in the singletons package's encoding of dependent types.

The frag library and plugin introduce novel universes that are more precise than any in the GHC ecosystem. It relies on two additional elimination forms.

type family FragLT (e :: b) (fr :: Frag b) :: Frag () where {}
class KnownFragCard (fr :: Frag ()) where
  fragCard :: proxy fr -> Int   -- NB plugin grinds to halt well before this overflows

The FragLT e fr application denotes the total multicplity of all elements x in fr such that x < e. That < sign refers to a strict partial order on types that we call ORDER_STABLE. How exactly it relates types is an implementation detail of GHC and the plugin. It has a few key properties that make it safe for our use here.

  • It relates a and b only if it relates all substitution instances of them in the same way. This is why it is a partial order.

  • On the other hand, two types with no free variables are necessarily comparable (maybe "bar" < "foo"). It even orders some types with free variables (maybe [a] < Maybe b).

  • I have not thought about how it would handle impredicative types; I suggest not trying that out :).

  • Any non-empty set of pairwise-related types has a unique minimum according to ORDER_STABLE.

The fragCard method demotes the type-level integer to the value-level.

With SetFrag, FragLT, and KnownFragCard, we can declare a data type for a frag universe's codes and functions on them.

data FragRep :: Frag b -> b -> * where
  MkFragRep ::
      (FragEQ e fr ~ ('Nil :+ '()),KnownFragCard (FragLT e fr))
	=>
	  FragRep fr e

testEquality_FragRep ::
  SetFrag fr ~ '() => FragRep fr x -> FragRep fr y -> Maybe (x :~: y)

Note the set constraint in the type of testEquality_FragRep: this coercion is only safe if all of the multiplicities in fr are non-negative. SetFrag is all we have at hand, but positive multiplicities greater than 1 would be fine here.

There are other useful functions over FragRep. For example, if y is the minimum of fr :+ y, then we can embed FragRep fr x into FragRep (fr :+ y) x. (Operationaly, this increments the integer in the KnownFragCard dictionary.)

widenFragRepByMin :: (FragLT y fr ~ 'Nil) => proxyy y -> FragRep fr x -> FragRep (fr :+ y) x

TODO discuss all FragRep axia here as part of nfrisby#14

Nominal view

TODO

type family FragPop_NonDet (fr :: Frag k) :: MaybeFragPop k where {}

type family FragPush (mpop :: MaybeFragPop k) :: Frag k where {}

Auxiliaries

If only because of implementation details, the following types are relevant to the user.

Frag masking

The library provides the following elimination form.

type family FragNE (e :: b) (fr :: Frag b) :: Frag b where {}

The multiciplities of FragNE e fr agree with those of fr except on e, which has a multiplicity of 0. Thus, FragNE e removes e from fr's support, the set of elements with non-zero multiplicities.

(Observation: A set frag is equivalent to its own support.)

Apartness constraints

TODO motivate and define

motley API

The motley library uses the frag polymorphism and frag universes of the frag library to define n-ary type-indexed sums and products.

Sums

TODO

data Sum :: Frag b -> (b -> *) -> * where
  MkSum :: !(FragRep fr e) -> f e -> Sum fr f

Products

TODO

data Prod :: Frag b -> (b -> *) -> * where
  MkCons ::
      (FragLT e fr ~ 'Nil,FragEQ e fr ~ 'Nil)
	=>
	  !(Prod fr f) -> f e -> Prod (fr :+ e) f
  MkNil :: Prod 'Nil f

TODO mention nfrisby#9 flat Prod

Combinators

TODO

updateSum :: Prod fr (f ~> g) -> Sum fr f -> Sum fr g
updateProd :: Sum fr (Compose Endo f) -> Prod fr f -> Prod fr f

Optics

TODO

opticSum ::
    ( SetFrag fr ~ '()
    , FragEQ x fr ~ 'Nil,KnownFragCard (FragLT x fr)
	, FragEQ y fr ~ 'Nil,KnownFragCard (FragLT y fr)
	)
  =>
    Prism.Prism (Sum (fr :+ x) f) (Sum (fr :+ y) f) (f x) (f y)
opticProd ::
    (
	  FragEQ x fr ~ 'Nil,KnownFragCard (FragLTx fr)
	,
	  FragEQ y fr ~ 'Nil,KnownFragCard (FragLT y fr)
    )
  =>
    Lens.Lens (Prod (fr :+ x) f) (Prod (fr :+ y) f) (f x) (f y)

Internals

This section explains the imlementation of the frag package.

Frag semantics

Informally, the terms inhabiting Frag b denote multisets. That would imply that Frag b denotes the free abelian group over whatever b denotes. However, Frag b cannot itself be a group; it does not even have a binary operator. What is going on? How sound is that informal notion?

Any term inhabiting Frag b is equivalent to a pair of a root and an extension. The extension is a possibly empty sequence of signed tallies, and the root is another term inhabiting Frag b. We usually assume that the root, which viewed as a pair itself, has an empty extension -- so the root is either 'Nil or a type variable (including flattening variables). There is a crucial reason that it is safe to assume that! Extensions form a monoid and therefore the functor that pairs an X with an extension is a ("writer") monad over X: its unit pairs an X with the empty extension, and its join merely combines the two extensions.

This extension monoid is moreover the free abelian group over the terms inhabiting b. (See formal sum (Wikipedia).) The unit is the empty sequence of tallies, the group operator is concatenation -- i.e. arithmetic over signed b-ornamented unary numerals, i.e. multiset union, which is associative and commutative -- and the inverse function flips the sign of each tally.

The equational theory of Frag b is determined by its equivalence to the "writer" monad with the free abelian group of b-inhabitants as its "output" and Frag b-inhabitants as its carrier (i.e. its X). And in a fully grounded inhabitant of Frag b (no type variables (including function application), impredicativity?, etc?), the root is 'Nil, so the term is the extension, which is a multiset. Thus, the informal notion that frags denote multsets seems reasonable.

TODO Call out the canonical(?) action of the extension group on terms of Frag b?

Normalization

TODO what the plugin could do without normalizing

TODO what the plugin can do with normalizing

TODO what the plugin cannot do even with normalizing

TODO how we normalize

Derived equivalences

TODO derived equivalence and apartness constraints

Floating equivalences

TODO why SetFrag and Apart need to block them,

TODO why the ~ '() workaround does.

TODO refer to https://ghc.haskell.org/trac/ghc/wiki/FloatMoreEqs2018

Flattening

TODO GHC's flattening, especially avoiding loops reorienting frag equivalences

TODO unflattening via the frag substitution

TODO challenges from https://gitlab.haskell.org/ghc/ghc/issues/15147 unflattened Wanteds