Can and should KTrees be generalized?
Closed this issue · 11 comments
The asm
language as defined for the icfp submission represents labels as binary trees encoded in Type
. The short story is:
- Collections of blocks have type
fun (A B: Type) => A -> block B
- Their denotation therefore
fun (A B: Type) => bks A B -> A -> itree E B
, i.e.fun (A B: Type) => bks A B -> ktree E A B
. ktrees
are shown to enjoy a traced monoidal structure.asm
enrichesbks
by allowing to hide part of the labels- Correctness of the linking theory crucially consists in showing that
asm
enjoys the structure of a traced monoidal category by building combinators on its syntax whose denotation commutes with the correspondingktrees
operators.
Now one issue of all of this is that representing labels as Type pretty much prevent us from writing any optimization (unless someone has a trick that I do not see). Indeed, one cannot introspect its structure.
Following discussions with @Lysxia and @Zdancewic, I therefore tried to switch to a representation of labels as Fin
types, whose temporarily aborted attempt can be witnessed in the labels-fin
branch.
However, things turned out heavier that I anticipated. I define the two following new categories:
FinC
whose objects arenat
and Hom(A,B)Fin.t A -> Fin.t B
with tensor productNat.add
over objects.Label
(retrospectively poorly named) whose objects arenat
and Hom(A,B)ktree E (Fin.t A) (Fin.t B)
.
Correspondingly, collections of blocks now have typefun (A B: nat) => Fin.t A -> block (Fin.t B)
, and their denotationfun (A B: nat) => Label E A B
.
Having done that, I initially thought that I could simply adapt the proofs of correctness of the combinators ofasm
,Label
being simply some specificktrees
.
However this turns out wrong. ktrees
concretely relies on the Fun
monoidal category of objects Type
and arrows A -> B,
with tensor on objects sum
, to get its traced monoidal structure. Label
on their side, although similar, are a different one, with a slightly different loop
operator, and whose structure relies on FinC
.
Concretely, that means that to move to Fin
for labels, the whole theory of ktrees
needs to be adapted and duplicated over Labels
, and the process is different enough to not be straightforward.
We hence should consider whether both can be unified, or if the library should provide both.
I don't see a benefit to including it inside the library. I'm also not sure if I completely understand the difficulty though. The type of control flow graphs is naturally a deep embedding that is given semantics through ktree. Is there something that we can not do in that?
My thoughts on this are still a bit fuzzy, sorry if I'm unclear.
One way to put the problem bluntly: with labels represented as Type
, I know how to prove the compiler correct without writing a single cofix; with labels represented as Fin
types, I do not, and actually need to rebuild the equivalent of the KTreeFacts.v
file, that includes explicit coinductive proofs.
It seems quite odd that the library would favor a way of representing labels, and it's actually an issue since the representation does not seem viable in the long run.
Looking at it more closely, it seems that KTrees
might be unnecessary tied to the Fun
category. I hence wonder whether it would make sense to generalize it so that given a monoidal category of the form (C,forall (A B: C), F A -> F B)
, we would have an instance of a Traced Monoidal one for something along the lines of (C, forall (A B: C) -> F A -> itree E (F B))
.
We would hence have the current notion of KTree
with C = Type
and F = id
, i.e. the Fun
category, while the one we need for the Fin representation would be obtained with C = nat
and F = Fin.t
, i.e. the FinC
category I defined in the branch.
But I'll try to take a more rigorous look asap.
I think that I posted my response too hastily. So let me step back.
What I don't quite understand is why the denotation impacts the syntax so much. Conceptually, choosing a very generic denotation such as ktree
s shouldn't prevent me from writing transformations on the syntax. And what you propose in the issue seems like exactly the right thing to do to get around this. I think the crucial statement that I glossed over in my first comment is that it is necessary to prove that the denotation commutes with the categorical operators. I don't understand exactly how this manifests itself. Is the right thing to look at to understand the problem the incomplete proofs in LabelFacts.v?
As far as the generalization you are proposing. It seems reasonable. If I understand correctly, you would re-define ktree
as:
Definition ktree (i : Type) (F : i -> Type) (E : Type -> Type) (a b : i) : Type :=
F a -> itree E (F b).
Let me map more concretely to the code in the branch labels-fin
to highlight the precise problem.
First, I do not think that the denotation impacts the syntax. The reason I cannot write transformations on the syntax right now is that the syntax of labels is simply Type
. I cannot pattern-match on its structure, so I do not know how to introspect my term of type asm A B
in order to write an optimization.
Hence the need to change the syntax by moving to another representation of labels. This syntactic change can be witnessed in the types of bks
and asm
in Asm.v
.
In turn, that impacts the semantical domain, see denote_b
and denote_asm
.
In particular, we cannot use the loop
operator from ktree
s anymore to define denote_asm
, we need a loop_Label {A B C : nat}: (Fin.t (C + A) -> itree E (Fin.t (C + B)) -> (Fin.t A -> itree E (Fin.t B))
, or Label (C + A) (C + B) -> Label A B
with the definitions from Label.v
.
Now in AsmCombinators.v
, the definition of the combinators themself is still identical, with the new type of labels.
However, if we for instance look at app_asm_correct
, we now need to commute with denote_b
loop_Label
, and not loop
. Since I defined loop_Label
in terms of loop
, my initial hope was that the difference was superficial, and the proof could be fixed with a bit of unfolding and still use loop_bimap_ktree
and so on.
But actually it seems that I really need to redo all of KTreesFacts.v
to prove that Label
enjoys the same structure. I started doing so "by need" in LabelFacts.v
, but it soon seemed to both be necessary to redo the proof rather than be able to use its ktree
pendent, and more work than expected.
Hence indeed the new type for ktree
you just wrote down that would hopefully solve the issue for free. I'll give it a shot tomorrow.
What is the current status of this? I am planning to take a look at it today, but wanted to see if there are any new insights.
Unfortunately, I do not have a good answer to give you, I need to get back to it too. Here are the few new elements I have:
- @Lysxia seemed slightly perplexed that things would turn so complicated when we briefly discussed it. My understanding of his objections were that the current definition of
ktree
, i.e.ktree Type Fun
with the current proposal, do have a particular status in that it is the Kleisli category ofitree E
. He therefore seemed to feel bad about this generalization, and would rather expect us to be able to work on ways to derive more easily these otherktrees
from the original one, rather than extend the definition. But I do not think that he had an opportunity to look at it though, and should confirm that I am not deforming his thought. - I've been busy notably trying to turn the semantics of Vellvm into the style we used for Asm. Currently we manage to do it, but without specifying the behavior of open cfg (i.e. collection of blocks), we only do the whole linking once at the top level.
In order to do it nicely, it seems to me that we would need a third instance of these generalizedktree
s. The problem is basically that we have a concrete representation of labels that we do not want to change. We would therefore need to work with something like:
Definition FinSet (l: list block_id): Type := {bid: block_id | In bid l}.
And work overktree (list block_id) (FinSet)
.
So that kinda convinces me further that we indeed need something along these lines. However, whether a generalization or a better understanding has to how to derive easily the other ones is the way to go seem to be still up to debate.
I'd be interested in chatting about the modifications you're making to the Vellvm semantics. If you're up for a call or something like that.
Of course with pleasure. I'm working on this with Steve (@Zdancewic) and Calvin (@Chobbes), and it's very much in progress.
I won't be available before the itree meeting this afternoon, but we can definitely setup a call for later.
Happy to join a call!
But I do not think that he had an opportunity to look at it though, and should confirm that I am not deforming his thought.
That was pretty accurate :)
Rather than generalizing the current notion of category my idea was to formalize the idea of taking a subcategory of an existing one, and derive as much structure as we can from that.