Can and should KTrees be generalized?

Question

Can and should KTrees be generalized?

Closed this issue 5 years ago · 11 comments

The asm language as defined for the icfp submission represents labels as binary trees encoded in Type. The short story is:

Collections of blocks have type fun (A B: Type) => A -> block B
Their denotation therefore fun (A B: Type) => bks A B -> A -> itree E B, i.e. fun (A B: Type) => bks A B -> ktree E A B.
ktrees are shown to enjoy a traced monoidal structure.
asm enriches bks by allowing to hide part of the labels
Correctness of the linking theory crucially consists in showing that asm enjoys the structure of a traced monoidal category by building combinators on its syntax whose denotation commutes with the corresponding ktrees operators.

Now one issue of all of this is that representing labels as Type pretty much prevent us from writing any optimization (unless someone has a trick that I do not see). Indeed, one cannot introspect its structure.
Following discussions with @Lysxia and @Zdancewic, I therefore tried to switch to a representation of labels as Fin types, whose temporarily aborted attempt can be witnessed in the labels-fin branch.
However, things turned out heavier that I anticipated. I define the two following new categories:

FinC whose objects are nat and Hom(A,B) Fin.t A -> Fin.t B with tensor product Nat.add over objects.
Label (retrospectively poorly named) whose objects are nat and Hom(A,B) ktree E (Fin.t A) (Fin.t B).
Correspondingly, collections of blocks now have type fun (A B: nat) => Fin.t A -> block (Fin.t B), and their denotation fun (A B: nat) => Label E A B.
Having done that, I initially thought that I could simply adapt the proofs of correctness of the combinators of asm, Label being simply some specific ktrees.

However this turns out wrong. ktrees concretely relies on the Fun monoidal category of objects Type and arrows A -> B, with tensor on objects sum, to get its traced monoidal structure. Label on their side, although similar, are a different one, with a slightly different loop operator, and whose structure relies on FinC.
Concretely, that means that to move to Fin for labels, the whole theory of ktrees needs to be adapted and duplicated over Labels, and the process is different enough to not be straightforward.

We hence should consider whether both can be unified, or if the library should provide both.

Answer 1 · 2019-03-19T01:29:19.000Z

I don't see a benefit to including it inside the library. I'm also not sure if I completely understand the difficulty though. The type of control flow graphs is naturally a deep embedding that is given semantics through ktree. Is there something that we can not do in that?

Answer 2 · 2019-03-19T02:08:27.000Z

My thoughts on this are still a bit fuzzy, sorry if I'm unclear.

One way to put the problem bluntly: with labels represented as Type, I know how to prove the compiler correct without writing a single cofix; with labels represented as Fin types, I do not, and actually need to rebuild the equivalent of the KTreeFacts.v file, that includes explicit coinductive proofs.
It seems quite odd that the library would favor a way of representing labels, and it's actually an issue since the representation does not seem viable in the long run.

Looking at it more closely, it seems that KTrees might be unnecessary tied to the Fun category. I hence wonder whether it would make sense to generalize it so that given a monoidal category of the form (C,forall (A B: C), F A -> F B), we would have an instance of a Traced Monoidal one for something along the lines of (C, forall (A B: C) -> F A -> itree E (F B)).
We would hence have the current notion of KTree with C = Type and F = id, i.e. the Fun category, while the one we need for the Fin representation would be obtained with C = nat and F = Fin.t, i.e. the FinC category I defined in the branch.

But I'll try to take a more rigorous look asap.

Answer 3 · 2019-03-19T02:30:28.000Z

I think that I posted my response too hastily. So let me step back.

What I don't quite understand is why the denotation impacts the syntax so much. Conceptually, choosing a very generic denotation such as ktrees shouldn't prevent me from writing transformations on the syntax. And what you propose in the issue seems like exactly the right thing to do to get around this. I think the crucial statement that I glossed over in my first comment is that it is necessary to prove that the denotation commutes with the categorical operators. I don't understand exactly how this manifests itself. Is the right thing to look at to understand the problem the incomplete proofs in LabelFacts.v?

As far as the generalization you are proposing. It seems reasonable. If I understand correctly, you would re-define ktree as:

Definition ktree (i : Type) (F : i -> Type) (E : Type -> Type) (a b : i) : Type :=
  F a -> itree E (F b).

Answer 4 · 2019-03-19T13:25:13.000Z

Let me map more concretely to the code in the branch labels-fin to highlight the precise problem.

First, I do not think that the denotation impacts the syntax. The reason I cannot write transformations on the syntax right now is that the syntax of labels is simply Type. I cannot pattern-match on its structure, so I do not know how to introspect my term of type asm A B in order to write an optimization.

Hence the need to change the syntax by moving to another representation of labels. This syntactic change can be witnessed in the types of bks and asm in Asm.v.
In turn, that impacts the semantical domain, see denote_b and denote_asm.
In particular, we cannot use the loop operator from ktrees anymore to define denote_asm, we need a loop_Label {A B C : nat}: (Fin.t (C + A) -> itree E (Fin.t (C + B)) -> (Fin.t A -> itree E (Fin.t B)), or Label (C + A) (C + B) -> Label A B with the definitions from Label.v.

Now in AsmCombinators.v, the definition of the combinators themself is still identical, with the new type of labels.
However, if we for instance look at app_asm_correct, we now need to commute with denote_b loop_Label, and not loop. Since I defined loop_Label in terms of loop, my initial hope was that the difference was superficial, and the proof could be fixed with a bit of unfolding and still use loop_bimap_ktree and so on.
But actually it seems that I really need to redo all of KTreesFacts.v to prove that Label enjoys the same structure. I started doing so "by need" in LabelFacts.v, but it soon seemed to both be necessary to redo the proof rather than be able to use its ktree pendent, and more work than expected.

Hence indeed the new type for ktree you just wrote down that would hopefully solve the issue for free. I'll give it a shot tomorrow.

Answer 5 · 2019-03-27T13:53:48.000Z

What is the current status of this? I am planning to take a look at it today, but wanted to see if there are any new insights.

Answer 6 · 2019-03-27T14:17:22.000Z

Unfortunately, I do not have a good answer to give you, I need to get back to it too. Here are the few new elements I have:

@Lysxia seemed slightly perplexed that things would turn so complicated when we briefly discussed it. My understanding of his objections were that the current definition of ktree, i.e. ktree Type Fun with the current proposal, do have a particular status in that it is the Kleisli category of itree E. He therefore seemed to feel bad about this generalization, and would rather expect us to be able to work on ways to derive more easily these other ktrees from the original one, rather than extend the definition. But I do not think that he had an opportunity to look at it though, and should confirm that I am not deforming his thought.
I've been busy notably trying to turn the semantics of Vellvm into the style we used for Asm. Currently we manage to do it, but without specifying the behavior of open cfg (i.e. collection of blocks), we only do the whole linking once at the top level.
In order to do it nicely, it seems to me that we would need a third instance of these generalized ktrees. The problem is basically that we have a concrete representation of labels that we do not want to change. We would therefore need to work with something like:
Definition FinSet (l: list block_id): Type := {bid: block_id | In bid l}.
And work over ktree (list block_id) (FinSet).

So that kinda convinces me further that we indeed need something along these lines. However, whether a generalization or a better understanding has to how to derive easily the other ones is the way to go seem to be still up to debate.

Answer 7 · 2019-03-27T14:30:23.000Z

I'd be interested in chatting about the modifications you're making to the Vellvm semantics. If you're up for a call or something like that.

Answer 8 · 2019-03-27T14:50:03.000Z

Of course with pleasure. I'm working on this with Steve (@Zdancewic) and Calvin (@Chobbes), and it's very much in progress.
I won't be available before the itree meeting this afternoon, but we can definitely setup a call for later.

Answer 9 · 2019-03-27T14:57:00.000Z

Happy to join a call!

Answer 10 · 2019-03-27T15:39:56.000Z

But I do not think that he had an opportunity to look at it though, and should confirm that I am not deforming his thought.

That was pretty accurate :)

Rather than generalizing the current notion of category my idea was to formalize the idea of taking a subcategory of an existing one, and derive as much structure as we can from that.

Answer 11 · 2020-03-30T21:31:54.000Z

This is closed by #127 and the subsequent development of the CategorySub modules.