Implement the grammars and judgments in Part IV.A
Closed this issue · 61 comments
- Part IV.A: #636
- Extensions of Part IV.A
- different kinds
- more optimized coe, discrete tags, etc
- Part IV.B: ??
Should the labels/constructors be capitalized, like CustomOp, or do we prefer some other marking? @ecavallo @jonsterling A bigger issue is: how should we declare a new inductive type? I suppose we want to extend signatures? Any architectural comments?
@ecavallo I wonder if you already have some ideas about how these should be implemented? Jon and I can probably hack something up, but your input will be very useful.
Last time I was paying attention Jon was planning to rewrite the signature stuff. Is that still true or has it happened? I think I need to catch up on the current state of redprl before I can say much that's useful.
I did rewrite the signature stuff, but that was a long time ago
Here are some scattered thoughts about the implementation:
- An inductive type declaration should probably be its own kind of signature declaration. The annotations on constructor terms can then just contain the name of the inductive, instantiations of its parameters, and the constructor label; when you need the constructor data you can look it up in the signature.
- The fact that a declaration is well-formed can require reasoning, so maybe a declaration should be paired with a proof of that? Or maybe that's a separate theorem. There's a question here of how much you want to expose the well-formed constructor list judgment to the user. (Probably less is better than more.)
- There should be two kinds of constructors: ones with and ones without a specified boundary. (If a constructor has a boundary, the set of constraints needs to be valid.) In the paper only 0-constructors are allowed to have no boundary, but there's no real reason for this. (Incidentally, make sure you have v2 of the preprint, I forgot to account for 0-constructors in v1.) For constructors with a boundary, coe does endpoint correction, for ones without it doesn't.
- I'm not sure how you want to handle the transition from formal schematic terms to ordinary terms... I imagine it's expensive to redo the interpretation whenever you step to a boundary and so on. I guess they could be stored in interpreted form in the signature.
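The first and third points might be sketched concretely as follows. This is illustrative Python with hypothetical names, not RedPRL's actual types: a declaration record holding the constructor list, with the boundary stored as an optional field so that coe can decide whether endpoint correction is needed.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Constructor:
    label: str                       # e.g. "loop"
    arg_types: list                  # the constructor's argument types
    boundary: Optional[list] = None  # constraint set; None = no boundary
                                     # (if present, the set must be valid)

@dataclass
class InductiveDecl:
    name: str           # e.g. "S1"
    params: list        # parameter telescope of the inductive type
    constructors: list  # the (ordered) constructor list

def needs_endpoint_correction(c: Constructor) -> bool:
    # coe does endpoint correction only for constructors with a boundary
    return c.boundary is not None

# A toy S1 declaration: base has no boundary, loop does.
s1 = InductiveDecl("S1", [], [
    Constructor("base", []),
    Constructor("loop", ["dim"], boundary=["x=0 -> base", "x=1 -> base"]),
])
```

Constructor terms would then only carry the inductive's name, parameter instantiations, and the label, with everything else looked up from the declaration.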
Some of these things are different from the paper, but not in ways that have a significant impact on proofs. I don't have any opinions about the surface syntax. I'm trying to focus on getting something out on untyped computation right now, so I don't want to get too involved at the moment, but I'll try to answer any questions of course.
Perhaps it makes sense to name constructors as `TheInductiveType.constr` or something like that. Then we know where to find `constr` in the signature. For example, there would be `S1.base` and `S1.loop`.
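For what it's worth, a tiny sketch of how such dotted names could resolve against the signature (hypothetical Python, purely to illustrate the lookup):

```python
def resolve(signature, qualified_name):
    # Split "S1.base" into the inductive type's name and the
    # constructor label, then look both up in the signature.
    ind, label = qualified_name.split(".", 1)
    return signature[ind]["constructors"][label]

# A toy signature with entries for S1's constructors.
sig = {"S1": {"constructors": {"base": {"boundary": None},
                               "loop": {"boundary": "base"}}}}
```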
The next difficulty is how to expose, for example, both `S1` (or `S1.loop`) and the well-formedness of `S1` (or `S1.loop`). I doubt we really need to expose the well-formedness. Unless there is a need, I vote for exposing no well-formedness at all, and for having the tactics/rules simply assume that all defined types and constructors are well-formed. The proofs are provided when defining the type and then forgotten. Am I being too naive?
That makes sense to me, but I might also be naive.
The things in the signature should be fully elaborated. I do not think it will be fruitful to keep the schema, unless we want general pattern matching or something...?
Actually, I just realized that you need to keep the formal description around for the eliminator. (To state the typing rule you need some things that are inductively defined on the form of the constructor data, and you also need that argument types are functorial for the operational semantics.) I guess you could store both the formal description and (for efficiency) its instantiation at the inductive type.
Hmm, I thought it was possible to just keep the compiled `S1.elim` (including its type), in the sense that the schema can be dropped after compilation?
Yeah, I think that would work.
Wait---I am indeed naive. The machine (the operational semantics) still needs those data, at least partially.
I guess it is okay for the first cut to just keep everything. In some sense it is also more accurate with respect to the language on paper.
The only place the data is used in the operational semantics in non-interpreted form is as an argument to the functorial action operator `func`, but that operator could be turned into a metafunction like the interpretation functions, and then you would just need to store the compiled version. (There is not really a good reason why `func` is treated differently in the paper.)
I guess the trouble is that the machine somehow has to look `S1.elim` up in the signature, and somehow know, for example, how `(S1.elim (S1.loop x))` reduces. This seems a little bit mysterious to me within the current architecture. @jonsterling
@favonia The operational semantics can already query the signature, so I think there shouldn't be a problem.
But maybe I misunderstood what the hard part was.
(Note that under the new architecture, we can really store anything in the signature, not just ABTs.)
The machine never consults the signature for something complicated (involving bindings). Anyways, using `S1` as an example, we need an entry `S1` in the signature pointing to all of the following information:
- An entry denoting the type (or sort) of `S1` itself.
- A list of constructors containing the following information:
  - An entry `base` saying the type (or sort) of `S1.base`, that there is no boundary condition for `base`, and how `S1.elim (S1.base ...)` computes.
  - An entry `loop` saying the type (or sort) of `S1.loop`, that the boundaries of `S1.loop` are `S1.base` (??), and how `S1.elim (S1.loop ...)` computes (??).
- An entry for the eliminator denoting the type (or sort) of `S1.elim`.
Ideally, the schemas for the computation rules are rich enough to handle recursive arguments, but not richer. Did I miss something?
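Purely as an illustration of the entries described above (hypothetical names, not the real architecture), the machine could consult a table like this to reduce `S1.elim` applied to a constructor:

```python
# Hypothetical signature entry for S1: per constructor, an optional
# boundary plus a clause saying how S1.elim computes on it.
signature = {
    "S1": {
        "constructors": {
            "base": {"boundary": None,
                     "elim": lambda cases, args: cases["base"]},
            "loop": {"boundary": "base at x=0 and x=1",
                     "elim": lambda cases, args: cases["loop"](*args)},
        },
    },
}

def step_elim(ind, label, args, cases):
    # How the machine might reduce (S1.elim (S1.loop x)): look the
    # constructor up in the signature and fire its elim clause.
    clause = signature[ind]["constructors"][label]["elim"]
    return clause(cases, args)
```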
Sounds right to me. I don't understand your comment about richness though.
I don't understand your comment about richness though.
I want to avoid keeping the complete schema.
@ecavallo Is there a reason to ban dependency of argument types on previous arguments, in the presence of Sigma types for uncurrying?
Can you give an example? I'm not sure I see what you're thinking of.
No, this is allowed in the paper.
@ecavallo Does the first rule (HYP) in Figure 1 imply that `p` is a valid boundary term? Note that it's not in the grammar.
@ecavallo I am going to assume you forgot to list boundary term variables as valid boundary terms. Please tell me if I am wrong.
By the way, I think it is okay to bind every label to a particular inductive type. (This will be more restricted than the proof theory on paper.)
@ecavallo After some thinking, I renamed "argument types" to "recursive types" and "boundary terms" to "recursive terms".
I agree about labels.
Recursive is fine with me. In the LICS paper I called them "argument types" and "argument terms".
What's the linguistic intuition behind "argument"? I can easily be convinced.
@ecavallo By the way, I believe the dependency graph between the judgments is not strongly connected. `K ≡ K'` does not seem to be used by the others.
@ecavallo (By the way)^2, you might want to review the usage of `Theta` across the paper---it is not completely consistent, and sometimes the subscript "j" is missing.
What's the linguistic intuition behind "argument"? I can easily be convinced.
The only reason I have to prefer "argument" over "recursive" is that "recursive type" already means something else.
@ecavallo By the way, I believe the dependency graph between the judgments are not strongly connected. K ≡ K' does not seem to be used by others.
It's presupposed in the term equality judgment, unless you mean the binary form specifically?
I meant the judgments in Definition 3.2. I regret spamming this issue; let's take this discussion offline.
@ecavallo Could you remind me again the problem of point constructors having boundaries?
The constraint set for a constructor with boundaries has to be valid, and the empty set isn't valid, so you need to allow something else for point constructors. So you should have two options for a constructor: (a) valid constraint set, (b) empty constraint set. These are treated differently in the operational semantics for coe: for (a) you need the boundary-correcting fcom, for (b) you don't.
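The (a)/(b) split could show up in the implementation as a simple dispatch inside coe. A minimal sketch, with hypothetical helper arguments standing in for the real machinery:

```python
def coe_intro(boundary, coerce_args, correct_endpoints):
    # (a) A constructor with a valid (hence non-empty) constraint set:
    #     coe must fix up the boundary, e.g. with an fcom.
    if boundary:
        return correct_endpoints(coerce_args())
    # (b) No boundary (empty constraint set): just coerce the arguments.
    return coerce_args()
```

So a point constructor like `base` would take branch (b), while `loop` takes branch (a).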
@ecavallo I see, but shouldn't it be (1) valid constraints or (2) boundaries (if any) with discrete parameters?
What do you mean by parameters here?
So the conditions on the constructor would be that
(a) The boundaries of the constructor are all intro terms,
(b) Each such intro term takes only non-recursive arguments, and those arguments are of discrete types
?
(b') ... or no constructor has boundaries and every argument is of some discrete type?
BTW, I believe allowing `(glue [x : dim] (@ c x) [...])` would be a disaster, and it is banned in Part IV. However, do you have a concrete example that blows up the entire type theory?
I feel like these distinctions are maybe too fine to be making in a first implementation.
Re BTW, I disallowed this because it seemed to make everything a lot more complicated and we don't have any examples where we want it. I don't know if it would break anything.
@ecavallo Alright. Next question: would it be kind of weird to allow a spec term `fcom` to mention some dimensions NOT introduced by `intro`?
(I am also thinking that if an `fcom` can mention something not introduced by `intro`, maybe the equations in `intro` should be allowed to mention something outside too? Maybe I missed something important here?)
It's important for the proof that coe is well-typed that the constraint set for a constructor only mentions dimension variables which are parameters to the constructor (not any from outside).
I'm going to allow `glue [x:dim] (@ c x)` because checking apartness looks expensive. We can add back the check if this allows us to prove false.
Can you not just arrange the syntax so that `x` is not in scope there?
Yes, but I thought we wanted to write `(merid a i x)`, not `(merid a x i)`?
It seems like a big change to the theory just to get the argument order we want...
Alright. The parser will ban the dimensions for the moment.
If the types that appear in the spec are only pretypes, then the inductive type is an hcom-type. If they are all coe-types, then the inductive type is Kan.
The inductive type adds free fcoms, so we don't need them in the input type. Let me clarify a bit.
If, for every constructor,
(a) all the types in `Gamma` have kind `k`, and
(b) all the spec types in `Theta` have kind `k`,
then the inductive type has kind `k ⊔ hcom`. (I don't know if you have kinds for spec types now, but they can work in the same way as they do for real types.)
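As a sanity check of this rule, here is a toy model (an assumption for illustration, not RedPRL's actual kind lattice) where a kind is the set of Kan operations a type supports:

```python
# Toy kinds: pre supports nothing; kan = coe + hcom.
PRE  = frozenset()
COE  = frozenset({"coe"})
HCOM = frozenset({"hcom"})
KAN  = frozenset({"coe", "hcom"})

def inductive_kind(component_kinds):
    # k = the kind shared by all the types in Gamma and all the spec
    # types in Theta; the inductive type then has kind k ⊔ hcom, since
    # fcoms are added freely.
    k = KAN
    for ck in component_kinds:
        k &= ck
    return k | HCOM
```

In particular, all-pretype components give an hcom-type and all-coe components give a Kan type, matching the two special cases mentioned above.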
@ecavallo Let `k` be `COE` for the moment. It is easy to upgrade (or complicate) the checker. @jonsterling I got another idea: marking some constructors as "discrete"---meaning that, when they are used as boundaries, coe does not have to do any fixing. We can then have

```
data Susp (#l : lvl) [a : (U #l kan)] : (U #l kan)
{ discrete north
, discrete south
, merid a [x] [x=0 north] [x=1 south]
}
```

...
I wonder if you have thought about this, or if you have other ideas to mark discreteness.
@favonia Sounds interesting! I haven't thought about it, but I don't have any opposition to your idea.