Consensys/corset

Understanding Spillage

Closed this issue · 9 comments

I'm trying to understand how spilling comes about. Roughly I understood that spilling is necessary to ensure that certain constraints still hold despite arbitrary padding rows being added. It seems that padding values can fulfil this requirement if the values are constant (e.g. 1 or 0). So, I'm trying to construct a minimal example which illustrates this, but have not yet succeeded. A few points:

  • Spillage arises from constraints introduced by the compiler which are outside of the users control .
  • Spillage arises when a value is taken from a later row (i.e. a positive shift).

The minimal example I have come up with is:

(defcolumns A)

(defconstraint test1 () (vanishes! (shift (~ A) 2)))

Then, with an input trace:

{ "<prelude>": {"A": [1,2,3,4] } }

We can witness the spillage by computed an expanded trace with the following command:

corset compute -eeeeeN --trace spilling.trace --out spilling.out spilling.lisp

Which yields the following expanded trace:

{"columns":{
"A":{
"values":["0x00","0x00","0x00","0x01","0x02","0x03","0x04"],
"padding_strategy": {"action": "prepend", "value": "0"}
}
,"C/INV[A₊₂]":{
"values":["0x00","0x01","0x04222230874714185212124412469390773265687949667577031913967616727958704619521","0x05629641166285580282832549959187697687583932890102709218623488970611606159361","0x06333346312071277818186618704086159898531924501365547870951425091938056929281","0x00","0x00"],
"padding_strategy": {"action": "prepend", "value": "4222230874714185212124412469390773265687949667577031913967616727958704619521"}
}
}}

We can see three padding rows have been added (one to ensure trace starts with 0 and two to account for spillage).

Now, my questions:

  • What else is there to know about spillage? e.g. what other common scenarios does it arise from?

  • Why does corset not distinguish between positive and negative spillage (i.e. spillage arising from positive shifts versus negative shifts)?

The latter point seems evident from the fact that corset debug -s reports:

=== Spilling ===
 <prelude>:    2

For the above example, but also the same for this variation:

(defcolumns A)

(defconstraint test1 ()
  (vanishes! (shift (~ A) -2)))

Is this just an optimisation which could be applied?

@delehef Any comments appreciated!

What else is there to know about spillage? e.g. what other common scenarios does it arise from?

Any constraint, generated by the compiler or the user. E.g. (eq A (prev B)) would spill over, as we would need to force A[-1] == B[0] (-1 being here the first row of padding, and 0 the first row of actual traces). So all in all, many scenarios.

Why does corset not distinguish between positive and negative spillage (i.e. spillage arising from positive shifts versus negative shifts)?

Technical debt; the first time I implemented padding, the prover was in the middle of moving padding from after the trace to before the trace. Now it could be removed, just need to delete the .max(...) here.

Any constraint, generated by the compiler or the user. E.g. (eq A (prev B)) would spill over,

So, I understand those generated by the compiler. But, do we really need to bother about those coming from the user? It seems to me that the user already has to manage the issue of padding, and therefore, spillage. In your example above, if both A and B are user columns, then we're not going to expand them other than with padding values, right?

I guess perhaps the issue here is also about terminology. I understand "spill over" is caused by shifting in general. But, what I'm specifically interested in is:

  1. We have to decide how much spillage is required (e.g. which goes into the toml file)
  2. During trace expansion we have to compute spillage values in some cases.

Currently, the tool reports a spillage height of 1 e.g. for the example (eq A (prev B)). That seems safe, but conservative, as during trace expansion it only ever pads A and B with 0. So, you end up with something like this:

{ "<prelude>": {"A": [123,456], "B": [456,897] } }

being expanded into this:

{ "<prelude>": {"A": [0,0,123,456], "B": [0,0,456,897] } }

Which seems unnecessary (in this particular case).

Technical debt; the first time I implemented padding, the prover was in the middle of moving padding from after the trace to before the trace.

Great, that makes total sense. I figured it was something like that, but wanted to check I'm not missing something :)

It seems to me that the user already has to manage the issue of padding, and therefore, spillage. In your example above, if both A and B are user columns, then we're not going to expand them other than with padding values, right?

Yes, you're right actually. We expect the user to give columns that actually work.

Reopening this because I believe the fix I've made is wrong. We should be removing past_spilling() not future_spilling().

I need a test case for this.

Here's my test case:

(defcolumns X Y)
;; Enforce (X[k] = 0) ∨ (X[k+1] ≠ Y[k])
(defconstraint test () (* X (- 1 (~ (- (shift X 1) Y))))

This reports 0 spilling required for the module. Here's an input trace:

{ "<prelude>": {"X": [2,4,1], "Y": [1,2,3]}}

However, this does not fail --- I think because it needs at least two rows of padding, and is only getting one. Therefore, it is correctly determining a single padding value to use.

There is actually no easy way to test this functionality. Therefore, I'm going to revert for now. Once I have the ability to properly test, then I can reconsider.