Consensys/corset

Unused Declaration for `:array` Column

Closed this issue · 14 comments

Minimal example:

(module test)

(defcolumns
    (ACC_1 :i128)
    (BYTE :array [0:2])
)

This generates the following define.go:

package define

import (
	_ "embed"

	"github.com/consensys/zkevm-monorepo/prover/symbolic"
)

func ZkEVMDefine(build *Builder) {
	//
	// Corset-computed columns
	//
	test__BYTE_0 := build.RegisterCommit("test.BYTE_0", build.Settings.Traces.Test)
	test__BYTE_1 := build.RegisterCommit("test.BYTE_1", build.Settings.Traces.Test)
	test__BYTE_2 := build.RegisterCommit("test.BYTE_2", build.Settings.Traces.Test)
        ...
}

Notice how ACC_1 is not declared whilst BYTE_0 ... BYTE_2 are.

This seems to stem to a faulty update of the used field in the symbol table when it comes down to array columns.

BTW, array columns are currently quite messy, as in they exist both as a single array (to make the operations applied to the array, e.g. [] or len work) and as multiple unidimensional columns (to make numeric computations work). If you could find a better representation for that, that would be great.

This seems to stem to a faulty update of the used field in the symbol table when it comes down to array columns.

Ok, ta --- I will look into it shortly!

If you could find a better representation for that, that would be great.

Well, we shall see!

So, in compiler/definitions.rs we have:

fn reduce(e: &AstNode, ctx: &mut Scope, settings: &CompileSettings) -> Result<()> {
    ...
 Token::DefArrayColumn {
            ...
        } => {
            ...
            for i in domain.iter() {
                let ith_handle = handle.ith(i.try_into().unwrap());
                ctx.insert_used_symbol(

Here, replacing ctx.insert_used_symbol with ctx.insert_symbol resolves the example above. However, the problem is that array columns are then not being marked as used at all.

The problem maybe that the traversal which marks things as visited (_resolve_symbol in tables.rs) happens before the array is broken down into its individual columns? Thus, we see that the Symbol::Final for BYTE is marked as visited, but of course not e.g. BYTE_1.

So, in generator.rs we have:

pub fn reduce(e: &AstNode, ctx: &mut Scope, settings: &CompileSettings) -> Result<Option<Node>> {
    match &e.class {
        Token::Keyword(_) | Token::Domain(_) => Ok(None),
        Token::Value(x) => Ok(Some(
            // We want the value to specifically be a BigInt here, as we may
            // have negative ones, e.g. as shift arguments.
            Node::from(Expression::Const(Value::big_int(x.clone()))).with_type(
                if *x >= Zero::zero() && *x <= One::one() {
                    Type::Scalar(Magma::binary())
                } else {
                    Type::Scalar(Magma::native())
                },
            ),
        )),
        Token::Symbol(name) => Ok(Some(
            ctx.resolve_symbol(name)
                .with_context(|| make_ast_error(e))?,
        )),
        Token::IndexedSymbol { name, index } => {
            let symbol = ctx.resolve_symbol(name)?;

Here, we can see that [BYTE 0] will generate a resolve call to BYTE alone.

Some approaches:

  • One slightly odd approach might be to retype Symbo::Final(.., used) from bool to e.g. Vec<usize> or perhaps use a bitvector. Then we can update resolve_symbol to allow resolving a particular index of a symbol. For non-array symbols, then the index would just always be 0.
  • Another obvious approach is not to use the used information when generating the define.go. Instead, we write a separate visitor which computes the used information and just recalculate it all at the end. Not ideal, but at least we know it would work.

UPDATE: The first option doesn't make sense. I think all the machinery is there already.

Ok, looking again at reduce in generator.rs we have this:

Token::IndexedSymbol { name, index } => {
            let symbol = ctx.resolve_symbol(name)?;
            if let Expression::ArrayColumn {
                handle,
                domain,
                base,
            } = symbol.e()
            {
                let i = reduce(index, ctx, settings)?
                    .and_then(|n| n.pure_eval().ok())
                    .and_then(|b| b.to_usize())
                    .ok_or_else(|| anyhow!("{:?} is not a valid index", index))?;
                if domain.contains(i.try_into().unwrap()) {
                    Ok(Some(
                        Node::column()
                            .handle(handle.as_handle().ith(i))
                            .kind(Kind::Commitment)
                            .base(*base)
                            .t(symbol.t().m())
                            .build(),
                    ))

Here, it must presumably be the case that the newly created column is marked as being used. We actually want to say .used(true) here, and we'd have it ... ?

I think setting used here will never backpropagate it to the symbol table unfortunately, whic is from where it will be read. I assume a solution would be, when looking up an arraycolumnn, also to look up the associated indexed column as to set its used attribute in the ST.

For the record, examples needed to properly test and fix this:

(module test)

(defcolumns
    (ACC_1 :i128)
    (BYTE :byte :array [0:2])
)

(defconstraint test () (vanishes! (if (eq! ACC_1 1) [BYTE 0])))
(module m1)
(defcolumns
    (ACC_1 :i128)
    (BYTE :byte :array [0:2])
)
(defconstraint test () (vanishes! (if (eq! ACC_1 1) [BYTE 0])))

(module m2)
(defcolumns (A1 :i128) (B1 :byte))
(deflookup 
  tmplook
  ;; target columns
  (m1.ACC_1 [m1.BYTE 2])
  ;; source columns
  (A1 B1))
(module m1)

(defcolumns SEL)

(defperspective test
  ;; Selector
  SEL
  ;; Columns
  ((BYTE :byte :array [2])))

(defun (hi)
  [test/BYTE 1])

(defconstraint check (:perspective test)
  (vanishes! (hi)))

That looks like I've got it now.

@delehef Interested if you think I missed anything here.

Looks good. The only thing I'm not sure about is why you use resolve_symbol_with_path? This way, user may directly address array columns from any modules, I'm not sure that's the desired behavior.

The only thing I'm not sure about is why you use resolve_symbol_with_path?

Yeah, good question. I think the short answer is that I didn't figure out a better way. Using resolve_symbol directly didn't work since the name generated here:

let name = handle.as_handle().ith(i.try_into().unwrap()).to_string();

includes the module path, which resolve_symbol rejects. The problem is turning a token name into an indexed name. I wanted to avoid doing it by something like e.g. format!("{name}_{index}") as that feels brittle. If there is some obvious mechanism for doing this, then I would be keen to hear it.

This way, user may directly address array columns from any modules, I'm not sure that's the desired behavior.

What you're suggesting here in terms of scope is not something I had considered, so definitely worth thinking more about.

Using resolve_symbol directly didn't work

Perhaps there is a secondary issue going on here which flummoxed me. Its the lookups where the issue of the path is relevant, so I'm assuming it should work for them 🤔

On the other hand, we still have this:

Token::IndexedSymbol { name, index } => {
            let symbol = ctx.resolve_symbol(name)?;

So, actually, presumably this would catch any attempt to access something outside the proper scope anyway.

Yeah, so here's my test:

(module m1)
(defcolumns (BYTE :byte :array [0:2]))

(module test)
(defcolumns (ACC_1 :i128))
(defconstraint test () (vanishes! (if (eq! ACC_1 1) [m1.BYTE 0])))

Which generates this error:

Error: compiling test.lisp

Caused by:
    0: at line 13: (vanishes! (if (eq! ACC_1 1) [m1.BYTE 0]))
    1: at line 13: (if (eq! ACC_1 1) [m1.BYTE 0])
    2: explicit module m1 can not be reached from here

So, I think it looks OK. But, I'm still interested if there is a better way to construct the indexed symbol name to pass to resolve_symbol() directly.