calyxir/calyx

Compilation bug: overeager cell sharing

Closed this issue · 9 comments

I have found a case where the interpreter gives me the answer I expect, but Verilator and Verilog both give the same incorrect answer. The minimal example is here. I am including the respective commands, which you must run from calyx-py/test/correctness/queues. I've also included the result of running fud check.

The offending line is

If you remove it from the control and the group from the wires, all three drivers give the same correct output. I'm pretty confused because zero_out_j is such a non sequitur: the binheap component literally just puts the value 4 at cell mem[i++]. Amid this, zero_out_j just sets j to 0, and does not touch i.

Interpreter (Correct)

cat binheap.futil | fud e --from calyx --to interpreter-out \
    -s calyx.flags '-p compile-invoke' \
    -s verilog.data binheap.data
{
  "main": {
    "mem": [
      4,
      4,
      0
    ]
  }
}

Verilog

cat binheap.futil | fud e --from calyx --to jq \
       --through verilog \
       --through dat \
       -s verilog.data binheap.data \
       -s jq.expr ".memories" \
       -q
{
  "mem": [
    4,
    0,
    0
  ]
}

Icarus Verilog

cat binheap.futil | fud e --from calyx --to jq \
       --through icarus-verilog \
       --through dat \
       -s verilog.data binheap.data \
       -s jq.expr ".memories" \
       -q
{
  "mem": [
    4,
    0,
    0
  ]
}

fud check

fud check
global:
 ✔ /scratch/anshuman/calyx exists.

stages.calyx.exec:
 ✔ /scratch/anshuman/calyx/target/debug/calyx installed.

stages.interpreter.exec:
 ✔ /scratch/anshuman/calyx/target/debug/cider installed.

...

stages.verilog.exec:
 ✔ /opt/verilator/bin/verilator installed.
 ✔ Found version 5.006 (>= 5.002).

stages.icarus-verilog.exec:
 ✔ iverilog installed.
 ✔ Found version 11.0 (>= 11.0).

The note below is super outdated because I minimized the example. The upshot is that I confirmed cell-sharing between the registers i and j.

Ah I forgot to mention another piece of detective work I did. If you set find_parent_idx to always return some other hardcoded value, then 6 and 3 are put at that index instead of 0. The 6 is clobbered by the 3. So if find_parent_idx always returned 2, the output from Verilog would be [9, 0, 3].

I tested the clobbering behavior by making the main component invoke the called component only twice, with 9 and 6. When parent-finding is hardcoded to 0, the output is [6, 0, 0]. When hardcoded to 2, the output is [9, 0, 6].

Oh, interesting find! One thing I would try to do is minimizing the example using the flag-compare.sh file. You can set the fud commands to be run to perform the output diff you want.

If you're feeling more ambitious, you can try using creduce to automate the whole thing! We've not tried it on Calyx but it might be able to work because we have somewhat C-like syntax.

Cool! Also, maybe this was already painfully obvious to you haha, but this is a cell-sharing issue. I just ran the verilog/verilator commands with cell-sharing turned off and got the expected answers. Now I'll try to minimize.

It was not but thanks for testing it out! Tagging @calebmkim in case he's interested in this as well.

I'll take a look at what's going on with cell sharing

I have minimized the example. The top post now discusses and points to the new code.

When trying to minimize, I found some examples of things that make the problem go away:

  • Inline the work of memsto into main, thereby massaging away the invokes.
  • Remove the second invoke, and just duplicate memsto's control so that it reads
comp.control += [
        mem_store,
        incr_i,
        zero_out_j,
        mem_store,
        incr_i,
        zero_out_j,
    ]

Anyway, because they removed the problem, I have not presented these options in the minimized example. Also, the .py file accurately generates the .futil file.

I also checked that the issue wasn't because I was indexing with bitwidth 4, which is perhaps unusual. I now index by 8.

Most interesting of all: I decided that if i and j are just being shared, then why not leave i alone and increment j instead? So, in pseudocode:

reg i;
reg j;
mem[i] = 4;
j++;

But this caused the compiler to behave correctly, i.e., it did not share the cells anymore.

It's only when you do what I have in the futil file:

reg i;
reg j;
mem[i] = 4;
i++;
j := 0;

that the cells get shared.

Fix is here: #2071