StanfordAHA/garnet

PD: Power domains break when PE features added

== PROBLEM ==
When new features are added to the PE tile, the existing features can be renumbered automatically, so that e.g. the power domain configuration register might now be loaded via feature address 18 instead of 13.

Why this matters: hopefully the mapper knows how to track these changes for the compiler. Unfortunately, the physical design flow does not. As a result, the power domain GLS tests fail when they try to turn off power domains via the old feature address 13 (0x000D0000). See e.g. tb_Tile_PE.v in step pwr-aware-gls, where the Verilog says this:

      #1 $display("==== TEST2: DISABLE TILE  ======");
      #1 $display("------------PS REGISTER DISABLE:--------------");
      #1 config_config_addr = 32'h000D0000;
      #1 config_config_data = 32'h00000001;
      #1 assert (dut.PowerDomainConfigReg_inst0.ps_en_out == 1'b1) ...
         else $error("ASSERTION 2 FAIL: Tile didn't get disabled");

...such that if the address is wrong, we get

ASSERTION #2
xmsim: *E,ASRTST (./tb_Tile_PE.v,213): (time 116 NS)
Assertion tb_Tile_PE.__assert_2 has failed
ASSERTION 2 FAIL: Tile didn't get disabled

== WORSE PROBLEM ==
power-domains/outputs/upf_Tile_PE.tcl uses the feature address to tell the design flow which circuits to keep in the always-on domain:

create_power_domain AON -elements {
    PowerDomainOR DECODE_FEATURE_13 coreir_eq_16_inst0 and_inst1
    FEATURE_AND_13 PowerDomainConfigReg_inst0 const_511_9 const_0_8
    }

If the circuitry to control the power domains is not in the always-on domain, we cannot reliably turn the tile on or off.

E.g. in the example below, the feature address changed from 13 to 18 but the UPF did not get updated. The design flow then added a buffer on address bit 23 leading to the feature-18 decoder. (The feature is encoded in address bits [23:16], so feature 13 is encoded as 0x000D0000 and feature 18 as 0x00120000.) The buffer is not in the always-on domain, so it produces an unknown (x) signal and the test subsequently fails:

addr_23 = 0
FE_OFN47_config_config_addr_23 = x
DECODE_FEATURE_18_O = x
FEATURE_AND_18 in1 = 1
FEATURE_AND_18_out = x

ASSERTION #2
ncsim: *E,ASRTST (./tb_Tile_PE.v,226): (time 122 NS) Assertion tb_Tile_PE.__assert_2 has failed
ASSERTION 2 FAIL: Tile didn't get disabled
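
For reference, the address encoding described above (feature number in address bits [23:16]) can be sketched in a few lines of Python. The function names are made up for illustration and are not part of garnet. The sketch also assumes all other address bits are zero, as in the two addresses quoted above.

```python
FEATURE_SHIFT = 16      # feature number sits in address bits [23:16]
FEATURE_MASK = 0xFF     # 8-bit feature field


def feature_to_config_addr(feature: int) -> int:
    """Return the 32-bit config address that selects the given feature."""
    if not 0 <= feature <= FEATURE_MASK:
        raise ValueError(f"feature {feature} does not fit in 8 bits")
    return feature << FEATURE_SHIFT


def config_addr_to_feature(addr: int) -> int:
    """Extract the feature number from a 32-bit config address."""
    return (addr >> FEATURE_SHIFT) & FEATURE_MASK


def verilog_hex(addr: int) -> str:
    """Format an address as a 32-bit Verilog hex literal."""
    return f"32'h{addr:08X}"


print(verilog_hex(feature_to_config_addr(13)))  # 32'h000D0000
print(verilog_hex(feature_to_config_addr(18)))  # 32'h00120000
```

So a testbench that still writes 32'h000D0000 after the renumbering is addressing feature 13, not the power domain register at feature 18.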

== PROPOSED SOLUTION ==

Add a parameter to common/power-domains/outputs/pe-pd-params.tcl, something like

# Used by upf_Tile_PE
set pe_power_domain_config_reg_addr 18
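# Use [list ...] rather than braces: Tcl does not substitute variables inside {}
set aon_elements [list \
  PowerDomainOR \
  DECODE_FEATURE_${pe_power_domain_config_reg_addr} \
  coreir_eq_16_inst0 and_inst1 \
  FEATURE_AND_${pe_power_domain_config_reg_addr} \
  PowerDomainConfigReg_inst0 \
  const_511_9 \
  const_0_8 \
]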

Then in common/power-domains/outputs/upf_Tile_PE.tcl:

# See 'pe-pd-params.tcl' for aon_elements
create_power_domain AON -elements $aon_elements

...and then I guess some horrible hacky fixup script for the GLS testbench...stay tuned...and/or feel free to suggest something please.
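
One possible shape for such a fixup script, as a rough Python sketch only: it rewrites every `config_config_addr = 32'h...` assignment in the testbench whose feature field (bits [23:16]) equals the old feature number. The function name `retarget_pd_addr` is hypothetical, and the sketch assumes that every testbench write to the old feature is really meant for the power domain register. That assumption is not checked here.

```python
import re

# Matches assignments such as: config_config_addr = 32'h000D0000;
ADDR_ASSIGN = re.compile(r"(config_config_addr\s*=\s*32'h)([0-9A-Fa-f]{8})")


def retarget_pd_addr(tb_text: str, old_feature: int, new_feature: int) -> str:
    """Rewrite testbench writes to old_feature so they target new_feature."""
    def fix(match):
        addr = int(match.group(2), 16)
        if (addr >> 16) & 0xFF != old_feature:
            return match.group(0)  # some other feature: leave untouched
        # Swap the feature field (bits [23:16]) and keep all other bits
        new_addr = (addr & ~(0xFF << 16)) | (new_feature << 16)
        return f"{match.group(1)}{new_addr:08X}"
    return ADDR_ASSIGN.sub(fix, tb_text)


line = "      #1 config_config_addr = 32'h000D0000;"
print(retarget_pd_addr(line, old_feature=13, new_feature=18))
```

This is blunt on purpose. If the testbench also writes to whatever feature ends up at the old number after renumbering, those writes would be retargeted too, so reading the new feature number from the same place as pe_power_domain_config_reg_addr and generating the testbench from it would probably be more robust.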