NVlabs/timeloop

How to change the number of MACs in DianNao (chen-asplos2014.yaml)?

FishSeeker opened this issue · 2 comments

I'm trying to run DianNao (chen-asplos2014.yaml) on Timeloop. I'd like to use 1024 MAC units rather than the default 256, so I changed the MAC instance count to 1024. The mapper then fails to find any valid mapping, as shown in the output below:
`WARNING: found neither a problem shape description nor a string corresponding to a to a pre-existing shape description. Assuming shape: cnn-layer.
MESSAGE: attempting to read problem shape from file: /home/fish/micro/timeloop/problem-shapes/cnn-layer.yaml
Problem configuration complete.
Architecture configuration complete.
Using all available hardware threads = 8
Mapper configuration complete.
Initializing Index Factorization subspace.
Factorization options along problem dimension R = 3
Factorization options along problem dimension S = 3
Factorization options along problem dimension P = 45
Factorization options along problem dimension Q = 45
Factorization options along problem dimension C = 3
Factorization options along problem dimension K = 3
Factorization options along problem dimension N = 1
Mapspace Dimension [IndexFactorization] Size: 164025
Mapspace Dimension [LoopPermutation] Size: 1
Mapspace Dimension [Spatial] Size: 512
Mapspace Dimension [DatatypeBypass] Size: 1
Mapspace split! Per-split Mapping Dimension [IndexFactorization] Size: 20504 Residue: 7
Mapspace construction complete.
Search configuration complete.
Sparse optimization configuration complete.

MESSAGE: no valid mappings found within search criteria. Some suggestions:
(1) Observe each mapper thread's termination message. If it terminated due to
consecutive failed mappings, it will tell you the number of mappings that
failed because of a spatial fanout violation and the number that failed
because of a buffer capacity violation.
(2) Check your architecture configuration (especially mapspace constraints).
Try to find the offending constraints that are likely to have caused the
above violations, and disable those constraints.
(3) Try other search algorithms, and relax the termination criteria:
victory-condition, timeout and/or search-size.
(4) Enable mapper's diagnostics (mapper.diagnostics = True) to track and emit
more information about failed mappings.
`
How can I run DianNao with 1024 MAC units?

What workload are you running? Is it the default workload embedded in that YAML file, or something else?

Assuming it is the embedded workload spec, how did you change the architecture? Did you simply change the number of MACs to 1024? That would create an oddly shaped machine unless you also changed the instance counts at the other storage levels: you would get a 1:8 fanout from the WeightBuffer to the MAC units, which is no longer a DianNao-style architecture. You need to change the instance counts of the other storage levels as well.

You also need to change the mapspace constraints. There are hard C=16 and K=16 spatial constraints, so if you increase the instance counts (and consequently the available parallelism), you have to scale these constraints to match the updated parallelism; otherwise you will get severe underutilization.
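
As a rough illustration of the underutilization point, the Python sketch below computes an upper bound on how many MAC units a mapping can occupy when the spatial factors are pinned by hard constraints. The helper name `spatial_utilization` is hypothetical and not part of Timeloop, and the model assumes the spatial parallelism is simply the product of the constrained C and K factors; it ignores the workload dimensions and buffer capacities that the real mapper also checks.

```python
from typing import Mapping


def spatial_utilization(mac_instances: int, spatial_factors: Mapping[str, int]) -> float:
    """Upper bound on the fraction of MAC units used under fixed spatial factors.

    Hypothetical helper for illustration only; it is not a Timeloop API.
    """
    used = 1
    for factor in spatial_factors.values():
        used *= factor
    if used > mac_instances:
        # More spatial parallelism requested than there are MAC units.
        raise ValueError(
            f"spatial factors need {used} MACs but only {mac_instances} exist"
        )
    return used / mac_instances


# Default DianNao config from this thread: 256 MACs, C=16 and K=16.
default_util = spatial_utilization(256, {"C": 16, "K": 16})

# MAC count raised to 1024 while the constraints stay at C=16, K=16.
unscaled_util = spatial_utilization(1024, {"C": 16, "K": 16})

# Constraints scaled so that C * K equals the new MAC count.
scaled_util = spatial_utilization(1024, {"C": 32, "K": 32})

print(default_util, unscaled_util, scaled_util)
```

Under these assumptions, leaving the constraints at C=16 and K=16 caps a 1024-MAC machine at a quarter of its MAC units, while C=32 and K=32 is one way to make the product match 1024.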

@angshuman-parashar I changed the instances of PsumRegFile to 32, InputBuffer to 1024, and WeightBuffer to 1024. I also changed the mapspace constraints to K=32 and C=32. It seems to work now. Thank you so much for your help.