ZigZag-Project/zigzag-v1

Memory cannot hold all the data when doing architecture search

BrianQian1999 opened this issue · 6 comments

Hi,

I'm trying to run an architecture search for some NN layers with the default memory_pool_exploration.yaml, but I get messages like "Memory Scheme 1 cannot hold all the data in NN Layer 2".

My understanding is that the candidates in the memory pool are not large enough to process the layers, so I'm confused about how you conducted case study 2 (memory hierarchy search) in your paper, since the layers in DarkNet19 would always lead to "Memory Scheme * cannot hold all the data". Maybe you were using a custom memory pool?

BR/Brian

Also, the sizes reported in stdout look odd, since the required size is smaller than the available size:
Required memory size: 13824 <-> Available memory size: 16777216

asyms commented

Hi Brian,

Yes, that is correct: the paper used a custom memory pool for that case study.

What other settings were you running zigzag with? If you could upload the yaml files here, I will investigate the issue further.

Kind regards,
Arne

Hi Arne,

The inputs I'm using are attached here.
I'm running python3 top_module.py --arch inputs/architecture.yaml --set inputs/settings.yaml --mempool inputs/memory_pool_exploration.yaml --map inputs/mapping.yaml

Best,
Brian

Hi Brian,

There was a printing issue in the memory hierarchy checking function (thank you for helping us catch the bug :) ). Now the issue is fixed. So, if you rerun the code, the new printed info for your case should be:
"Memory Scheme XX cannot hold all the data in NN Layer 1. | Required memory size: 28155584 <-> Available memory size: 16777216 (unit: bit)".

Note that the dimensions of DarkNet19 Layer 1 are: {'B': 1, 'K': 32, 'C': 3, 'OX': 224, 'OY': 224, 'FX': 3, 'FY': 3, ...}, and you have set the data precision to 16 bits for all operands, in which case

the Input activation size is: 3 * 226 * 226 * 16 = 2,451,648 bits (226 = 224 + 3 - 1, the input width and height that a 224 x 224 output with a 3 x 3 filter implies at stride 1)
the Output activation size is: 32 * 224 * 224 * 16 = 25,690,112 bits
and the Weight size is: 32 * 3 * 3 * 3 * 16 = 13,824 bits
In total, the data size is: 2,451,648 + 25,690,112 + 13,824 = 28,155,584 bits

So, the way to solve your issue is to set the top memory in your system to be more than 28,155,584 bits.
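
The arithmetic above can be reproduced with a short standalone Python sketch. This is not ZigZag's own checking code: the function name layer_data_size_bits and its parameters are hypothetical, and it assumes stride 1 and an input size of OX + FX - 1 by OY + FY - 1, which is what the 226 in the numbers above corresponds to.

```python
# Standalone sketch (not part of ZigZag): estimate the total operand size of
# one convolution layer in bits and compare it with the top memory capacity.

def layer_data_size_bits(layer, precision_bits=16, stride=1):
    """Return per-operand and total data sizes in bits (hypothetical helper)."""
    # Input extent needed to produce OX x OY outputs with an FX x FY filter.
    ix = (layer["OX"] - 1) * stride + layer["FX"]
    iy = (layer["OY"] - 1) * stride + layer["FY"]
    inputs = layer["B"] * layer["C"] * ix * iy * precision_bits
    outputs = layer["B"] * layer["K"] * layer["OX"] * layer["OY"] * precision_bits
    weights = layer["K"] * layer["C"] * layer["FX"] * layer["FY"] * precision_bits
    return {"I": inputs, "O": outputs, "W": weights,
            "total": inputs + outputs + weights}

# DarkNet19 Layer 1 dimensions as quoted in this thread.
darknet19_layer1 = {"B": 1, "K": 32, "C": 3, "OX": 224, "OY": 224, "FX": 3, "FY": 3}

# Available top-memory size reported in the stdout message, in bits.
available_bits = 16777216

sizes = layer_data_size_bits(darknet19_layer1)
fits = sizes["total"] <= available_bits

print(sizes)
print("fits in top memory:", fits)
```

With these inputs the total is 28,155,584 bits, which exceeds the 16,777,216 bits available, so the layer does not fit. The 13,824 in the earlier stdout message equals the weight size alone.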

Best regards,
Linyan

I see, thanks for answering.

Best,
Brian