neurosim/DNN_NeuroSim_V1.4

Calling inference.py with --subArray 256 causes a segmentation fault in TilePerformanceCalculation


As the title says, the crash occurs when subArray == 256, e.g. with the command below:

python inference.py --dataset cifar10 --model VGG8 --mode FP --inference 1 --cellBit 1 --subArray 256 --parallelRead 256

Using a debugger, I found that this happens because the program calls CopySubArray on a 128-row newMemory with numRowSubArray = 256, in ProcessingUnitPerformanceCalculation at line 488:

subArrayMemory = CopySubArray(newMemory, i*param->numRowSubArray, j*param->numColSubArray, numRowMatrix, numColMatrix);
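
For reference, here is a minimal standalone sketch of why that call faults. The CopySubArray body below is a hypothetical stand-in that only mirrors the signature of the quoted call, not the actual NeuroSim implementation:

```cpp
#include <vector>
#include <cstdio>

// Hypothetical stand-in for NeuroSim's CopySubArray, shown only to
// illustrate the out-of-bounds access; the real function may differ.
std::vector<std::vector<double>> CopySubArray(
        const std::vector<std::vector<double>>& original,
        int positionRow, int positionCol, int numRow, int numCol) {
    std::vector<std::vector<double>> copy(numRow, std::vector<double>(numCol));
    for (int i = 0; i < numRow; i++) {
        for (int j = 0; j < numCol; j++) {
            // If positionRow + i >= original.size(), this reads past the
            // end of 'original' -> undefined behavior / segmentation fault.
            copy[i][j] = original[positionRow + i][positionCol + j];
        }
    }
    return copy;
}

int main() {
    // newMemory for the partially filled PE holds only 128 rows ...
    std::vector<std::vector<double>> newMemory(128, std::vector<double>(1024, 0.0));
    // ... but the call still asks for numRowSubArray = 256 rows.
    auto subArrayMemory = CopySubArray(newMemory, 0, 0, 256, 256);  // crashes here
    printf("%zu\n", subArrayMemory.size());
    return 0;
}
```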

Floorplan:

Tile and PE size are optimized to maximize memory utilization ( = memory mapped by synapse / total memory on chip)

Desired Conventional Mapped Tile Storage Size: 2048x2048
Desired Conventional PE Storage Size: 1024x1024
User-defined SubArray Size: 256x256

This happens in layer 2 of VGG8, whose weight matrix is 1152 rows x 1024 columns, so the final row of PEs has only 128 of its 1024 rows filled.

When that final PE is only partially filled (PEMemory.size() < pesize) and the partial fill (PEMemory.size()) is also smaller than the subArray size (256 in this case), this segmentation fault occurs.
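
A quick back-of-the-envelope check of that mapping (numbers taken from the floorplan and layer size quoted above; this is a sketch, not simulator output):

```cpp
#include <cstdio>

int main() {
    int layerRows    = 1152;  // layer-2 weight-matrix rows (from the text above)
    int peRows       = 1024;  // Desired Conventional PE Storage Size
    int subArrayRows = 256;   // User-defined SubArray Size

    int numPERows  = (layerRows + peRows - 1) / peRows;     // ceil(1152/1024) = 2
    int lastPERows = layerRows - (numPERows - 1) * peRows;  // 1152 - 1024 = 128

    printf("PE rows needed: %d, rows filled in last PE: %d\n", numPERows, lastPERows);
    // 128 < 256: the last PE does not even fill one full sub-array row,
    // which is exactly the partially filled case that triggers the fault.
    printf("last PE smaller than one sub-array? %s\n",
           lastPERows < subArrayRows ? "yes" : "no");
    return 0;
}
```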

This might be caused by the following change between v1.3 and v1.4 in TilePerformanceCalculation:

[Screenshot: diff of the v1.3 vs v1.4 call in TilePerformanceCalculation]

Reverting to the previous numRowMatrix and numColMatrix arguments solves the problem.
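
I have not checked the exact v1.3 arguments, but my understanding of the idea behind the reversion is to pass only the rows/columns that the partially filled PE actually holds, so the copy never runs past the end of newMemory. A rough sketch with hypothetical names (clampedMatrixSize, peRows, peCols are mine, not NeuroSim's):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// PEMemory: the (possibly partially filled) memory of one PE
// peRows/peCols: nominal PE storage size (1024 x 1024 here)
std::pair<int, int> clampedMatrixSize(
        const std::vector<std::vector<double>>& PEMemory,
        int peRows, int peCols) {
    int numRowMatrix = std::min((int)PEMemory.size(), peRows);
    int numColMatrix = PEMemory.empty()
                           ? 0
                           : std::min((int)PEMemory[0].size(), peCols);
    return {numRowMatrix, numColMatrix};
}
```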

However, I'm not sure whether PELeakageSRAMInUse (leakage power for a specific PE state?) breaks as a result of this reversion.
I'd appreciate it if someone could tell me whether the energy estimates become wrong after reverting.

Thanks for pointing this out, we also came across this bug recently and are working on a fix. We'll push the changes as soon as it is solved.