satabios/sconce

Pruning seems to be an invasive technique? How does the package handle the performance degradation?



Yes, pruning is an invasive process. However, there is a sweet spot in the tradeoff between model degradation and removing redundant data, and the goal is to find it.

We can search this space: the package employs a layer-wise sensitivity scan that goes through every layer of the model and finds the sweet spot for each one. Such a scan is usually quite expensive, but the package has one of the fastest ways to find the best pruning ratio.
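As a rough illustration of what a layer-wise sensitivity scan does, here is a minimal sketch in plain Python. It is not sconce's implementation or API. The toy model, `magnitude_prune`, `sensitivity_scan`, and `toy_evaluate` are all hypothetical names, and the "accuracy" is a stand-in score, assumed only so the example runs on its own.

```python
from typing import Callable, Dict, List

# Hypothetical toy model: layer name -> flat list of weights.
Model = Dict[str, List[float]]


def magnitude_prune(weights: List[float], ratio: float) -> List[float]:
    """Zero the `ratio` fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * ratio)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def sensitivity_scan(
    model: Model,
    evaluate: Callable[[Model], float],
    ratios: List[float],
    max_drop: float,
) -> Dict[str, float]:
    """Per layer, pick the largest ratio whose score drop stays within max_drop."""
    baseline = evaluate(model)
    chosen: Dict[str, float] = {}
    for name, weights in model.items():
        best = 0.0
        for ratio in sorted(ratios):
            trial = dict(model)  # shallow copy: every other layer stays dense
            trial[name] = magnitude_prune(weights, ratio)
            if baseline - evaluate(trial) > max_drop:
                break  # assumes the drop only grows with the ratio
            best = ratio
        chosen[name] = best
    return chosen


# Toy data: "conv1" has no small weights, "fc" is mostly near-zero.
toy_model = {"conv1": [0.9, -0.8, 0.7, -0.6], "fc": [0.01, -0.02, 0.9, 0.01]}
total = sum(abs(w) for ws in toy_model.values() for w in ws)


def toy_evaluate(model: Model) -> float:
    """Stand-in for validation accuracy: share of weight magnitude kept."""
    return sum(abs(w) for ws in model.values() for w in ws) / total


chosen = sensitivity_scan(toy_model, toy_evaluate, [0.25, 0.5, 0.75], max_drop=0.05)
print(chosen)  # {'conv1': 0.0, 'fc': 0.75}
```

The sensitive layer is left dense while the redundant one is pruned heavily, which is the kind of per-layer decision the scan is meant to produce. In a real setting `evaluate` would run the model on a validation set, which is why the scan is expensive.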

The tutorial explains this in detail (look for the header “Sensitivity Scan”): https://sconce.readthedocs.io/en/latest/tutorials/Pruning.html#lets-first-evaluate-the-accuracy-and-model-size-of-dense-model

This way, pruning is an informed decision: we prune only what the scan shows to be redundant. Fine-tuning is also applied after pruning so that we can regain the degraded accuracy.
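To make the prune-then-fine-tune idea concrete, here is a small self-contained sketch on a toy linear model. It is only an illustration under assumed data, not how the package does it. The helpers `keep_mask` and `fine_tune` are hypothetical, and plain gradient descent on a mean squared error stands in for real training.

```python
from typing import List, Sequence


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def mse(w: Sequence[float], xs, ys) -> float:
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def keep_mask(w: Sequence[float], n_prune: int) -> List[bool]:
    """True for weights to keep; the n_prune smallest magnitudes are dropped."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]))
    drop = set(order[:n_prune])
    return [i not in drop for i in range(len(w))]


def fine_tune(w, mask, xs, ys, lr: float = 0.1, steps: int = 200) -> List[float]:
    """Gradient descent on the kept weights; pruned weights stay at zero."""
    w = [wi if m else 0.0 for wi, m in zip(w, mask)]
    n = len(xs)
    for _ in range(steps):
        errs = [predict(w, x) - y for x, y in zip(xs, ys)]
        grads = [2.0 / n * sum(e * x[j] for e, x in zip(errs, xs))
                 for j in range(len(w))]
        w = [wi - lr * g if m else 0.0 for wi, g, m in zip(w, grads, mask)]
    return w


# Assumed toy data: targets come from the "dense" weights, so dense loss is 0.
xs = [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 0.0), (1.0, -1.0, 1.0)]
dense = [2.0, -3.0, 0.2]
ys = [predict(dense, x) for x in xs]

mask = keep_mask(dense, n_prune=1)  # drops the smallest weight (0.2)
pruned = [wi if m else 0.0 for wi, m in zip(dense, mask)]
tuned = fine_tune(dense, mask, xs, ys)

loss_pruned = mse(pruned, xs, ys)
loss_tuned = mse(tuned, xs, ys)
print(loss_pruned, loss_tuned)
```

In this toy case the loss after fine-tuning is lower than right after pruning, but it does not return to the dense loss: fine-tuning recovers part of the degradation, not necessarily all of it.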

I hope this answers your question. Feel free to reopen the issue if you are not satisfied with the answer.

To add to the above point, the final results table gives a quantitative glimpse of the technique.

Also note that, unlike quantization, pruning actually reduces the number of MAC operations.
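A quick back-of-the-envelope check of the MAC point, assuming channel (structured) pruning of a single convolution layer. The layer shape below is made up for illustration, and `conv2d_macs` is a hypothetical helper using the standard MAC count for a 2D convolution.

```python
def conv2d_macs(c_in: int, c_out: int, kernel: int, h_out: int, w_out: int) -> int:
    """Multiply-accumulate count of a standard 2D convolution (bias ignored)."""
    return c_in * c_out * kernel * kernel * h_out * w_out


# Hypothetical layer: 64 -> 128 channels, 3x3 kernel, 32x32 output map.
dense_macs = conv2d_macs(64, 128, 3, 32, 32)

# Channel pruning removes half of the output channels, so the layer shrinks.
pruned_macs = conv2d_macs(64, 64, 3, 32, 32)

# Quantization keeps the layer shape, so the MAC count is unchanged;
# only the precision of each operand is reduced.
quantized_macs = conv2d_macs(64, 128, 3, 32, 32)

print(dense_macs, pruned_macs, quantized_macs)  # 75497472 37748736 75497472
```

Note that this holds for pruning that removes whole channels. With fine-grained (unstructured) pruning the layer shape stays the same, and MACs are only saved if the runtime skips the zeroed weights.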

image