Model_Prunning

Several basic methods for model compression

  • Networks are typically over-parameterized (many of their weights or neurons are redundant)
  • Prune them!

step 1: Pretrained Network (large)
step 2: Evaluate the importance

    - Importance of a weight: its magnitude, measured by L1, L2, ...
    - Importance of a neuron:
        The number of times its output was nonzero over a given dataset
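
The two importance measures above can be sketched in NumPy. This is only an illustration under assumptions that are not in the notes: a single dense layer with a ReLU activation, hypothetical helper names `weight_importance` and `neuron_importance`, and random data standing in for "a given dataset".

```python
import numpy as np

def weight_importance(W, norm="l1"):
    """Per-weight importance: absolute value (L1) or squared value (L2)."""
    return np.abs(W) if norm == "l1" else W ** 2

def neuron_importance(W, b, X):
    """For each neuron, count the inputs in X that give a nonzero ReLU output."""
    activations = np.maximum(X @ W + b, 0.0)   # shape: (num_samples, num_neurons)
    return np.count_nonzero(activations, axis=0)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))           # hypothetical layer: 4 inputs -> 3 neurons
b = np.array([0.0, 0.0, -100.0])      # large negative bias: the third neuron never fires
X = rng.normal(size=(50, 4))          # stand-in for "a given dataset"

w_scores = weight_importance(W)       # one score per weight
n_scores = neuron_importance(W, b, X) # one score per neuron
print(n_scores)
```

A neuron whose count is zero (the third one here, by construction) contributes nothing on this data and is a natural candidate for removal.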

step 3: Remove (smaller)

    After pruning, the accuracy will drop (hopefully not too much)
    Don't prune too much at once, or the network won't recover.

step 4: Fine-tune

step 5: Loop back to step 2
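
Steps 1 to 5 can be sketched end to end on a toy problem. The example below is an assumption-laden illustration, not the procedure of any particular paper: the "network" is a 10-weight linear model trained by plain gradient descent, importance is the weight magnitude, and two weights are removed per round.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: only the first 3 of 10 features matter.
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -3.0, 1.5]
y = X @ true_w

def train(w, mask, steps=200, lr=0.05):
    """Gradient descent on mean squared error; pruned weights stay at zero."""
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask
    return w

def mse(w):
    return float(np.mean((X @ w - y) ** 2))

# Step 1: the "pretrained" dense model.
mask = np.ones(10)
w = train(rng.normal(size=10), mask)

for round_ in range(3):
    # Step 2: importance = |weight|, ignoring weights that are already pruned.
    scores = np.where(mask == 1, np.abs(w), np.inf)
    # Step 3: remove only a few weights per round, not all at once.
    mask[np.argsort(scores)[:2]] = 0
    w = w * mask
    # Step 4: fine-tune the surviving weights.
    w = train(w, mask)
    # Step 5: loop back to step 2.
    print(f"round {round_}: kept {int(mask.sum())} weights, mse = {mse(w):.6f}")
```

Because the redundant weights end up close to zero after training, magnitude-based pruning removes them first and the three informative weights survive.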

Why pruning?

Why not simply train a smaller network?

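- It is widely known that a smaller network is more difficult to train successfully
- If the network is large enough, optimization finds a good (near-global) optimum much more easily.
- Lottery Ticket Hypothesis
  - A small network with randomly initialized weights is hard to train
  - The same small network, initialized with the original initial weights of the big network, can be trained
  - A large model is made up of many small models. The large model can be trained easily, but the small models pruned from it are like lottery tickets: some can be trained and some cannot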

- Rethinking the Value of Network Pruning
  - Uses a truly new random initialization, not the original random initialization reused in the "Lottery Ticket Hypothesis"
  - Pruning algorithms can be seen as performing network architecture search
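
The difference between the two initialization schemes can be made concrete with a mask. This sketch only builds the two starting points and does not train anything; `w_trained` is a made-up stand-in for the weights after training, and keeping the largest half of the weights is an arbitrary choice for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

w_init = rng.normal(size=(4, 4))                          # original random initialization
w_trained = w_init + rng.normal(scale=0.5, size=(4, 4))   # stand-in for the trained weights

# Pruning keeps the larger half of the trained weights by magnitude.
threshold = np.median(np.abs(w_trained))
mask = (np.abs(w_trained) > threshold).astype(w_init.dtype)

# Lottery Ticket Hypothesis: the pruned network restarts from the ORIGINAL
# initial values of the weights that survived.
ticket_init = w_init * mask

# "Rethinking the Value of Network Pruning": same pruned structure,
# but the surviving weights get fresh random values.
fresh_init = rng.normal(size=(4, 4)) * mask
```

Both starting points share the same sparsity pattern; only the values of the surviving weights differ.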

How to prune?

  • Weight pruning: zero out individual weights with a mask
  • Neuron pruning: remove whole neurons
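
A minimal NumPy sketch of the two options, assuming a hypothetical two-layer network with weight matrices `W1` and `W2`. The 50% threshold and the use of the incoming-weight L2 norm as the neuron score are choices made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 4))   # layer 1: 5 inputs -> 4 hidden neurons
W2 = rng.normal(size=(4, 2))   # layer 2: 4 hidden neurons -> 2 outputs

# Weight pruning: zero out the smaller-magnitude half of W1 with a mask.
threshold = np.median(np.abs(W1))
mask = np.abs(W1) > threshold
W1_masked = W1 * mask                   # same shape, irregular sparsity

# Neuron pruning: drop the hidden neuron with the smallest incoming L2 norm.
norms = np.linalg.norm(W1, axis=0)      # one norm per hidden neuron (column of W1)
keep = np.sort(np.argsort(norms)[1:])   # indices of the neurons that remain
W1_small = W1[:, keep]                  # remove that neuron's incoming weights
W2_small = W2[keep, :]                  # and its outgoing weights

print(W1_masked.shape, W1_small.shape, W2_small.shape)
```

The masked matrix keeps its original shape, so it saves computation only with sparse-aware kernels, while neuron pruning yields smaller dense matrices that run as ordinary layers.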