
🧠Singular values-driven automated filter pruning

License: Apache-2.0

The ever-accelerating progress of technology… gives the appearance of approaching some essential singularity. — John von Neumann, 1958

The Singularity Is Nearer: When We Merge with AI. Ray Kurzweil, 2024



Van Tien PHAM¹ ✉, Yassine ZNIYED¹, Thanh Phuong NGUYEN²
¹ Université de Toulon, Université d'Aix-Marseille, CNRS, LIS, UMR 7020, France
² Université Côte d'Azur, CNRS, I3S, UMR 7271, France
✉ Corresponding author

🌀 Abstract

This paper introduces a novel automated filter pruning approach through singular values-driven optimization. Based on the observation and analysis of the distribution of singular values of the overparameterized model, we establish a robust connection between weight redundancy and these values, rendering them potent indicators for automated pruning. The automated structured pruning is formulated as a constrained combinatorial optimization problem spanning all layers, aiming to maximize the nuclear norm of the compact model. This problem is decomposed into two sub-problems: determining the pruning configuration and assessing the filter importance within a layer based on the identified pruning ratio. We introduce two straightforward algorithms to address these sub-problems, effectively handling the global relationship between layers and the inter-filter correlation within each layer. Thorough experiments across 8 architectures, 4 benchmark datasets, and 4 vision tasks underscore the efficacy of our framework.
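To make the role of singular values concrete, here is a minimal sketch. It is not the paper's algorithm: it flattens a convolutional weight tensor into a matrix, measures its nuclear norm (the sum of its singular values), and greedily removes the filter whose removal costs the least nuclear norm. The function names, the greedy rule, and the toy tensor are all assumptions made for illustration.

```python
import numpy as np

def nuclear_norm(weight):
    """Sum of singular values of a conv weight flattened to (filters, -1)."""
    mat = weight.reshape(weight.shape[0], -1)
    return float(np.linalg.svd(mat, compute_uv=False).sum())

def greedy_prune(weight, keep):
    """Hypothetical heuristic (an assumption, not the authors' method):
    repeatedly drop the filter whose removal lowers the nuclear norm least,
    recomputing the scores after every removal."""
    kept = list(range(weight.shape[0]))
    while len(kept) > keep:
        current = nuclear_norm(weight[kept])
        drops = [
            current - nuclear_norm(weight[[k for k in kept if k != i]])
            for i in kept
        ]
        kept.pop(int(np.argmin(drops)))
    return kept

# Toy layer: 8 filters of shape (4, 3, 3); filter 5 is a copy of filter 2.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4, 3, 3))
w[5] = w[2]

kept = greedy_prune(w, keep=7)
print("kept filters:", kept)
print("nuclear norm before/after:", nuclear_norm(w), nuclear_norm(w[kept]))
```

In this toy tensor the duplicated pair carries the least additional nuclear norm, so one of the two copies is removed first. Recomputing the scores after each removal is what keeps the sketch from discarding both copies at once, which is one simple way to account for correlation between filters.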

🌟 News

🎨 Supplementary materials

1. Throughput acceleration

  • FasterRCNN for object detection (GIFs: Baseline, Pruned)
  • MaskRCNN for instance segmentation (GIFs: Baseline, Pruned)
  • KeypointRCNN for human keypoint detection (GIFs: Baseline, Pruned)
Baseline (left) vs Compressed (right) model inference.

To illustrate the practical benefit of SLIMING, we compare a baseline model with its compressed counterpart on object detection, using the FasterRCNN_ResNet50_FPN architecture on an RTX 3060 GPU. As the accompanying GIFs show, the baseline model runs at approximately 12 FPS, while the SLIMING-compressed model achieves roughly twice that throughput.

Note: For replication of this experiment, please refer to detection/README.md.
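As a rough illustration of how such a throughput comparison can be timed, the sketch below measures frames per second for any inference callable and derives the speedup. It is not the script referenced in detection/README.md; `measure_fps`, `speedup`, and the dummy model are hypothetical names used only here. When timing a real GPU model in PyTorch, `torch.cuda.synchronize()` should be called before reading the clock, because CUDA kernels run asynchronously.

```python
import time

def measure_fps(infer, frames, warmup=2):
    """Return frames per second of `infer` over `frames`.

    A few warm-up calls are made first so that one-off setup cost
    does not distort the measurement.
    """
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

def speedup(baseline_fps, pruned_fps):
    """Throughput ratio of the pruned model over the baseline."""
    return pruned_fps / baseline_fps

# Dummy stand-in for a detector (an assumption for illustration only).
def dummy_model(frame):
    time.sleep(0.001)
    return frame

fps = measure_fps(dummy_model, frames=list(range(5)))
print(f"dummy model: {fps:.1f} FPS")
print("speedup for 12 FPS -> 24 FPS:", speedup(12.0, 24.0))
```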

2. Visualizing feature preservation

[Figure columns: Input, CR=0%, CR=50%, CR=64%, CR=78%]
Qualitative assessment of feature preservation in compressed models.

We complement the numerical results with a qualitative evaluation of feature preservation in SLIMING. We randomly select 5 images from the ImageNet validation dataset and compress the original ResNet-50 model at three levels (CR): 50%, 64%, and 78%. Using GradCAM for interpretation, we visually compare the feature maps of the original and compressed models.

The visualizations show that SLIMING retains the crucial features across a diverse range of classes, and that it does so consistently at different CRs. This suggests that the compressed models remain reliable across compression levels, which makes SLIMING a versatile choice for network compression across applications and datasets.
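For readers unfamiliar with GradCAM, the heatmap computation itself can be sketched as below. This is a generic NumPy illustration of the Grad-CAM formula, assuming the activations of one convolutional layer and the gradients of the class score with respect to them have already been extracted; it is not the visualization code used for the figure, and the toy arrays are made up.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one layer's activations and gradients.

    Both inputs have shape (channels, height, width). Each channel is
    weighted by the spatial mean of its gradient, the weighted channels
    are summed, negative values are clipped, and the map is scaled to
    [0, 1] when it is not all zero.
    """
    weights = gradients.mean(axis=(1, 2))             # (channels,)
    cam = np.tensordot(weights, activations, axes=1)  # (height, width)
    cam = np.maximum(cam, 0.0)                        # ReLU
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy example: channel 0 supports the class, channel 1 opposes it.
acts = np.zeros((2, 2, 2))
acts[0, 0, 0] = 1.0
acts[1, 1, 1] = 1.0
grads = np.stack([np.ones((2, 2)), -np.ones((2, 2))])

cam = grad_cam(acts, grads)
print(cam)
```

Comparing such maps for the original and compressed models on the same image is one way to check whether pruning has shifted the regions the network relies on.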

🕙 ToDo

  • Write detailed documentation.
  • Upload compressed models.
  • Clean code.

👪 Team

🔖 Citation

If the code and paper help your research, please cite:

@misc{pham2024singular,
    title={Singular values-driven automated filter pruning}, 
    author={Pham, Van Tien and Zniyed, Yassine and Nguyen, Thanh Phuong},
    howpublished={\url{https://sliming-ai.github.io/}},
    year={2024}
    }

👍 Acknowledgements

This work was granted access to the HPC resources of IDRIS under the allocation 2023-103147 made by GENCI.
The work of T.P. Nguyen is partially supported by ANR ASTRID ROV-Chasseur.