Implementing Quantization in ADown Downsampling Class
levipereira opened this issue · 2 comments
We're currently working on improving our YOLOv9 model by implementing quantization in the `ADown` downsampling class. We've observed that the current implementation generates reformat operations and increases the latency of the model.
To address this issue, we plan to implement quantization in two steps:
1. **Creating a quantized version of ADown:** We will create a new class called `QuantADown`. This class will contain a method named `adown_quant_forward(self, x)` to handle the quantized forward pass.
2. **Integration into the model:** We will integrate the `QuantADown` class into our model by modifying the `replace_custom_module_forward` function in our quantization script. This function is responsible for replacing custom modules with their quantized counterparts during the quantization process.
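The two steps above could be sketched roughly as follows. This is a minimal illustration in plain PyTorch, not the actual implementation: `FakeQuant` is a stand-in for a real quantizer such as pytorch-quantization's `quant_nn.TensorQuantizer`, the `ADown` here is simplified (its `Conv` blocks reduced to plain `nn.Conv2d`), and the `replace_custom_module_forward` body is a guess at how the swap might work:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FakeQuant(nn.Module):
    """Stand-in for a real quantizer (e.g. quant_nn.TensorQuantizer):
    simulates int8 quantize/dequantize with a fixed scale."""
    def __init__(self, scale=0.1):
        super().__init__()
        self.scale = scale

    def forward(self, x):
        return torch.fake_quantize_per_tensor_affine(x, self.scale, 0, -128, 127)

class ADown(nn.Module):
    """Simplified ADown downsampling block (YOLOv9), Conv blocks
    reduced to bare convolutions for illustration."""
    def __init__(self, c1, c2):
        super().__init__()
        self.c = c2 // 2
        self.cv1 = nn.Conv2d(c1 // 2, self.c, 3, 2, 1)
        self.cv2 = nn.Conv2d(c1 // 2, self.c, 1, 1, 0)

class QuantADown(nn.Module):
    """Step 1 (hypothetical): reuse the convs of an existing ADown and
    insert quantizers in front of the pooling ops, so the quantized
    graph does not need extra reformat layers around them."""
    def __init__(self, adown):
        super().__init__()
        self.c = adown.c
        self.cv1, self.cv2 = adown.cv1, adown.cv2
        self.quant_pool = FakeQuant()

    def adown_quant_forward(self, x):
        x = F.avg_pool2d(self.quant_pool(x), 2, 1, 0, False, True)
        x1, x2 = x.chunk(2, 1)
        x1 = self.cv1(x1)
        x2 = F.max_pool2d(self.quant_pool(x2), 3, 2, 1)
        x2 = self.cv2(x2)
        return torch.cat((x1, x2), 1)

    forward = adown_quant_forward

def replace_custom_module_forward(model):
    """Step 2 (hypothetical): recursively swap every ADown for its
    quantized counterpart before calibration/export."""
    for name, module in model.named_children():
        if isinstance(module, ADown):
            setattr(model, name, QuantADown(module))
        else:
            replace_custom_module_forward(module)
    return model
```

For example, running `replace_custom_module_forward` over a model containing an `ADown(16, 16)` replaces it with a `QuantADown` whose forward pass produces the same output shape as the original block.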
We believe that implementing quantization in the `ADown` class will help optimize the model's performance and reduce latency. We welcome any feedback or suggestions from the community regarding this approach.
Thank you for your support and collaboration!
Already implemented. Testing!
Validated!