Implementing Quantization in ADown Downsampling Class
levipereira opened this issue · 2 comments
We're currently working on improving our YOLOv9 model by implementing quantization in the `ADown` downsampling class. We've observed that the current implementation generates reformat operations and increases the latency of the model.
To address this issue, we plan to implement quantization in two steps:
1. **Creating a quantized version of ADown:** We will create a new class called `QuantADown`. This class will contain a method named `adown_quant_forward(self, x)` to handle the quantized forward pass.
2. **Integration into the model:** We will integrate the `QuantADown` class into our model by modifying the `replace_custom_module_forward` function in our quantization script. This function is responsible for replacing custom modules with their quantized counterparts during the quantization process.
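The two steps above could be sketched roughly as follows. This is a minimal illustration in plain PyTorch, not the actual implementation: `FakeQuant` is a stand-in for a real quantizer such as pytorch-quantization's `quant_nn.TensorQuantizer`, the `ADown` here is simplified (its `Conv` blocks reduced to plain `nn.Conv2d`), and the `replace_custom_module_forward` body is a guess at how the swap might work:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FakeQuant(nn.Module):
    """Stand-in for a real quantizer (e.g. quant_nn.TensorQuantizer):
    simulates int8 quantize/dequantize with a fixed scale."""
    def __init__(self, scale=0.1):
        super().__init__()
        self.scale = scale

    def forward(self, x):
        return torch.fake_quantize_per_tensor_affine(x, self.scale, 0, -128, 127)

class ADown(nn.Module):
    """Simplified ADown downsampling block (YOLOv9), Conv blocks
    reduced to bare convolutions for illustration."""
    def __init__(self, c1, c2):
        super().__init__()
        self.c = c2 // 2
        self.cv1 = nn.Conv2d(c1 // 2, self.c, 3, 2, 1)
        self.cv2 = nn.Conv2d(c1 // 2, self.c, 1, 1, 0)

class QuantADown(nn.Module):
    """Step 1 (hypothetical): reuse the convs of an existing ADown and
    insert quantizers in front of the pooling ops, so the quantized
    graph does not need extra reformat layers around them."""
    def __init__(self, adown):
        super().__init__()
        self.c = adown.c
        self.cv1, self.cv2 = adown.cv1, adown.cv2
        self.quant_pool = FakeQuant()

    def adown_quant_forward(self, x):
        x = F.avg_pool2d(self.quant_pool(x), 2, 1, 0, False, True)
        x1, x2 = x.chunk(2, 1)
        x1 = self.cv1(x1)
        x2 = F.max_pool2d(self.quant_pool(x2), 3, 2, 1)
        x2 = self.cv2(x2)
        return torch.cat((x1, x2), 1)

    forward = adown_quant_forward

def replace_custom_module_forward(model):
    """Step 2 (hypothetical): recursively swap every ADown for its
    quantized counterpart before calibration/export."""
    for name, module in model.named_children():
        if isinstance(module, ADown):
            setattr(model, name, QuantADown(module))
        else:
            replace_custom_module_forward(module)
    return model
```

For example, running `replace_custom_module_forward` over a model containing an `ADown(16, 16)` replaces it with a `QuantADown` whose forward pass produces the same output shape as the original block.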
We believe that implementing quantization in the `ADown` class will help optimize the model's performance and reduce latency. We welcome any feedback or suggestions from the community regarding this approach.
Thank you for your support and collaboration!
Already implemented. Testing!
Validated!