[onert-micro] plan for quantized kernel
Let's make a plan (by Sep) for the quantized (s8/s16) kernels.
AFAIU, the master branch supports:
- S8 Add
- S8 AveragePool2D
- S8 Mul
- S8 Conv2D
- S8 MaxPool2D
- S8 and S16 FullyConnected
All of these operations are accelerated by the CMSIS_NN library (https://github.com/ARM-software/CMSIS-NN/tree/v4.1.0?tab=readme-ov-file).
IMHO, let's first support all the operations that CMSIS_NN supports. That is, based on CMSIS_NN 4.1 (https://github.com/ARM-software/CMSIS-NN/tree/v4.1.0?tab=readme-ov-file), the final goal by Sep is to accelerate 10 S8 kernels.
After that, enable S8 kernels for several operations (TBD, ~10 operations) that CMSIS_NN does not support.
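For context on what an S8 kernel has to compute, here is a minimal float-reference sketch of a quantized Add, assuming the usual affine scheme `real = scale * (q - zero_point)`. The function names and parameter tuples are hypothetical and are not onert-micro or CMSIS_NN APIs. Real kernels use fixed-point multipliers and may round differently, so treat this only as an illustration of the expected semantics.

```python
def dequantize(q, scale, zero_point):
    # Affine dequantization: map an int8 value back to a real number.
    return scale * (q - zero_point)


def quantize(x, scale, zero_point):
    # Affine quantization with saturation to the int8 range [-128, 127].
    # Note: Python's round() rounds halves to even; kernels may differ.
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def add_s8_reference(a_q, b_q, a_params, b_params, out_params):
    # Element-wise Add on int8 tensors, computed through float as a reference.
    # Each *_params is a hypothetical (scale, zero_point) tuple.
    out = []
    for a, b in zip(a_q, b_q):
        real = dequantize(a, *a_params) + dequantize(b, *b_params)
        out.append(quantize(real, *out_params))
    return out


result = add_s8_reference(
    a_q=[2, 4, 127],
    b_q=[4, -8, 127],
    a_params=(0.5, 0),
    b_params=(0.25, 0),
    out_params=(0.5, 0),
)
print(result)  # the last element saturates at 127
```

A reference like this can serve as the expected-value generator when comparing an accelerated kernel against a plain implementation.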
@BalyshevArtem Please share any opinion about this
Yes, sure. We are currently working on this task. Thank you for detailing it :)
Then, our final goal by Sep is:
- 20 operations will support int8 datatype
- 10 operations will be accelerated by CMSIS_NN
gtest log on x86 for the quantized kernels:
quantized_test_xml_log.zip
Note: Google Test filter = *S8*:*S16*
[==========] Running 5 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from AveragePool2DTest
[ RUN ] AveragePool2DTest.S8_P
[ OK ] AveragePool2DTest.S8_P (0 ms)
[----------] 1 test from AveragePool2DTest (0 ms total)
[----------] 2 tests from FullyConnectedTest
[ RUN ] FullyConnectedTest.S8_P
[ OK ] FullyConnectedTest.S8_P (0 ms)
[ RUN ] FullyConnectedTest.S16_P
[ OK ] FullyConnectedTest.S16_P (0 ms)
[----------] 2 tests from FullyConnectedTest (0 ms total)
[----------] 1 test from Conv2DTest
[ RUN ] Conv2DTest.S8_P
[ OK ] Conv2DTest.S8_P (0 ms)
[----------] 1 test from Conv2DTest (0 ms total)
[----------] 1 test from MaxPool2DTest
[ RUN ] MaxPool2DTest.S8_P
[ OK ] MaxPool2DTest.S8_P (0 ms)
[----------] 1 test from MaxPool2DTest (0 ms total)
[----------] Global test environment tear-down
[==========] 5 tests from 4 test suites ran. (0 ms total)
[ PASSED ] 5 tests.
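The filter `*S8*:*S16*` in the log above selects tests by their full `Suite.Test` name. A rough sketch of how such a filter picks tests, assuming only positive patterns (`:`-separated, `*` for any string, `?` for one character) and ignoring the negative patterns that gtest allows after `-`. The helper name and the `AddTest.Float_P` entry are hypothetical, added only to show a non-matching case.

```python
import re


def gtest_filter_matches(full_name, gtest_filter):
    # Approximates positive --gtest_filter patterns only.
    for pattern in gtest_filter.split(":"):
        # Translate the glob into a regex: '*' -> '.*', '?' -> '.'.
        regex = "".join(
            ".*" if c == "*" else "." if c == "?" else re.escape(c)
            for c in pattern
        )
        if re.fullmatch(regex, full_name):
            return True
    return False


FILTER = "*S8*:*S16*"
names = [
    "AveragePool2DTest.S8_P",
    "FullyConnectedTest.S8_P",
    "FullyConnectedTest.S16_P",
    "Conv2DTest.S8_P",
    "MaxPool2DTest.S8_P",
    "AddTest.Float_P",  # hypothetical float test, should be filtered out
]
selected = [n for n in names if gtest_filter_matches(n, FILTER)]
print(len(selected), "tests selected")
```

This matches the log: five tests across four suites are selected, all of them S8 or S16 variants.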
Log from our target board for testing the quantized kernels:
START TESTING
-----------------
[ START TEST: Conv2DTest.INT8 ]
[ TEST TIME = (20.000000) us ]
[ TEST Conv2DTest.INT8 RESULT: OK ]
-----------------
[ START TEST: FullyConnectedTest.S8 ]
[ TEST TIME = (10.000000) us ]
[ TEST FullyConnectedTest.S8 RESULT: OK ]
-----------------
[ START TEST: FullyConnectedTest.S16 ]
[ TEST TIME = (20.000000) us ]
[ TEST FullyConnectedTest.S16 RESULT: OK ]
-----------------
[ START TEST: AveragePool2DTest.S8 ]
[ TEST TIME = (10.000000) us ]
[ TEST AveragePool2DTest.S8 RESULT: OK ]
-----------------
[ START TEST: MaxPool2DTest.S8 ]
[ TEST TIME = (10.000000) us ]
[ TEST MaxPool2DTest.S8 RESULT: OK ]
-----------------
END TESTING