bash autotune_cpu_random_int8.cpu 16 16 16 appears to fail the output comparison
yi1zhao opened this issue · 1 comment
yi1zhao commented
On branch amx_sparse
case1:
sparsednn$ bash autotune_cpu_random_int8.cpu 128 128 128
density 0.101318359375
False
Generating X86 vector intrinsics
(128, 128)
Reduced A dimension 128
128 128
== Load shared library ==
== at 20.56021 milliseconds ==
445553
== spmm microkernel ==
== at 0.00243 milliseconds ==
== 445553 reps ==
Difference: 0
(array([], dtype=int64), array([], dtype=int64))
Best Runtime 100000
case2:
sparsednn$ bash autotune_cpu_random_int8.cpu 16 16 16
density 0.15625
False
Generating X86 vector intrinsics
(16, 16)
Reduced A dimension 12
16 16
== Load shared library ==
== at 12.99022 milliseconds ==
4538234
== spmm microkernel ==
== at 0.00009 milliseconds ==
== 4538234 reps ==
Difference: 160
(array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 14,
14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14]), array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]))
Best Runtime 100000
marsupialtail commented
The third dimension must be a multiple of 32 for now, I think.
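A minimal sketch of a guard for that constraint, assuming the multiple-of-32 rule stated above holds. The names `REQUIRED_MULTIPLE`, `is_supported`, and `round_up` are hypothetical helpers for illustration and are not part of sparsednn. It checks the third argument before running the autotune script and suggests the next size that satisfies the rule.

```python
# Assumption: the third dimension must be a multiple of 32, per the comment above.
# This value is not read from the sparsednn sources.
REQUIRED_MULTIPLE = 32


def is_supported(dim: int, multiple: int = REQUIRED_MULTIPLE) -> bool:
    """Return True if dim is a positive multiple of `multiple`."""
    return dim > 0 and dim % multiple == 0


def round_up(dim: int, multiple: int = REQUIRED_MULTIPLE) -> int:
    """Return the smallest multiple of `multiple` that is >= dim."""
    return -(-dim // multiple) * multiple


# The two third-dimension values used in case1 and case2 of this issue.
for third_dim in (128, 16):
    print(third_dim, is_supported(third_dim), round_up(third_dim))
```

Under this assumption, 128 passes the check and 16 does not, which is consistent with case1 reporting `Difference: 0` and case2 reporting a nonzero difference.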