It seems that the quantization inputs are missing
yi1zhao opened this issue · 1 comments
yi1zhao commented
Hello, thank you for your contribution to sparse.
In autotune_cpu_random.sh, you use driver_cpu.cpp to call the SpMM API.
The issue is that the input parameters seem to lack the scale and zero_point data pointers.
Details:
The input parameters are:
struct thread_data {
    const int8_t * __restrict__ AB_val;
    const int * __restrict__ AB_bias;
    const int8_t * __restrict__ BC;
    int8_t * AC;
    int start;
    int end;
};
If the external interface of SpMM is:
s8xs8->s8,
then its internal logic is:
s8xs8->s32->fp32->s8.
The quantization formula that converts the s32 accumulator (dst_int32) to the s8 destination (dst_int8) is:
dst_int8 = (scale_src * scale_weights / scale_dst) * dst_int32.
See for details: https://oneapi-src.github.io/oneDNN/dev_guide_attributes_quantization.html
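To make the formula concrete, here is a minimal C++ sketch of the s32->fp32->s8 step. It is not code from this repository: the function name requantize_s32_to_s8 and its parameters are hypothetical, and it assumes symmetric quantization (all zero points equal to 0), round-to-nearest, and saturation to the int8 range.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper for illustration only; not part of the repository's API.
// Assumes symmetric quantization, i.e. every zero_point is 0.
inline int8_t requantize_s32_to_s8(int32_t acc,
                                   float scale_src,
                                   float scale_weights,
                                   float scale_dst) {
    // Collapse the three scales into a single multiplier.
    const float combined = scale_src * scale_weights / scale_dst;

    // s32 -> fp32: apply the combined scale to the integer accumulator.
    const float scaled = static_cast<float>(acc) * combined;

    // fp32 -> s8: round to the nearest integer, then saturate to [-128, 127].
    const float rounded = std::nearbyint(scaled);
    const float clamped = std::min(127.0f, std::max(-128.0f, rounded));
    return static_cast<int8_t>(clamped);
}
```

Under these assumptions the three scales reduce to one multiplier, so a kernel could in principle accept a single precomputed factor instead of separate scale pointers.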
yi1zhao commented
I have found shift_scale.