ashvardanian/less_slow.cpp

Data Alignment may have an error?

bfdyanshe opened this issue · 1 comment

The loop in f32_pairwise_accumulation runs f32s_in_cache_line_half_k * 2 times, while the other one runs only f32s_in_cache_line_half_k times, so the two benchmarks do not perform the same amount of work.
(image)
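To make the mismatch concrete, here is a minimal sketch of the situation described above. It is not the repository's code: the constant values, the `pairwise_add` helper, and the two `additions_in_*` functions are hypothetical and only borrow the `f32s_in_cache_line_half_k` name from the issue. It assumes a 64-byte cache line and counts how many additions each loop performs.

```cpp
#include <array>
#include <cstddef>

// Assumed values for illustration: a 64-byte cache line split into floats.
constexpr std::size_t cache_line_bytes_k = 64;
constexpr std::size_t f32s_in_cache_line_k = cache_line_bytes_k / sizeof(float);
constexpr std::size_t f32s_in_cache_line_half_k = f32s_in_cache_line_k / 2;

// Hypothetical helper: adds `count_k` pairs of floats and
// returns the number of additions it actually performed.
template <std::size_t count_k>
std::size_t pairwise_add(float const *a, float const *b, float *c) {
    std::size_t additions = 0;
    for (std::size_t i = 0; i != count_k; ++i, ++additions) c[i] = a[i] + b[i];
    return additions;
}

// Stand-in for the first benchmark: the loop bound is half_k * 2.
std::size_t additions_in_first_variant() {
    std::array<float, f32s_in_cache_line_half_k * 2> a {}, b {}, c {};
    return pairwise_add<f32s_in_cache_line_half_k * 2>(a.data(), b.data(), c.data());
}

// Stand-in for the other benchmark: the loop bound is only half_k.
std::size_t additions_in_second_variant() {
    std::array<float, f32s_in_cache_line_half_k> a {}, b {}, c {};
    return pairwise_add<f32s_in_cache_line_half_k>(a.data(), b.data(), c.data());
}
```

With these assumed bounds the first variant does twice as many additions per benchmark iteration, so comparing raw timings would likely overstate the cost of whichever variant uses the larger bound. Using the same loop bound in both, or reporting time per element, would make the comparison fair.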

@bfdyanshe, yes, you are right! Sorry, I didn't see it in time. @alexbarev has just pointed me to that same problem and we will submit a patch soon.