segmentio/asm

Performance comparison between assembly and cgo

yingfeng opened this issue · 0 comments

Hi folks,
This library is accelerated using assembly. Have you compared its performance against using cgo for the same acceleration? Although a cgo call has a notable overhead of around 100ns per call, that cost might be negligible for larger batch inputs. On the other hand, assembly has its own drawback: assembly functions cannot be inlined by the Go compiler. It would be interesting to see a performance comparison if one is available. Thank you~
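
To make the batching argument concrete, here is a rough back-of-the-envelope sketch in Go. It only does the arithmetic of spreading a fixed per-call cost across a batch; it does not call cgo or this library, and it is not a benchmark. The 100ns figure is the estimate quoted above, not a measurement, and `amortizedOverheadNs` is a hypothetical helper name made up for illustration.

```go
package main

import "fmt"

// amortizedOverheadNs returns the share of a fixed per-call cost (in
// nanoseconds) that falls on each element when one call processes a
// whole batch. A non-positive batch size is treated as a single element.
func amortizedOverheadNs(perCallNs float64, batchSize int) float64 {
	if batchSize <= 0 {
		return perCallNs
	}
	return perCallNs / float64(batchSize)
}

func main() {
	// Assumed cgo call overhead, taken from the estimate in this issue.
	const cgoCallNs = 100.0

	for _, n := range []int{1, 64, 1024, 65536} {
		fmt.Printf("batch=%6d  overhead per element ~ %.4f ns\n",
			n, amortizedOverheadNs(cgoCallNs, n))
	}
}
```

The same reasoning applies to the non-inlinable assembly call, just with a much smaller fixed cost, so a real comparison would need actual benchmarks across several batch sizes.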