I recently read this article: https://probablydance.com/2023/04/27/beautiful-branchless-binary-search/, and I was inspired. First I implemented the same algorithm in pure Python:
And then I compared it against sortedcontainers
's implementation of bisect_left across a large range of array sizes:
Pretty handily beats it! However, most people using Python would probably be using bisect_left
from the bisect
library in the stdlib.
To try to beat that implementation, though, I will have to descend into C-land. Here's my implementation as a Python C-extension:
That beats it as well! Admittedly this is only for arrays of size up to 2**29
, but still pretty cool.
I also checked whether it successfully compiled with a CMOVE
instruction:
It does! You can see the full compiled dump in objdump_output.txt
.
Now, here are all of them combined (what you will get if you run main.py
):
This repo contains the submodule and benchmarking code I used to obtain these results.