rounding-bench-rs

A benchmark for unorm conversions


Unorm rounding benchmark

This repo contains a benchmark of several implementations of unorm conversions in Rust.
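For context, a unorm (unsigned normalized) value is an n-bit integer read as a fraction of its maximum, so converting a 5-bit unorm to an 8-bit one means mapping x in 0..=31 to round(x * 255 / 31). The sketch below shows two straightforward ways to write that. The function names are hypothetical and the code is not taken from this repo; it only illustrates the kind of conversion being benchmarked.

```rust
/// Reference conversion through floating point.
/// Hypothetical name; not necessarily how the repo implements it.
pub fn u5_to_u8_float_sketch(x: u8) -> u8 {
    debug_assert!(x < 32, "input must be a 5-bit value");
    // Scale from the 0..=31 range to the 0..=255 range, then round to nearest.
    (x as f32 * 255.0 / 31.0).round() as u8
}

/// The same rounding using integer arithmetic only.
/// Adding 15 (half of 31, rounded down) before dividing rounds to nearest;
/// since 31 is odd, no input lands exactly on a tie.
pub fn u5_to_u8_int_sketch(x: u8) -> u8 {
    debug_assert!(x < 32, "input must be a 5-bit value");
    // Widen to u16 so that 31 * 255 + 15 = 7920 cannot overflow.
    ((x as u16 * 255 + 15) / 31) as u8
}
```

Because there are only 32 possible inputs, the two functions can be compared exhaustively, which is a cheap way to check any faster variant against a reference.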

Results

Here's a simple table of the output of cargo bench. Each row shows the lower bound, the estimate, and the upper bound of the measured time:

u5_to_u8_naive          time:   [196.99 µs 197.75 µs 198.71 µs]
u5_to_u8_v2             time:   [17.730 µs 17.771 µs 17.817 µs]
u5_to_u8_unsafe         time:   [7.2110 µs 7.2304 µs 7.2551 µs]
u5_to_u8_safer          time:   [7.8876 µs 7.9219 µs 7.9606 µs]
u5_to_u8_safer_int      time:   [7.2187 µs 7.2372 µs 7.2576 µs]
u5_to_u8_lut            time:   [6.7951 µs 6.8097 µs 6.8261 µs]
u5_to_u8_int            time:   [5.0247 µs 5.0486 µs 5.0715 µs]
u5_to_u8_ma             time:   [4.5759 µs 4.5909 µs 4.6094 µs]
u5_to_u8_ma8            time:   [4.2081 µs 4.2210 µs 4.2356 µs]

The relevant specs(?) of my machine:

  • OS: Windows 10
  • CPU: Intel® Core™ i7-8700K CPU @ 3.70GHz
  • Rust: 1.80.1

Full output

PS C:\Users\micha\Git\rounding-rs> cargo bench
   Compiling rounding-rs v0.1.0 (C:\Users\micha\Git\rounding-rs)
    Finished `bench` profile [optimized] target(s) in 6.06s
     Running unittests src\lib.rs (target\release\deps\rounding_rs-d1fe9a8a0a27ed37.exe)

running 1 test
test tests::test_correctness ... ignored

test result: ok. 0 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches\bench.rs (target\release\deps\bench-ed5b5a701c237ad3.exe)
Gnuplot not found, using plotters backend
u5_to_u8_naive          time:   [196.99 µs 197.75 µs 198.71 µs]
                        change: [-5.0101% -3.7511% -2.5985%] (p = 0.00 < 0.05)
                        Performance has improved.

u5_to_u8_v2             time:   [17.730 µs 17.771 µs 17.817 µs]
                        change: [+0.5979% +1.2739% +1.9495%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) high mild
  4 (4.00%) high severe

u5_to_u8_unsafe         time:   [7.2110 µs 7.2304 µs 7.2551 µs]
                        change: [-1.1789% -0.2864% +0.5970%] (p = 0.53 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe

u5_to_u8_safer          time:   [7.8876 µs 7.9219 µs 7.9606 µs]
                        change: [+0.9381% +1.9293% +2.9120%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

u5_to_u8_safer_int      time:   [7.2187 µs 7.2372 µs 7.2576 µs]
                        change: [-0.8096% +0.0358% +0.8161%] (p = 0.93 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe

u5_to_u8_lut            time:   [6.7951 µs 6.8097 µs 6.8261 µs]
                        change: [-0.5757% +0.1641% +0.8236%] (p = 0.66 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  7 (7.00%) high severe

u5_to_u8_int            time:   [5.0247 µs 5.0486 µs 5.0715 µs]
                        change: [-1.4602% -0.6664% +0.0793%] (p = 0.10 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

u5_to_u8_ma             time:   [4.5759 µs 4.5909 µs 4.6094 µs]
                        change: [-0.8729% -0.0356% +0.8203%] (p = 0.94 > 0.05)
                        No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
  6 (6.00%) high mild
  8 (8.00%) high severe

u5_to_u8_ma8            time:   [4.2081 µs 4.2210 µs 4.2356 µs]
                        change: [-0.3863% +0.4854% +1.3611%] (p = 0.29 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

Credit

Thank you to @turalcar for finding a bug in the benchmark and suggesting better constants for the multiply-add method (here u5_to_u8_ma8)!
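As a rough illustration of what a multiply-add conversion can look like, the sketch below replaces the division by 31 with one multiplication, one addition, and one shift. The constants 527 and 23 with a shift of 6 are a commonly cited choice for 5-bit to 8-bit conversion; they are an assumption here and may not be the constants used by u5_to_u8_ma or u5_to_u8_ma8. The function names are hypothetical as well.

```rust
/// Multiply-add sketch: (x * 527 + 23) >> 6.
/// 527 / 64 is about 8.234, close to 255 / 31 (about 8.226), and the
/// offset 23 is chosen so that the floor matches round-to-nearest
/// for every 5-bit input. Constants are assumed, not taken from the repo.
pub fn u5_to_u8_ma_sketch(x: u8) -> u8 {
    debug_assert!(x < 32, "input must be a 5-bit value");
    // Widen to u16: the largest intermediate is 31 * 527 + 23 = 16360.
    ((x as u16 * 527 + 23) >> 6) as u8
}

/// Integer reference used to check the sketch above.
pub fn u5_to_u8_reference(x: u8) -> u8 {
    debug_assert!(x < 32, "input must be a 5-bit value");
    ((x as u16 * 255 + 15) / 31) as u8
}

/// Returns true if the multiply-add sketch agrees with the reference
/// on all 32 possible inputs.
pub fn ma_sketch_matches_reference() -> bool {
    (0u8..32).all(|x| u5_to_u8_ma_sketch(x) == u5_to_u8_reference(x))
}
```

Checking all 32 inputs against a reference is worthwhile for this kind of trick, because a slightly different multiplier or offset can be off by one for only a handful of values.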

License

All code in this repo is licensed under The Unlicense OR CC0-1.0, whichever gives you more freedom.