Benchmark command buffer writes
Closed this issue · 3 comments
Would there be a benefit to treating PUB/HPUB as an Int32 or otherwise using Span.Copy from a static readonly ReadOnlySpan<byte> (which would still allow for embedding in assembly)?
Originally posted by @to11mtm in #303 (comment)
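For readers following along, the two options in the question can be sketched roughly as below. This is a minimal illustration, not the code from the gist or from this project: the class and member names (CommandConstantsSketch, PubLE, PubSpan, WritePubBinary, WritePubCopy) are hypothetical, and the 4-byte token "PUB " with a trailing space is an assumption made so the command fits in exactly one 32-bit write.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical names throughout; this is a sketch, not the project's implementation.
public static class CommandConstantsSketch
{
    // ASCII "PUB " (assumed token): 'P'=0x50, 'U'=0x55, 'B'=0x42, ' '=0x20.
    // The first byte on the wire sits in the lowest 8 bits, so a
    // little-endian write emits 'P', 'U', 'B', ' ' in that order.
    public const uint PubLE = 0x50u | (0x55u << 8) | (0x42u << 16) | (0x20u << 24);

    // For byte arrays of constants, the C# compiler backs this property
    // with data embedded in the assembly rather than allocating per call.
    public static ReadOnlySpan<byte> PubSpan => new byte[] { 0x50, 0x55, 0x42, 0x20 };

    // Option 1: treat the command as a single 32-bit integer store.
    public static int WritePubBinary(Span<byte> destination)
    {
        BinaryPrimitives.WriteUInt32LittleEndian(destination, PubLE);
        return sizeof(uint);
    }

    // Option 2: copy the bytes from the embedded span.
    public static int WritePubCopy(Span<byte> destination)
    {
        PubSpan.CopyTo(destination);
        return PubSpan.Length;
    }
}
```

Both methods should leave the same four bytes in the destination; which one is faster is exactly what the benchmark is meant to measure, so no timing claim is made here.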
FWIW I did run @caleblloyd's gist locally to check (tyvvvm for making it, apologies for not having the chance) via a single-file app calling BenchmarkRunner.Run<CommandConstantsBench>(); here are my numbers and impl for reference.
Not sure whether the difference I'm seeing is because of newer/different uArch, target OS, containerization or form of containerization, but I'm seeing a much more pronounced difference across all frameworks, with PubBinary executing in 1/4 of the time across all frameworks and NewLineBinary executing in at worst 1/2 of the time.
I think having a few more data points around the uArch/OS/containerization/???[0] balance here would be -very- useful in making an informed decision around overall gains vs readability tradeoffs; if anyone is willing to run and post results it would be appreciated!
I moved the benchmark into this project in #329 - also made a couple of fixes to the original gist (the bit shifts were wrong in the gist)
With all of these calls to Little Endian operations... I do want to know what the results of the benchmark are on a Big Endian processor. Anyone got a Big Endian system lying around?
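On the Big Endian question, a rough sketch of what an explicit little-endian write amounts to may help frame what such a benchmark would be measuring. The names below (EndiannessSketch, PubLE, WriteExplicit, WriteManual) are hypothetical, and the constant again assumes a 4-byte "PUB " token. The output bytes are the same on either architecture; the difference is that a big-endian host has to byte-swap the value before storing it.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical names; a sketch of the technique, not the project's code.
public static class EndiannessSketch
{
    // "PUB " packed so that 'P' (0x50) is the least significant byte.
    public const uint PubLE = 0x20425550;

    // The library call: byte order of the output is fixed regardless of host.
    public static void WriteExplicit(Span<byte> destination)
        => BinaryPrimitives.WriteUInt32LittleEndian(destination, PubLE);

    // Hand-rolled equivalent that makes the big-endian byte swap visible.
    public static void WriteManual(Span<byte> destination)
    {
        if (destination.Length < sizeof(uint))
            throw new ArgumentOutOfRangeException(nameof(destination));

        // On a little-endian host the value is stored as-is;
        // on a big-endian host it is reversed first.
        uint value = BitConverter.IsLittleEndian
            ? PubLE
            : BinaryPrimitives.ReverseEndianness(PubLE);

        Unsafe.WriteUnaligned(ref MemoryMarshal.GetReference(destination), value);
    }
}
```

Since PubLE is a constant, a JIT may well fold the swap away at compile time, so whether it costs anything measurable on real Big Endian hardware is still something only a benchmark run there could answer.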