PHPBench integration
dantleech opened this issue · 24 comments
I would like to propose introducing PHPBench benchmarks into Hoa.
PHPBench is a benchmarking framework (micro and macro). It is structurally similar to PHPUnit and can be used in any situation where you would otherwise write a quick microtime()-based script to test something. It is somewhat similar to Java's JMH and was inspired by Athletic. It is still under development.
Some of its advantages:
- Benchmarks are located in the source repository (à la unit test files).
- It can generate themeable reports (to the console, in markdown, or in HTML)
- It can store results and allows you to compare different runs (eventually allowing you to store them in a Git branch).
- Iterations are executed in isolated processes.
I propose it here because I have seen similar micro-benchmarks showcased by @Hywan on IRC and, honestly, I want to know if PHPBench is useful and fit-for-purpose.
Benchmarks could be located in the following path:
LibraryName/Test/Benchmark/SomeBench.php
And PHPBench itself could either be installed as a require-dev dependency, or globally on the developer's machine (there will be a PHAR at some point). The benchmarks themselves have no runtime dependency on PHPBench.
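For illustration, a benchmark file under such a path might look like the following sketch. The class and method names are hypothetical; PHPBench discovers methods prefixed with `bench` and reads annotations such as `@Revs` and `@Iterations`.

```php
<?php

/**
 * Hypothetical benchmark class, e.g. stored as
 * LibraryName/Test/Benchmark/SomeBench.php. It is plain PHP with
 * no runtime dependency on PHPBench itself.
 *
 * @Revs(1000)
 * @Iterations(4)
 */
class SomeBench
{
    // PHPBench calls this method once per revolution and measures
    // time and memory across all revolutions of an iteration.
    public function benchMd5()
    {
        md5('foobar');
    }
}
```

It would then be executed with something like `phpbench run LibraryName/Test/Benchmark`.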
@Hywan expressed a wish for there to be assertions for memory and time with the implication that it could be used in the CI process. This is currently not a feature of PHPBench due to time being affected by the platform the benchmarks are running on, but CI assertions (in whatever form) are clearly something to work towards.
So, there it is. Just an idea; if you are interested, maybe @Hywan could show me the code he used to generate his benchmarks and I could make a PR.
I can easily imagine this as a generic atoum extension. See here for examples.
This would let us define new assertions working with PHPBench.
That would be interesting, but it would be quite a big departure from the current way PHPBench is used. I would imagine something like this:
public function testThis()
{
    $this->phpbench
        ->assert('main.mean < 2s')
        ->benchmark(function () {
            md5('foobar');
        })
        ->iterations(4)
        ->retryThreshold(2)
        ->revolutions(1000)
        ->etc();
}
But that means solving some problems, e.g. PHPBench launches the benchmarks in separate processes.
I was thinking more immediately about adding assertions to the annotations in the PHPBench cases:
/**
 * @Assert('mean < 2s and memory <= 200b')
 */
public function benchSomething()
{
    md5('hello world');
}
But it certainly would be a shame to not use atoum.
@dantleech Is it possible to split PHPBench into the library and the CLI? This way, we would reduce the number of dependencies (it would be more “embeddable”).
I like what @dantleech proposes: we could easily wire annotations to atoum assertions directly in PHPBench. This would be really easy to do!
@dantleech do you want me to do a POC?
I would just like to throw in an idea. memory <= 200b can be true or false depending on the PHP VM we are using; it is actually very hard to predict the verdict of this assertion. The same goes for mean < 2s. How do we deal with this?
@Hywan this is exactly why I suggested, on IRC, an assertion like the one we have for floats: isNearlyEqualTo.
We could implement an assertion that checks a value with a delta: memory <= 200b would become memory <= 220b (if we apply a 10% delta).
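A minimal sketch of that idea, using a hypothetical helper (not an existing PHPBench or atoum assertion): the raw threshold is widened by a relative delta before comparing.

```php
<?php

// Hypothetical delta-based check: accept a measured value as long
// as it stays within (1 + delta) of the reference threshold.
function isWithinDelta(float $measured, float $threshold, float $delta): bool
{
    return $measured <= $threshold * (1.0 + $delta);
}

// With a 10% delta, "memory <= 200b" effectively becomes "memory <= 220b".
var_dump(isWithinDelta(210.0, 200.0, 0.10)); // bool(true)
var_dump(isWithinDelta(230.0, 200.0, 0.10)); // bool(false)
```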
@Hywan good point, perhaps by defining an assertion per env:
Assert('php_version = 5.4', 'mem < 200b')
Assert('php_version = 7', 'mem < 100b')
So you would target assertions at specific PHP versions / OS / CPU / whatever. That is a lot of maintenance, and of course you are assuming that these tests will only be run in specific environments (imagine every developer having their own assertions for their personal laptops!).
@dantleech You have to think in terms of the VM and then the VM's version, not only the “PHP version”.
I would like more input from the @hoaproject/hoackers community please :-)!
So, what do we do with this? PHPBench as a measuring tool is awesome; that is not the question here. However, using it as a testing tool is apparently difficult because it depends heavily on the “execution engine”, i.e. the VM.
My suggestions:
- Close this issue and drop the idea of using PHPBench as a testing tool,
- Be ready to use PHPBench for some benchmarks, e.g. in Hoa\Compiler, Hoa\Ruler or Hoa\Router, or… almost all libraries could have some, but for what goals? ……
The main goal of such a tool for Hoa is to check that modifications to the code do not introduce any regressions in terms of performance either. So you make one or many reference runs before starting your patches, and then compare them with the new patches applied. This is “local testing” (the term is not scientific at all).
New suggestions:
- Maybe we should write “generic/prefilled benchmarks” in a library (let's say Hoa\Compiler),
- Based on these generic benchmarks, for the first reference run, we generate tests,
- Then we can modify our code and re-run the first generated tests to ensure no regression.
In pseudo-code, it would look like:
$ hoa test:generate-benchmarks
… benchmarks are running several times
… we compute all the results
… we round them up (+5% for instance)
… we generate performance **reference** tests
$ vi foo.php
… do your stuff
$ hoa test:run
… check the performance of the new code against the **reference** tests
What do you think?
Not sure I understand the workflow above --
- what is generated?
- why add 5% ?
I think the value is currently in the local testing - my workflow (when using PHPBench) is similar to:
$ git checkout master
$ phpbench run benchmarks --store --iterations=30 --revs=10000 # normally this is in config
$ git checkout working_branch
$ phpbench run benchmarks --store --iterations=30 --revs=10000
$ phpbench report --uuid=latest --uuid=latest-1 --report=aggregate # compare the difference
Yes, this is “local testing”. You are testing whether your new patches introduce a performance regression. So you need a “reference version”. This is what we generate. We run PHPBench several times to extract numbers, and then we generate tests saying: “The expected numbers are the following…”. Then, when you are developing, you can check the performance by running the generated tests.
Do you see?
The generated tests will never be committed.
However, as far as I can see, you have a storage and you can already compare the results of several runs. This is interesting. So we could drop the test generation, I guess, no? Do you see any added gain?
I think there could be a "workflow" gain - it sounds as if you are suggesting that the "master" branch is automatically checked out (either in the CWD or elsewhere) and the benchmarking suite is executed and then used as a reference - that could be a nice extension for PHPBench.
But I don't think the workflow is too bad now: you generate the reference after checking out the master branch and receive a UUID:
phpbench run --store benchmarks/Micro/Math/KdeBench.php --progress=dots
...
Run: 1339ffe9ce9066787b4fa8217f957ebbf8bb4656
and then you can reference that UUID in subsequent reports at any time and compare it with the "meta UUID" latest:
$ phpbench report --uuid=latest --uuid=1339ffe9ce9066787b4fa8217f957ebbf8bb4656 --report=blah
OK. So what we can do is to “wrap” PHPBench into “short” commands, like we did with hoa test:run, which is basically a wrapper around atoum (it pre-fills all the options, finds the configuration files, etc.), or with hoa devtools:cs, which is a wrapper of the same nature around php-cs-fixer (it finds the configuration files, adds our own CS rules, etc.).
Maybe:
- hoa test:performance --init, which will do the first phpbench run --store …,
- hoa test:performance, which will run phpbench run and phpbench report against the first run,
- hoa test:performance --reset, to go back to the initial state (--init implies --reset).
Bonus: hoa test:performance --loop, which waits for the user to press Enter before running and comparing with the first run.
This workflow is good when you would like to compare often against an initial run, but it does not work great if you would like “incremental” comparisons/reports. Maybe something like hoa test:performance --delta, which implies an --init + run.
Finally, a hoa test:performance --clean is necessary to clean the storage.
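To make the mapping concrete, here is a rough sketch of how such a wrapper could translate its modes into plain phpbench invocations. The command strings follow the options discussed in this thread; the function itself and the Test/Benchmark path are hypothetical (a dry run that only builds the strings).

```php
<?php

// Hypothetical translation of "hoa test:performance" modes into
// phpbench command lines. Nothing is executed here; we only
// construct the command strings the wrapper would run.
function performanceCommand(string $mode): string
{
    if ('--init' === $mode) {
        // First reference run, stored so later runs can compare against it.
        return 'phpbench run Test/Benchmark --store';
    }

    // Default mode: run, store, and report against the previous stored run.
    return 'phpbench run Test/Benchmark --store'
         . ' && phpbench report --uuid=latest --uuid=latest-1 --report=aggregate';
}

echo performanceCommand('--init'), "\n";
echo performanceCommand(''), "\n";
```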
What do you think?
(Note: this proposal removes atoum from the party.)
Looks reasonable - the only negative is that by calling PHPBench by proxy you limit its power, e.g.:
$ phpbench run benchmarks --iterations=50 --revs=1 --revs=10 --revs=100 --revs=1000 --store
In this example we can use the results to compare the memory usage for 1, 10, 100 and 1000 revolutions of the benchmark in a single process.
So I (personally) would tend to use PHPBench directly rather than through a wrapping script, but I think it would perhaps be better to discuss the benchmarks themselves; the manner in which PHPBench is executed is secondary imo.
Can I propose that I rewrite the hoa (compiler?) benchmarks in PHPBench as a POC? I could also generate a Hoa themed HTML report. Then we would have something more concrete to discuss and would be able to see if it is worth the effort :)
The hoa test:performance command will receive more options, obviously 😃.
And 👍 for your proposal. Go for it!
ping?
oops, missed your response. will try and knock something together!
ping :-)? I would be really enthusiastic to integrate PHPBench!
I love the idea!
About the reference: it is not trustworthy for performance non-regression testing; we need to run the old code and the new code at the same time to have a fair comparison.
This means the reference is always master, and for each PR we should compare PHPBench on master vs PHPBench on the branch.
That would be ideal, but it is tricky on Travis as it would mean doing two composer installs; in addition to the time penalty, the system load might change - although PHPBench does at least provide baselines which can indicate deviations there.
As mentioned on IRC a good start might be with the Ruler library.
@Pierozi @dantleech So far, I don't see this running on a CI server because it is hard to do a fair comparison. This could be interesting, but it's hard. The first step is to get PHPBench in our Hoa\Test or Hoa\Devtools box. Then, we will see how we use it. I need PHPBench because I sometimes would like to compare my PR with master, but maybe we can have other usages.
Made a small PR on Ruler: hoaproject/Ruler#96