hoaproject/Central

Compile API documentation examples in executable tests

Hywan opened this issue · 24 comments

Hywan commented

Hello fellow @hoaproject/hoackers and users!

Introduction

This RFC aims at including the documentation of Hoa's libraries in the test quality process by having executable and testable examples. This is a sequel/extension to RFC #58, which introduces a new API documentation format. In particular, each method will have an API documentation written in Markdown, with special Sections, like Examples. In this Section, we can have code blocks representing PHP code. The goal of this RFC is to define a process that: (i) extracts the code from the API documentation, (ii) transforms it into a unit or integration test, (iii) executes it.

The same process can be extended to RFC #51, where the Invalid examples and Valid examples Sections define the same kind of code blocks, with the same goal. The difference is that an invalid example is expected to fail, but the process is the same.

Extracting the code blocks

Based on the tools defined in the RFC #53, we will be able to:

  • Open a directory,
  • Iterate over PHP files,
  • Open each file,
  • Iterate over classes, chained with attributes and methods,
  • For each doc comment (/** … */), if the “Examples” Section is present, iterate over code blocks.

Now, a code block will have the following form:

\```[pl]?
c
\```

where c (body of the code block) is PHP code, and pl (programming language type of the code block, also called code block type) is php or none. If none, php will be the default.
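
As a rough sketch of this form, assuming a plain regular expression is good enough for illustration (a real implementation would more likely rely on a Markdown parser), the hypothetical `extractCodeBlocks` function below collects each body `c` with its type `pl`, and defaults the type to `php`:

```php
<?php

/**
 * Hypothetical helper (not part of Hoa): returns each fenced code block
 * found in $markdown as ['type' => pl, 'body' => c].
 */
function extractCodeBlocks(string $markdown): array
{
    // Opening fence with an optional type, lazy body, closing fence.
    $pattern = '/^```([^\n]*)\n(.*?)^```[ \t]*$/ms';
    preg_match_all($pattern, $markdown, $matches, PREG_SET_ORDER);

    $blocks = [];

    foreach ($matches as $match) {
        $type = trim($match[1]);

        $blocks[] = [
            // No type means `php`.
            'type' => '' === $type ? 'php' : $type,
            'body' => rtrim($match[2], "\n"),
        ];
    }

    return $blocks;
}
```

The function is given the text of one Examples Section, and each returned entry is a candidate for one test case.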

Compiling to tests

The body of the code block will not use any test framework API, so no atoum or Hoa\Test API in our case. Why? Because this is documentation. Documentation has nothing to do with an API from our test framework. However, the documentation can contain assert intrinsics (also called Expectations). Example:

$x = 1;
$y = 2;
$z = new Sum();

assert(3 === $z($x, $y));

This example illustrates the usage of Sum, and it also shows its result. The assert is here to illustrate the form of the results that Sum can produce. Writing an example this way keeps it readable as documentation while making it checkable. I think this is adequate for this kind of documentation.

Since PHP 7.0, assert can take a second argument, called $exception, which is an instance of an exception of kind AssertionError. This exception will be thrown if the assertion fails. We can change the code to automatically add an exception as a second argument.

If we take our previous example, the code block will be compiled into:

public function case_methodName_example_1()
{
    $this
        ->when(function () {
            $x = 1;
            $y = 2;
            $z = new Sum();

            assert(3 === $z($x, $y), new AssertionError());
        });
}

Because the when pseudo-asserter has no specific meaning, we can create a new pseudo-asserter in Hoa\Test like: do, as an alias to when. Thus:

public function case_methodName_example_1()
{
    $this
        ->do(function () {
            $x = 1;
            $y = 2;
            $z = new Sum();

            assert(3 === $z($x, $y), new AssertionError());
        });
}

If the test is failing, atoum will catch that an unexpected exception has been thrown, and the test will fail.

This test case will belong to a specific test suite. The structure will be like this:

Xyz/
    Foo.php
    Test/
        Unit/
        Integration/
        Documentation/
            Foo.php

The Test/Documentation/Foo.php file will contain a test suite defined as:

namespace Hoa\Xyz\Test\Documentation;

use Hoa\Test;

class Foo extends Test\Documentation\Suite
{
    public function case_methodName_example_1()
    {
        // …
    }
}

where methodName is the name of the method whose API documentation is being tested.
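
A minimal sketch of this compilation step, assuming it is nothing more than string templating: the hypothetical `compileToTestCase` function wraps the body of a code block into the source of a test case, using the `do` pseudo-asserter proposed above. The function name and its signature are assumptions made for illustration only.

```php
<?php

/**
 * Hypothetical helper (not part of Hoa\Test): wraps the body of a code block
 * into the source of a method named `case_<methodName>_example_<index>`.
 */
function compileToTestCase(string $methodName, int $index, string $body): string
{
    // Indent every non-empty line of the body by 16 spaces.
    $indented = preg_replace('/^(?=.)/m', str_repeat(' ', 16), $body);

    return implode("\n", [
        '    public function case_' . $methodName . '_example_' . $index . '()',
        '    {',
        '        $this',
        '            ->do(function () {',
        $indented,
        '            });',
        '    }',
    ]) . "\n";
}
```

Calling `compileToTestCase('methodName', 1, $body)` for each code block, and concatenating the results inside the class declaration, gives the content of the test suite.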

Because the API documentation examples can be either a unit test or an integration test, a special Documentation namespace is created. It targets all the documentation tests.

Catching expected fail tests

It is possible that an example shows that an exception must be thrown. In this case, the test will fail while the example is valid. To avoid that, the following code block type must be used:

\```php,must_throw
c
\```

In this situation, the test case will compile to:

public function case_methodName_example_1()
{
    $this
        ->exception(function () {
            $x = 1;
            $y = 2;
            $z = new Sum();

            assert(3 === $z($x, $y), new AssertionError());
        })
            ->isInstanceOf(AssertionError::class);
}
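
Independently of atoum, the expected behavior of a `must_throw` example can be sketched in plain PHP. The `mustThrow` function below is hypothetical and only illustrates the rule: the example is valid if, and only if, running it throws.

```php
<?php

/**
 * Hypothetical runner (not the Hoa\Test implementation): returns true only
 * if $example throws an instance of $expected.
 */
function mustThrow(callable $example, string $expected = Throwable::class): bool
{
    try {
        $example();
    } catch (Throwable $thrown) {
        return $thrown instanceof $expected;
    }

    // Nothing was thrown: a `must_throw` example is then considered failing.
    return false;
}
```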

Ignoring examples

To ignore an example, use the following code block type:

\```php,ignore
c
\```
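
The code block types seen so far (none, `php`, `php,must_throw`, `php,ignore`) could be decoded as follows. This is only a sketch: the function name and the shape of the returned array are assumptions.

```php
<?php

/**
 * Hypothetical helper: splits a code block type such as `php,must_throw`
 * into a language and flags.
 */
function parseCodeBlockType(string $type): array
{
    // Split on commas, trim each part, and drop empty parts.
    $parts = array_values(
        array_filter(
            array_map('trim', explode(',', $type)),
            function (string $part): bool {
                return '' !== $part;
            }
        )
    );

    // The first part is the language; `php` is the default.
    $language = array_shift($parts) ?? 'php';

    return [
        'language'   => $language,
        'ignore'     => in_array('ignore', $parts, true),
        'must_throw' => in_array('must_throw', $parts, true),
    ];
}
```

A generator would then skip every code block whose `ignore` flag is set, or whose language is not `php`.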

Run tests

So far, tests are executed with the hoa test:run command. We also have hoa test:generate and hoa test:clean. The latter two target Praspel, but we can re-use them to generate the documentation tests. We could select what kind of tests we would like to generate with an option, like hoa test:generate --documentation, or hoa test:generate --praspel for instance.

By default, hoa test:run will run hoa test:generate --documentation if no Documentation directory exists. The goal is to not modify the .travis.yml file to include these new tests.

In the contributor guide, we must stipulate that hoa test:generate --documentation must be run before hoa test:run when iterating on the code (edit, test, edit, test, edit, test…). Maybe we could introduce a cache invalidation system to re-generate only the specific documentation test suite. This should not be too complicated.
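
One possible shape for such a cache invalidation, assuming that comparing modification times is precise enough: a documentation test suite is re-generated when it does not exist yet, or when its source file is more recent. The `mustRegenerate` function is hypothetical.

```php
<?php

/**
 * Hypothetical check: tells whether the documentation test suite stored in
 * $generatedFile is missing or older than $sourceFile.
 */
function mustRegenerate(string $sourceFile, string $generatedFile): bool
{
    if (false === file_exists($generatedFile)) {
        return true;
    }

    return filemtime($sourceFile) > filemtime($generatedFile);
}
```

With the structure described earlier, `$sourceFile` would be `Xyz/Foo.php` and `$generatedFile` would be `Xyz/Test/Documentation/Foo.php`.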

Visually, there will be no difference between unit, integration or documentation test execution in the CLI report. Only the test suite namespaces will provide this information, like Hoa\Xyz\Test\Unit\Foo\Bar or Hoa\Xyz\Test\Documentation\Foo\Bar. Room for improvements in the CLI report to add a separator between “test namespaces” (between Unit, Integration etc.)?

Conclusion

I think with this approach we will be able to automatically test the API documentation examples. This will be a good win. With RFC #53 in mind, we will ensure that all examples will always be valid. This will improve the whole quality of the project. The test workflow will not be disturbed since hoa test:run will still control everything. The contributor workflow might change a little bit, but the impact is minimal compared to the guarantees it provides.

Thoughts?

1e1 commented

👍
It reduces the global code size.
But we are adding the Test file into the Class file. It could flood the code?
I notice the tests generated from the documentation do not have the same namespace as the written tests.

Hywan commented

@1e1 How can it reduce the global code size? What can flood the code, I don't get it? Also, the namespace for tests is (for a library called Hoa\Xyz): Hoa\Xyz\Test. This is the approach we have been using since the beginning and it works great.

Seems really interesting, also adding examples in the source code documentation helps developers better understand the code behaviour...

@1e1 since it's "only" for the documentation tests you are not adding the tests in the class, only ensure examples quality 😄.

@Hywan I just saw one issue around documentation test obsolescence. Do you think these tests must be added to Git? Will they be updated automatically by the CI, or will the developer need to update the documentation tests?

Since these are generated tests, I think we don't need to store them anywhere because we will always be able to regenerate them...

Hywan commented

@shulard Good note. We should add Test/Documentation/ to .gitignore. This is generated.

1e1 commented

@shulard Thx
OK, in this case, should the example be extracted from the tests, instead of the doc comment?

Hywan commented

@1e1 You have the API documentation which contains an Examples Section. In this section, we parse the code blocks, and we compile them to tests. That's the workflow.

Hywan commented

Full example:

<?php

namespace Hoa\Xyz;

class Foo
{
    /**
     * This is an API documentation for the `f` method.
     *
     * # Examples
     *
     * This example creates 2 variables, namely `$x` and `$y`, and sums them
     * with the help of the `Foo::f` method.
     *
     * ```php
     * # use Hoa\Xyz\Foo;
     * $x   = 1;
     * $y   = 2;
     * $foo = new Foo();
     *
     * assert(3 === $foo->f($x, $y));
     * ```
     */
    public function f(int $x, int $y): int
    {
        // …
    }
}

Resulting test suite for the class Foo:

namespace Hoa\Xyz\Test\Documentation;

use Hoa\Test;

class Foo extends Test\Documentation\Suite
{
    public function case_f_example_1()
    {
        $this
            ->do(function () {
                $x   = 1;
                $y   = 2;
                $foo = new \Hoa\Xyz\Foo();

                assert(3 === $foo->f($x, $y), new \AssertionError());
            });
    }
}

Also, the result in the API documentation browser:

This is an API documentation for the f method.

Examples

This example creates 2 variables, namely $x and $y, and sums them with the help of the Foo::f method.

$x   = 1;
$y   = 2;
$foo = new Foo();

assert(3 === $foo->f($x, $y));

Isn't it clearer?

1e1 commented

I agree with the literal lines. Why not extract examples from the test suite?
The generated example seems different from the usual PHP example: http://php.net/manual/en/function.array-merge.php#refsect1-function.array-merge-examples
It ends with a print_r or a var_dump, not an assert.

Hywan commented

@1e1 What are the literal lines? Why would we extract examples from the test suite? What test suite? The assert is here to validate a result/a datum, not to print it. Take a look at assert.

This idea of "documentation as tests" looks a lot like what I had in mind when I started working on Rusty.

If you decide to really implement this RFC, it might be worth considering using Rusty (I'd be willing to help, of course :) ).

Hywan commented

@K-Phoen Yup, I know this project. What is a blocker for me:

  • Not a library (so you have console dependencies & co.),
  • Does not support all the features we want,
  • Does not integrate with atoum.

Do you want to address these points?

Integrating Rusty with atoum/phpunit is something that I also wanted. Splitting the project in two and providing both a CLI application and a library could be done too.
So yeah, if you think that Rusty can be relevant for your use case I'll address these points.

Hywan commented

I am also pretty sure we can do something much simpler. I will try soon, and compare my POC with Rusty.

1e1 commented

@Hywan eg the API generator cannot guess the sentences ;)

I don't understand. If there are some tests (in ~/Test/Unit), why write other code in the doc comment?
Should the API generator read the related test suite and extract one example (or all of them)?

Moreover if the doc comment contains a Praspel instruction, this one will appear in the final documentation.

I am missing something. I guess some issues have the same goal but the steps are already defined (like: #53).

Hywan commented

@1e1 The goal of this RFC is to compile examples into tests. That's all. Unit tests are not examples, they form an executable informal specification. An example illustrates a particular usage of a method, or a datum, relevant to understand its global usage or an edge case. So the direction is Examples to Tests, not the opposite.

If the API documentation contains a contract written in Praspel, this is not related to this RFC at all. We are talking about the Examples Section, not the Praspel/Contracts Section.

RFC #53 has nothing to do with this RFC either. The common basis between #52, #53, and #58 is the new API documentation format. This new format allows many features, like the ones described in all the RFCs.

Is it clear :-)?

Hywan commented

Hello,

So this Gist https://gist.github.com/Hywan/b8bd387def5e3cc13e024c4f924e8c3c makes everything work. It just does not save the result in specific files, but here is what it does so far:

  • Scan all PHP files,
  • Include all of them,
  • By using introspection (reflection), we scan all methods, and parse API documentations,
  • For each API documentation, we scan for the Examples Section, we collect all code blocks,
  • Each code block is compiled into test cases,
  • The final test suite is constituted.
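
The introspection steps above can be sketched with the Reflection API. This is not the code of the Gist: the `Calculator` class and the `collectExamplesSections` function are hypothetical, and the Section detection is a simplification (an example line starting with `# ` would be mistaken for a new Section).

```php
<?php

class Calculator
{
    /**
     * Sums two integers.
     *
     * # Examples
     *
     * ```php
     * assert(3 === (new Calculator())->sum(1, 2));
     * ```
     *
     * # Exceptions
     *
     * None.
     */
    public function sum(int $x, int $y): int
    {
        return $x + $y;
    }

    public function noDoc()
    {
    }
}

/**
 * Hypothetical helper: maps each documented method of $class to the raw
 * content of its `# Examples` Section. Methods without one are skipped.
 */
function collectExamplesSections(string $class): array
{
    $sections = [];

    foreach ((new ReflectionClass($class))->getMethods() as $method) {
        $comment = $method->getDocComment();

        if (false === $comment) {
            continue;
        }

        // Remove `/**`, `*/`, and the leading ` * ` decoration of each line.
        $text = preg_replace('#^\s*/\*\*|\*/\s*$#', '', $comment);
        $text = preg_replace('#^[ \t]*\*[ \t]?#m', '', $text);

        // Keep what lies between `# Examples` and the next Section, if any.
        if (1 === preg_match('/^# Examples[ \t]*\n(.*?)(?=^# |\z)/ms', $text, $match)) {
            $sections[$method->getName()] = trim($match[1]);
        }
    }

    return $sections;
}
```

Each returned Section still contains its fenced code blocks, which are then collected and compiled into test cases.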

Result:

<?php

namespace Hoa\Acl\Test\Integration;

use Hoa\Test;

class A extends Test\Integration\Suite
{
    public function case_sum_example_0()
    {
        $this
            ->do(function () {
                $x = 1;
                $y = 2;

                assert(3 === $x + $y);
            });

    }

    public function case_sum_example_1()
    {
        $this
            ->do(function () {
                $x = 1;
                $y = 2;
                $a = new \Hoa\Acl\A();

                assert(3 === $a->sum($x, $y));
            });

    }
}

From this:

<?php

namespace Hoa\Acl;

class A
{
    /**
     * The `sum` method will compute the sum of two integers, `$x` and `$y`.
     *
     * # A section
     *
     * Bla bla bla
     *
     * # Examples
     *
     * This first example shows a regular sum with the `+` operator. Looser.
     *
     * ```
     * $x = 1;
     * $y = 2;
     *
     * assert(3 === $x + $y);
     * ```
     *
     * This example shows how a real programmer will use the `sum` method.
     *
     * ```php
     * $x = 1;
     * $y = 2;
     * $a = new A();
     *
     * assert(3 === $a->sum($x, $y));
     * ```
     *
     * # Exceptions
     *
     * Nothing special. Baboum.
     */
    public function sum(int $x, int $y): int
    {
        return $x + $y;
    }

    public function noDoc()
    {
    }

    /**
     * This method has no example.
     *
     * # A section
     *
     * Bla bla bla
     */
    public function noExample()
    {
    }
}

This patch hoaproject/Test#87 introduces the do asserter. It also sets the assertion behavior. An exception is automatically thrown if an assertion fails, so no need to instrument the code.

What is not supported yet:

  • Filter by code block type (none, php, php,ignore, php,must_throw),
  • Save to files,
  • Integrate to hoa test:generate,
  • Integrate to hoa test:run.

My opinion: A simple class can do the trick (< 200 LOC). This is a very simple compilation/transformation step, no need to have multiple classes & co. The only dependency we add to hoa/test is league/commonmark, which is a standalone library, so it's great too.

Hywan commented

I got some free time tonight, so I made some progress on hoaproject/Test#87: The implementation for this RFC.

  1. $ vendor/bin/hoa test:generate -d ../Acl -n Hoa.Acl to generate the test suites,
  2. $ vendor/bin/hoa test:run -d ../Acl/Test/Documentation to run the test suites.

Considering the same code sample above, the produced test suite is the following:

<?php

namespace Hoa\Acl\Test\Documentation;

use Hoa\Test;

class A extends Test\Integration\Suite
{
    public function case_sum_example_0()
    {
        $this
            ->assert(function () {
                $x = 1;
                $y = 2;

                assert(3 === $x + $y);
            });

    }

    public function case_sum_example_1()
    {
        $this
            ->assert(function () {
                $x = 1;
                $y = 2;
                $a = new \Hoa\Acl\A();

                assert(3 === $a->sum($x, $y));
            });

    }
}

It is saved in the Hoa/Acl/Test/Documentation/A.php file.

Important things to notice:

  • We are using the assert atoum asserter. It is implemented in Hoa\Test. There is already a naive assert asserter in atoum but we are overriding it (is it a good idea?),
  • The test suite extends Hoa\Test\Integration\Suite. Having a Hoa\Test\Documentation\Suite is technically harder because the Documentation directory already exists in Hoa/Test/Documentation. Anyway, that's not a big deal, and it also makes sense.

Output if everything is working great:

Suite Hoa\Acl\Test\Documentation\A...
[SS__________________________________________________________][2/2]
~> Duration: 0.000325 second.
~> Memory usage: 0.000 Kb.

> Total test duration: 0.00 second.
> Total test memory usage: 0.00 Mb.
> Running duration: 0.08 second.

Success (1 test suite, 2/2 test cases, 0 void test case, 0 skipped test case, 2 assertions)!

Output when at least one test case fails, here we replaced 3 by 4:

- assert(3 === $a->sum($x, $y));
+ assert(4 === $a->sum($x, $y));

Suite Hoa\Acl\Test\Documentation\A...
[SF__________________________________________________________][2/2]
~> Duration: 0.000275 second.
~> Memory usage: 0.000 Kb.

> Total test duration: 0.00 second.
> Total test memory usage: 0.00 Mb.
> Running duration: 0.07 second.

Failure (1 test suite, 2/2 test cases, 0 void test case, 0 skipped test case, 0 uncompleted test case, 1 failure, 0 error, 0 exception)!

There is 1 failure:
~> Hoa\Acl\Test\Documentation\A::case_sum_example_1():
In file /Users/hywan/Development/Hoa/Project/Central/Hoa/Acl/Test/Documentation/A.php on line 30, Hoa\Test\Asserter\Assert() failed: The assertion `assert(4 === $a->sum($x, $y))` has failed.

Note the:

The assertion assert(4 === $a->sum($x, $y)) has failed.

This is the information we need. No diff, no exception stack trace, just the failing assert.

Hywan commented

Next steps:

  • Auto-run hoa test:generate from hoa test:run to remove one step,
  • Filter by code block type (none, php, php,ignore, php,must_throw).

Hywan commented

Progression:

must_throw expects an exception to be thrown. The kind of the exception cannot be set, so I would propose something like must_throw(My\Exception) to expect an exception of kind My\Exception only, not another exception. Should we discuss this issue in another RFC, or in this one? Thoughts?
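
To make the proposal concrete, here is a sketch of how a `must_throw(My\Exception)` type could be read. The `expectedException` function and its regular expression are assumptions, not the implemented syntax.

```php
<?php

/**
 * Hypothetical helper: returns the exception class expected by a code block
 * type, or null if the code block is not expected to throw.
 */
function expectedException(string $type): ?string
{
    $pattern = '/(?:^|,)\s*must_throw(?:\(([^)]+)\))?\s*(?:,|$)/';

    if (1 !== preg_match($pattern, $type, $match)) {
        return null;
    }

    // Without an explicit class, any exception is accepted.
    return isset($match[1]) ? ltrim(trim($match[1]), '\\') : Throwable::class;
}
```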

Next steps:

  • Support # in code (# will hide a line for the documentation API, but it will be uncommented when running tests),
  • Add use support, to not write: $a = new \Hoa\Foo\Bar();, but just:
# use Hoa\Foo\Bar;
$a = new Bar();

I don't know if it is hard. I don't want to use a lexer, nor a parser, for PHP. I would like to keep it simple, let's see.
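
The `#` convention alone needs no lexer. As a sketch, the two hypothetical functions below derive the displayed version and the tested version of a code block with regular expressions. They do not handle `use` statements, which cannot simply be uncommented inside a method.

```php
<?php

/**
 * Hypothetical helper: version of a code block for the API documentation,
 * where lines starting with `#` are removed.
 */
function forDisplay(string $code): string
{
    return preg_replace('/^[ \t]*#[^\n]*(?:\n|\z)/m', '', $code);
}

/**
 * Hypothetical helper: version of a code block for the test case, where the
 * leading `#` is removed so that the line is executed.
 */
function forTest(string $code): string
{
    return preg_replace('/^([ \t]*)#[ \t]?/m', '$1', $code);
}
```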

The ability to define an exception type is a must-have, love the syntax!

Hywan commented

# and use are supported 🎉, see hoaproject/Test@c3ba4c6. Here is the commit message for the record:

Code block can have comments. If the comment starts with # (shell
style), then the whole comment is removed for the HTML API browser, but
they are kept when compiling examples into test cases. So the following
example:

 # $a = 1;
 $b = 2;
 assert(3 === $a + $b);

will be compiled as the following test case:

 $a = 1;
 $b = 2;
 assert(3 === $a + $b);

and will be displayed as follows for the HTML version:

 $b = 2;
 assert(3 === $a + $b);

This is useful when we would like to add use statements in
comments. In our context, we cannot use use because test cases are
methods, and the syntax of PHP does not allow use statements inside
methods. So we must expand use statements to remove them.

To address that, the new unfoldCode method lexes and expands use
statements. For instance:

# use Foo\Bar;
new Bar\Baz();

will be unfolded as:

# use Foo\Bar;
new \Foo\Bar\Baz();

To achieve this, first, the comments are removed, and, second, <?php
is prepended:

<?php
use Foo\Bar;
new Bar\Baz();

Then, third, we use the token_get_all native lexer to lex the code
block, and rewrite it. Finally, when rewriting, the use statements are
put in comments (of kinds #), even if they were not in comments
before. The <?php opening tag is removed too. The result is:

# use Foo\Bar;
new \Foo\Bar\Baz();

Hywan commented

Done. This RFC is implemented!

hoaproject/Test#87

Hywan commented

Congrats everyone ❤️!