wmde/WikibaseDataModel

RFC: filter support for collections

Closed this issue · 7 comments

We could implement filter support for our collections. I got this idea by looking at Ardent.

This means adding a filter method to StatementList and other collections where we find need for it.

You can then do things such as these without needing to write a foreach:

$referencedStatements = $statements->filter( function( Statement $statement ) {
    return !$statement->getReferences->isEmpty();
} );

This approach is more OOP / high level than the more imperative foreach.

Implementation is simple:

public function filter( callable $filter ) {
        $statementList = new self();

        foreach ( $this->statements as $statement ) {
            if ( call_user_func( $filter, $statement ) ) {
                $statementList->addStatement( $statement );
            }
        }

        return $statementList;
    }

Example usecase for this in Wikibase.git is explained in this commit message: https://gerrit.wikimedia.org/r/#/c/219041/

Hm. Not sure. My first impression is that this feels like something you would do in JavaScript/jQuery.

The suggested method is so simple, there is not much to dislike. ;-)

But I think it will not be used much. Most of the time we will avoid using it to avoid iterating the resulting list again in an other loop.

I checked possible use cases and found:

  • In one case the filter would be a lovely return $statement->getMainSnak() instanceof PropertyValueSnak;, but the need for constructing a new StatementList that must then be iterated again wastes time and memory and does not make the code much more readable. All the new method saves is a simple if ( !( $snak instanceof PropertyValueSnak ) ) { continue; }, replaced with the filter call.

That's all I could find in addition to the mentioned GetClaims::claimMatchesFilters.

There is quite some code that deals with "best" statements, but the proposed filter can't do this anyway.

So my conclusion is that we do not really need this in our code base. Otherwise, it's a nice addition that can't hurt.

In ES6:

var filtered = statements.filter( statement => !statement.getReferences.isEmpty() );

I agree with Thiemo that it would indeed be nice to have this in some rare usecases but in most cases the proposed mechanism isn't powerful enough so that we have to iterate twice. However, I don't think that anonymous functions and callbacks are something only useful in javascript. Most highlevel programming languages (Java, C# etc.) support lambdas nowadays which is exactly what Jeroen proposes here.

I like the idea of having this, but I would prefer the type of filter being something like StatementFilter instead of callable.

Like this?

interface StatementFilter {
    /**
     * @return boolean
     */
    public function keepStatement( Statement $statement );
}

Advantages I see:

  • The signature is more explicit and enforced more
  • This makes the filters usable in other code by default

Disadvantages:

  • For simple one line filters such as the example provided in the OP, being forced to create a class is rather cumbersome. OTOH if a callable is used, you can still provide a handle to a method of such a class.

Any other arguments either way? I've used both in the past and am fine with both.

Yes, I think this way it is easier or safer to use and worth the overhead ...

Here you go #525