SearchParser converts a freeform query into an intermediate object that can then be translated into queries for many backends (SQL, Elasticsearch, etc.). It includes translators for SQL (using PDO) and the Laravel Eloquent ORM. It supports a faceted search language of the kind commonly found on many sites across the web.
For example, consider the following query:
from:foo@example.com "bar baz" !meef date:2018/01/01-2018/08/01 #hashtag
Using SearchParser, it is tokenized into a SearchQuery object containing a series of SearchQueryComponents that represent each logical component of the search query:
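// The example query from above.
$query = 'from:foo@example.com "bar baz" !meef date:2018/01/01-2018/08/01 #hashtag';
$q = new \peckrob\SearchParser\SearchParser();
$x = $q->parse($query);
print_r($x);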
Returns:
peckrob\SearchParser\SearchQuery Object
(
[position:peckrob\SearchParser\SearchQuery:private] => 0
[data:protected] => Array
(
[0] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => field
[field] => from
[value] => foo@example.com
[firstRangeValue] =>
[secondRangeValue] =>
[negate] =>
)
[1] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => text
[field] =>
[value] => bar baz
[firstRangeValue] =>
[secondRangeValue] =>
[negate] =>
)
[2] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => text
[field] =>
[value] => meef
[firstRangeValue] =>
[secondRangeValue] =>
[negate] => 1
)
[3] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => range
[field] => date
[value] =>
[firstRangeValue] => 2018/01/01
[secondRangeValue] => 2018/08/01
[negate] =>
)
[4] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => text
[field] =>
[value] => #hashtag
[firstRangeValue] =>
[secondRangeValue] =>
[negate] =>
)
)
)
Install using composer:
composer require peckrob/search-parser
SearchParser has no external dependencies. It has only been tested on PHP 7.2+. It may work on PHP 5, but you should not be using PHP 5. :)
To parse a string into component tokens, create a SearchParser instance and call parse() on it.
$q = new \peckrob\SearchParser\SearchParser();
$x = $q->parse($query);
This will return a SearchQuery object that contains a series of SearchQueryComponents. The SearchQuery object is iterable, so you can loop over it with a foreach loop.
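As a rough illustration of looping over the result, the sketch below walks any iterable of components and builds a one-line summary of each. The describeComponents() helper and the stdClass stand-ins are assumptions made so the snippet runs without the package installed; the property names are taken from the print_r() dump above. With the package installed, you would pass the object returned by parse() instead of the stand-in array.

```php
<?php
// Hypothetical helper (not part of SearchParser): summarises each component.
function describeComponents(iterable $query): array
{
    $lines = [];
    foreach ($query as $component) {
        $prefix = $component->negate ? 'NOT ' : '';
        if ($component->type === 'range') {
            $lines[] = sprintf(
                '%s%s from %s to %s',
                $prefix,
                $component->field,
                $component->firstRangeValue,
                $component->secondRangeValue
            );
        } elseif (!empty($component->field)) {
            $lines[] = sprintf('%s%s = %s', $prefix, $component->field, $component->value);
        } else {
            $lines[] = sprintf('%s%s "%s"', $prefix, $component->type, $component->value);
        }
    }
    return $lines;
}

// Stand-in components mirroring the properties shown in the dump above.
$components = [
    (object) ['type' => 'field', 'field' => 'from', 'value' => 'foo@example.com',
              'firstRangeValue' => null, 'secondRangeValue' => null, 'negate' => false],
    (object) ['type' => 'text', 'field' => null, 'value' => 'meef',
              'firstRangeValue' => null, 'secondRangeValue' => null, 'negate' => true],
    (object) ['type' => 'range', 'field' => 'date', 'value' => null,
              'firstRangeValue' => '2018/01/01', 'secondRangeValue' => '2018/08/01', 'negate' => false],
];

$summary = describeComponents($components);
echo implode(PHP_EOL, $summary), PHP_EOL;
```

Because the helper only type-hints iterable, it does not care whether it receives a plain array or an iterable object.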
The built-in parser handles the string above and supports a good baseline of functionality. If you need to parse additional syntax, you can extend it: create a class that implements the \peckrob\SearchParser\Parsers\Parser interface and its parsePart() method, which returns a SearchQueryComponent object. That component is added to the SearchQuery object generated by the parser before it is returned.
Then, just add the custom parser to SearchParser by calling addParser().
$custom = new \peckrob\SearchParser\Parsers\Hashtag();
$q = new \peckrob\SearchParser\SearchParser();
$q->addParser($custom);
$q->parse($query);
Returns:
...
[4] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => hashtag
[field] =>
[value] => hashtag
[firstRangeValue] =>
[secondRangeValue] =>
[negate] =>
)
You can look at the Hashtag parser in Parsers for an example. Of course, you will need to provide a matching custom transform to handle your new custom component type (see below). Note that parsers do not "fall through": once a parser handles a part, no other parser sees that part, and parsing moves on to the next part.
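The "no fall through" rule can be pictured as a first-match-wins loop. The sketch below is not the library's parser: the closures, their return shape (an array, or null to decline a part), and the parseParts() function are all assumptions for illustration, since the real parsePart() signature is not shown in this README.

```php
<?php
// Hypothetical parsers: each returns a component-like array, or null to decline.
$hashtagParser = function (string $part): ?array {
    if (preg_match('/^#(\w+)$/', $part, $matches) === 1) {
        return ['type' => 'hashtag', 'value' => $matches[1]];
    }
    return null;
};

// Catch-all parser: accepts anything as plain text.
$textParser = function (string $part): ?array {
    return ['type' => 'text', 'value' => $part];
};

// Hypothetical dispatcher: the first parser that handles a part wins.
function parseParts(array $parts, array $parsers): array
{
    $components = [];
    foreach ($parts as $part) {
        foreach ($parsers as $parser) {
            $component = $parser($part);
            if ($component !== null) {
                $components[] = $component;
                break; // handled: later parsers never see this part
            }
        }
    }
    return $components;
}

$result = parseParts(['#hashtag', 'meef'], [$hashtagParser, $textParser]);
print_r($result);
```

In this sketch the order of the parser list matters: the hashtag parser has to come before the catch-all text parser, otherwise it would never get a chance to run.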
Included in the package are a couple of example transforms. These take the SearchQuery output from parse() and transform it into a format suitable for querying a backend. Included are a SQL backend and an Eloquent backend suitable for directly querying a Laravel Eloquent model object.
To use a transform, create an instance of a Transformer, passing in an optional default field and context object depending on the transformer.
$pdo = new PDO("sqlite:/tmp/foo.sql");
$transform = new \peckrob\SearchParser\Transforms\SQL\SQL("default_field", $pdo);
$where = $transform->transform($x);
Returns:
`from` = 'foo@example.com' and `default_field` = 'bar baz' and `default_field` != 'meef' and (`date` between '2018/01/01' and '2018/08/01') and `default_field` = '#hashtag'
SearchParser natively supports Laravel/Lumen Eloquent ORM queries. You can use the Eloquent transform.
$user = User::take(100);
$transform = new \peckrob\SearchParser\Transforms\Eloquent\Eloquent("default_field", $user);
$user = $transform->transform($x);
This returns the $user query object with all the where() clauses applied, ready to be executed.
$users = $user->get();
Both native transforms support looseMode, which treats every text query as a like query. If you have defined custom parsers (above) but no matching custom transforms (below), custom SearchQueryComponent types are treated as text.
$pdo = new PDO("sqlite:/tmp/foo.sql");
$transform = new \peckrob\SearchParser\Transforms\SQL\SQL("default_field", $pdo);
$transform->looseMode = true;
$where = $transform->transform($x);
Returns:
`from` = 'foo@example.com' and `default_field` like '%bar baz%' and `default_field` not like '%meef%' and (`date` between '2018/01/01' and '2018/08/01') and `default_field` like '%#hashtag%'
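The difference between the two modes can be pictured with a small stand-alone function. This is not the library's implementation, only a sketch that reproduces the clauses shown above for text components; the textClause() name and the naive quoting are assumptions.

```php
<?php
// Hypothetical helper: builds the clause for one text component.
function textClause(string $field, string $value, bool $negate, bool $loose): string
{
    // Naive quoting for illustration only; real code should rely on
    // PDO::quote() or bound parameters.
    $escaped = str_replace("'", "''", $value);

    if ($loose) {
        // Loose mode: substring match with LIKE.
        $operator = $negate ? 'not like' : 'like';
        return sprintf("`%s` %s '%%%s%%'", $field, $operator, $escaped);
    }

    // Strict mode: exact comparison.
    $operator = $negate ? '!=' : '=';
    return sprintf("`%s` %s '%s'", $field, $operator, $escaped);
}

$strict = textClause('default_field', 'meef', true, false);
$loose  = textClause('default_field', 'bar baz', false, true);
echo $strict, PHP_EOL, $loose, PHP_EOL;
```

Note that this sketch does not escape LIKE wildcards (% and _) inside the value, which a production implementation would probably want to consider.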
In general, you are free to transform the data however you like, and you do not need to use any of the built-in transforms. However, the built-in transforms also support custom component transforms, which they call before running their own transforms. If you do not define a custom transform, custom component types are treated as text by the standard transformer.
To create your own Transform, implement the \peckrob\SearchParser\Transforms\TransformsComponents interface and implement the transformPart() method. See the Hashtag transformer for an example.
$pdo = new PDO("sqlite:/tmp/foo.sql");
$transform = new \peckrob\SearchParser\Transforms\SQL\SQL("default_field", $pdo);
$transform->addComponentTransform(new \peckrob\SearchParser\Transforms\SQL\Hashtag("default_field", $pdo));
$where = $transform->transform($x);
Returns:
`from` = 'foo@example.com' and `default_field` = 'bar baz' and `default_field` != 'meef' and (`date` between '2018/01/01' and '2018/08/01') and hashtag = 'hashtag'
The SQL transform escapes data passed as values (that is why you pass a PDO object as the context), but it does not escape field names. The Eloquent transform very likely behaves the same way under the hood.
The suggested approach is to filter the fields based on a whitelist and throw out things that aren't valid. Don't just pass the SearchQuery directly back to the SQL transform without filtering the fields.
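To see why the field whitelist matters, consider the minimal sketch below. It is not the library's code: the buildCondition() helper and the $allowedFields list are hypothetical, and an in-memory SQLite connection (which requires the pdo_sqlite extension) stands in for a real database. PDO::quote() protects the value, but nothing protects the field name, so unknown fields are rejected outright.

```php
<?php
// In-memory SQLite connection, used here only for its quoting rules.
$pdo = new PDO('sqlite::memory:');

// Hypothetical whitelist of fields users may search on.
$allowedFields = ['from', 'date'];

// Hypothetical helper: returns a SQL condition, or null for unknown fields.
function buildCondition(PDO $pdo, array $allowedFields, string $field, string $value): ?string
{
    // Field names cannot be escaped with PDO::quote(), so reject anything
    // that is not explicitly whitelisted.
    if (!in_array($field, $allowedFields, true)) {
        return null;
    }

    // Values are safe to embed once quoted by the driver.
    return sprintf('`%s` = %s', $field, $pdo->quote($value));
}

$ok = buildCondition($pdo, $allowedFields, 'from', "o'brien@example.com");
$rejected = buildCondition($pdo, $allowedFields, 'from` = 1 or 1=1 --', 'x');

var_dump($ok, $rejected);
```

The second call shows the kind of input the whitelist is meant to stop: a "field" that would otherwise be pasted into the query verbatim.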
SearchParser has a couple of filters available in the package: FieldFilter and FieldNameMapper. Filters are executed in the order that they are added to the Filter object.
FieldFilter is a simple whitelist of valid fields. Any SearchQueryComponent that has a field that does not match one of the valid fields is removed from the SearchQuery.
$filter = new \peckrob\SearchParser\Filters\Filter();
$field_filter = new \peckrob\SearchParser\Filters\FieldFilter();
$field_filter->validFields = ['from'];
$filter->addFilter($field_filter);
$filter->filter($x);
Returns:
peckrob\SearchParser\SearchQuery Object
(
[position:peckrob\SearchParser\SearchQuery:private] => 5
[data:protected] => Array
(
[0] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => field
[field] => from
[value] => foo@example.com
[firstRangeValue] =>
[secondRangeValue] =>
[negate] =>
)
[1] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => text
[field] =>
[value] => bar baz
[firstRangeValue] =>
[secondRangeValue] =>
[negate] =>
)
[2] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => text
[field] =>
[value] => meef
[firstRangeValue] =>
[secondRangeValue] =>
[negate] => 1
)
[4] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => hashtag
[field] =>
[value] => hashtag
[firstRangeValue] =>
[secondRangeValue] =>
[negate] =>
)
)
)
Note that every component with a field other than from has been removed from the SearchQuery object; components without a field are kept.
Suppose you want to expose a field to your users under a different name than the one used on your backend. For instance, date to your users might be created_on on your backend. This is where the FieldNameMapper filter comes into play.
$filter = new \peckrob\SearchParser\Filters\Filter();
$field_filter = new \peckrob\SearchParser\Filters\FieldNameMapper();
$field_filter->mappingFields = [
'date' => 'created_on'
];
$filter->addFilter($field_filter);
$filter->filter($x);
Returns:
...
[3] => peckrob\SearchParser\SearchQueryComponent Object
(
[type] => range
[field] => created_on
[value] =>
[firstRangeValue] => 2018/01/01
[secondRangeValue] => 2018/08/01
[negate] =>
)
As with Parsers and Transformers, it is easy to define custom filters. Create a class that implements the FiltersQueries interface and define a filter() method, which is passed the SearchQuery object.
There are some convenience methods on SearchQuery that make writing filters a bit easier:
- remove(SearchQueryComponent $item) - Removes an item from the SearchQuery.
- replace(SearchQueryComponent $old, SearchQueryComponent $new) - Replaces an item with a new item.
- merge(SearchQuery $query) - Merges two SearchQuery objects together.
Once you have defined your custom filter, simply call addFilter() on the Filter instance. Again, filters are executed in the order they are added.
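As a rough sketch of what a custom filter might look like, the example below drops every negated component. To keep it runnable without the package, StubQuery is a hypothetical stand-in for SearchQuery that offers only iteration and remove(); the DropNegatedFilter name and the exact filter() signature and return type are assumptions as well. A real filter would implement the FiltersQueries interface and receive a SearchQuery.

```php
<?php
// Stand-in for SearchQuery so the sketch runs on its own: iterable, with remove().
final class StubQuery implements IteratorAggregate
{
    /** @var object[] */
    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    public function getIterator(): Iterator
    {
        // The iterator works on a copy, so removing while looping is safe here.
        return new ArrayIterator($this->items);
    }

    public function remove(object $item): void
    {
        $this->items = array_values(array_filter(
            $this->items,
            function ($candidate) use ($item) {
                return $candidate !== $item;
            }
        ));
    }
}

// Hypothetical custom filter: drops every negated component.
final class DropNegatedFilter
{
    public function filter(StubQuery $query): StubQuery
    {
        foreach ($query as $component) {
            if ($component->negate) {
                $query->remove($component);
            }
        }
        return $query;
    }
}

$query = new StubQuery([
    (object) ['type' => 'text', 'value' => 'bar baz', 'negate' => false],
    (object) ['type' => 'text', 'value' => 'meef', 'negate' => true],
]);

$filtered = (new DropNegatedFilter())->filter($query);

$values = [];
foreach ($filtered as $component) {
    $values[] = $component->value;
}
print_r($values);
```

Whether the real SearchQuery tolerates removal during iteration is not stated here, so collecting the items to remove first and removing them after the loop may be the safer pattern.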
Tests are included. phpunit is a require-dev dependency of this project, so you will need to run composer install with dev dependencies. Then just run phpunit from the project root. Some tests may be skipped if optional components (such as Eloquent) are not installed.
Rebecca Peck
MIT