This library provides native PHP access to the Google App Engine Search API.
At the time of writing there is no off-the-shelf way to access the Google App Engine full text search API from the PHP runtime.
Generally this means developers cannot access the service without using Python/Java/Go proxy modules - which adds complexity, another language, additional potential points of failure and performance impact.
ALPHA This library is in the very early stages of development. Do not use it in production. It will change.
- Examples
- Install with Composer
- Queries
- Geo Queries
- Autocomplete
- Creating Documents - includes location (Geopoint) and Dates
- Facets
- Deleting Documents
- Local Development
- Best Practice, Free Quotas, Costs
- Google Software
I find examples a great way to decide if I want to even try out a library, so here's a couple for you.
// Schema describing a book
$obj_schema = (new \Search\Schema())
->addText('title')
->addText('author')
->addAtom('isbn')
->addNumber('price');
// Create and populate a document
$obj_book = $obj_schema->createDocument([
'title' => 'The Merchant of Venice',
'author' => 'William Shakespeare',
'isbn' => '1840224312',
'price' => 11.99
]);
// Write it to the Index
$obj_index = new \Search\Index('library');
$obj_index->put($obj_book);
In this example, I've used the Alternative Array Syntax for creating Documents - but you can also do it like this:
$obj_book = $obj_schema->createDocument();
$obj_book->title = 'Romeo and Juliet';
$obj_book->author = 'William Shakespeare';
$obj_book->isbn = '1840224339';
$obj_book->price = 9.99;
Now let's do a simple search and display the output
$obj_index = new \Search\Index('library');
$obj_response = $obj_index->search('romeo');
foreach($obj_response->results as $obj_result) {
echo "Title: {$obj_result->doc->title}, ISBN: {$obj_result->doc->isbn} <br />", PHP_EOL;
}
Search pubs!
Application: http://pub-search.appspot.com/
Code: https://github.com/tomwalder/pub-search
To install using Composer, add this require line to your composer.json (for bleeding-edge features, use dev-master in place of the tagged version):
"tomwalder/php-appengine-search": "v0.0.4-alpha"
Or, if you're using the command line:
composer require tomwalder/php-appengine-search
You may need to set minimum-stability to dev in your composer.json.
You can supply a simple query string to Index::search
$obj_index->search('romeo');
For more control and options, you can supply a Query object:
$obj_query = (new \Search\Query($str_query))
->fields(['isbn', 'price'])
->limit(10)
->sort('price');
$obj_response = $obj_index->search($obj_query);
Some simple, valid query strings:
price:2.99
romeo
dob:2015-01-01
dob < 2000-01-01
tom AND age:36
For much more information, see the Python reference docs: https://cloud.google.com/appengine/docs/python/search/query_strings
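If you build query strings from user input, a couple of tiny helpers can keep them tidy. This is only a sketch: `field_term()` and `all_of()` are hypothetical names, not part of this library, and they do no quoting or escaping, so values containing spaces or special characters would need handling as described in the query string docs.

```php
<?php
// Hypothetical helpers for illustration; they are NOT part of this library.

// Build a "field:value" term, e.g. age:36
function field_term(string $field, $value): string
{
    return $field . ':' . $value;
}

// Join terms so that every one of them must match
function all_of(string ...$terms): string
{
    return implode(' AND ', $terms);
}

echo all_of('tom', field_term('age', 36)), PHP_EOL; // tom AND age:36
echo field_term('price', 2.99), PHP_EOL;            // price:2.99
```

The resulting string can then be passed to Index::search or to a Query, exactly like the hand-written examples.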
$obj_query->sort('price');
$obj_query->sort('price', Query::ASC);
$obj_query->limit(10);
$obj_query->offset(5);
$obj_query->fields(['isbn', 'price']);
The library supports requesting arbitrary expressions in the results.
$obj_query->expression('euros', 'gbp * 1.45');
These can be accessed using the Document::getExpression() method on the resulting documents, like this:
$obj_doc->getExpression('euros');
You can fetch a single document from an index directly, by its unique Doc ID:
$obj_index->get('some-document-id-here');
You can enable the MatchScorer by calling the Query::score method.
If you do this, each document in the result set will be scored by the Search API "according to search term frequency" - Google.
Without it, documents will all have a score of 0.
$obj_query->score();
And the results...
foreach($obj_response->results as $obj_result) {
echo $obj_result->score, '<br />'; // Score will be a float
}
If you apply score() and then sort() by another field, you may be wasting cycles and costing money. Only score documents when you intend to sort by score.
If you need to mix sorting of score and another field, you can use the magic field name _score like this - here we sort by price then score, so records with the same price are sorted by their score.
$obj_query->score()->sort('price')->sort('_score');
A common use case is searching for documents that have a Geopoint field, based on their distance from a known Geopoint. e.g. "Find pubs near me"
There is a helper method to do this for you, and it also returns the distance in meters in the response.
$obj_query->sortByDistance('location', [53.4653381,-2.2483717]);
This will return results, nearest first to the supplied Lat/Lon, and there will be an expression returned for the distance itself - prefixed with distance_from_:
$obj_result->doc->getExpression('distance_from_location');
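If you want to sanity-check the returned distances locally, a plain haversine calculation gives a reasonable reference figure. This is a sketch under assumptions: `haversine_metres()` is a hypothetical helper, it treats the Earth as a sphere with a mean radius of 6,371,000 m, and the Search API may compute its own distance slightly differently.

```php
<?php
// Hypothetical helper for illustration; NOT part of this library.
// Great-circle distance in metres between two [lat, lon] pairs (haversine formula).
// Assumes a spherical Earth with a mean radius of 6,371,000 m.
function haversine_metres(array $from, array $to): float
{
    $earthRadius = 6371000.0;
    [$lat1, $lon1] = $from;
    [$lat2, $lon2] = $to;

    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);

    $a = sin($dLat / 2) ** 2
        + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLon / 2) ** 2;

    // min() guards against rounding pushing the value just above 1
    return 2 * $earthRadius * asin(min(1.0, sqrt($a)));
}

// Distance from the example point to a spot a little further north
echo round(haversine_metres([53.4653381, -2.2483717], [53.4753381, -2.2483717])), ' m', PHP_EOL;
```

Small differences from the API's figure are to be expected; large ones usually mean latitude and longitude have been swapped somewhere.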
Autocomplete is one of the most desired and useful features of a search solution.
This can be implemented fairly easily with the Google App Engine Search API, with a little sleight of hand!
The Search API does not natively support "edge n-gram" tokenisation (which is what we need for autocomplete!).
So, you can do this with the library - when creating documents, set a second text field with the output from the included Tokenizer::edgeNGram function:
$obj_tkzr = new \Search\Tokenizer();
$obj_schema->createDocument([
'name' => $str_name,
'name_ngram' => $obj_tkzr->edgeNGram($str_name),
]);
Then you can run autocomplete queries easily like this:
$obj_response = $obj_index->search((new \Search\Query('name_ngram:' . $str_query)));
You can see a full demo application using this in my "pub search" demo app
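If "edge n-gram" is a new term, this sketch shows the idea: every word is expanded into its leading prefixes, so a partial query like "rom" matches a stored token. `edge_ngrams()` is a hypothetical stand-in written for illustration; it is not the library's Tokenizer::edgeNGram, whose exact output may differ, and it only handles single-byte (ASCII) text.

```php
<?php
// Hypothetical helper for illustration only; NOT the library's Tokenizer::edgeNGram.
// Expands each word into its leading-edge prefixes, joined by spaces.
// Uses byte-based string functions, so it is only suitable for ASCII input.
function edge_ngrams(string $text, int $minLength = 1): string
{
    $tokens = [];
    // Split on whitespace, dropping empty pieces
    $words = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $length = strlen($word);
        for ($i = $minLength; $i <= $length; $i++) {
            $tokens[] = substr($word, 0, $i);
        }
    }
    // Remove repeated prefixes shared between words
    return implode(' ', array_unique($tokens));
}

echo edge_ngrams('Romeo'), PHP_EOL; // r ro rom rome romeo
```

Raising `$minLength` keeps very short prefixes out of the stored field, which makes it smaller but means the first keystroke or two will not match anything.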
As per the Python docs, the available field types are
- Atom - an indivisible character string
- Text - a plain text string that can be searched word by word
- HTML - a string that contains HTML markup tags, only the text outside the markup tags can be searched
- Number - a floating point number
- Date - a date with year/month/day and optional time
- Geopoint - latitude and longitude coordinates
We support DateTime objects or date strings in the format YYYY-MM-DD (PHP date('Y-m-d')).
$obj_person_schema = (new \Search\Schema())
->addText('name')
->addDate('dob');
$obj_person = $obj_person_schema->createDocument([
'name' => 'Marty McFly',
'dob' => new DateTime()
]);
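If your dates arrive as user input, it can be worth validating them before they reach the document. The sketch below is one way to do that; `to_search_date()` is a hypothetical helper, not part of this library, and it accepts either a DateTime-style object or a strict YYYY-MM-DD string.

```php
<?php
// Hypothetical helper for illustration; NOT part of this library.
// Accepts a DateTimeInterface or a strict 'YYYY-MM-DD' string and
// returns the 'YYYY-MM-DD' string form, or throws on anything else.
function to_search_date($value): string
{
    if ($value instanceof DateTimeInterface) {
        return $value->format('Y-m-d');
    }

    // The leading "!" resets unparsed fields (time of day) to zero
    $date = DateTimeImmutable::createFromFormat('!Y-m-d', (string) $value);

    // The round-trip comparison rejects overflowed dates such as 2015-02-30
    // and loosely formatted ones such as 2015-1-1
    if ($date === false || $date->format('Y-m-d') !== $value) {
        throw new InvalidArgumentException("Expected YYYY-MM-DD, got: {$value}");
    }

    return $value;
}

echo to_search_date('2015-01-01'), PHP_EOL;                        // 2015-01-01
echo to_search_date(new DateTime('1985-10-26 01:21:00')), PHP_EOL; // 1985-10-26
```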
Create an entry with a Geopoint field
$obj_pub_schema = (new \Search\Schema())
->addText('name')
->addGeopoint('where')
->addNumber('rating');
$obj_pub = $obj_pub_schema->createDocument([
'name' => 'Kim by the Sea',
'where' => [53.4653381, -2.2483717],
'rating' => 3
]);
It's more efficient to insert in batches if you have multiple documents. Up to 200 documents can be inserted at once.
Just pass an array of Document objects into the Index::put() method, like this:
$obj_index = new \Search\Index('library');
$obj_index->put([$obj_book1, $obj_book2, $obj_book3]);
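If you have more than 200 documents, you will need to split them up yourself. Here is a minimal sketch of one way to do that with array_chunk; `put_in_batches()` is a hypothetical helper, not part of this library, and the callback stands in for the real Index::put() call so the example runs anywhere.

```php
<?php
// Hypothetical helper for illustration; NOT part of this library.
// Splits the documents into batches no larger than $batchSize and hands
// each batch to $put. Returns the number of batches sent.
function put_in_batches(array $documents, callable $put, int $batchSize = 200): int
{
    $batches = 0;
    foreach (array_chunk($documents, $batchSize) as $batch) {
        $put($batch);
        $batches++;
    }
    return $batches;
}

// Stand-in data: 450 placeholder "documents"
$documents = range(1, 450);
$sizes = [];
$count = put_in_batches($documents, function (array $batch) use (&$sizes) {
    $sizes[] = count($batch); // a real callback would write the batch to the index
});

echo $count, ' batches: ', implode(', ', $sizes), PHP_EOL; // 3 batches: 200, 200, 50
```

With the library, the callback would simply pass each batch on, for example `function (array $batch) use ($obj_index) { $obj_index->put($batch); }`.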
There is an alternative to directly constructing a new Search\Document and setting its member data, which is to use the Search\Schema::createDocument factory method as follows.
$obj_book = $obj_schema->createDocument([
'title' => 'The Merchant of Venice',
'author' => 'William Shakespeare',
'isbn' => '1840224312',
'price' => 11.99
]);
You can set a namespace when constructing an index. This will allow you to support multi-tenant applications.
$obj_index = new \Search\Index('library', 'client1');
The Search API supports 2 types of document facets for categorisation, ATOM and NUMBER.
ATOM are probably the ones you are most familiar with, and result sets will include counts per unique facet, kind of like this:
For shirt sizes
- small (9)
- medium (37)
$obj_doc->atomFacet('size', 'small');
$obj_doc->atomFacet('colour', 'blue');
$obj_query->facets();
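To make the "counts per unique facet" idea concrete, here is a plain PHP sketch of the same kind of tally done over a local array. It does not call the Search API at all; `facet_counts()` and the sample data are hypothetical and only illustrate what ATOM facet counts represent.

```php
<?php
// Hypothetical helper for illustration; NOT part of this library and not
// how the Search API computes facets. It counts how many documents carry
// each distinct value of one facet.
function facet_counts(array $documents, string $facet): array
{
    $values = [];
    foreach ($documents as $doc) {
        if (isset($doc[$facet])) {
            $values[] = $doc[$facet];
        }
    }
    $counts = array_count_values($values);
    arsort($counts); // most common value first
    return $counts;
}

// Stand-in data: each entry is the facet set of one matching document
$shirts = [
    ['size' => 'small',  'colour' => 'blue'],
    ['size' => 'medium', 'colour' => 'blue'],
    ['size' => 'medium', 'colour' => 'red'],
];

foreach (facet_counts($shirts, 'size') as $value => $count) {
    echo "- {$value} ({$count})", PHP_EOL;
}
// - medium (2)
// - small (1)
```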
You can delete documents by calling the Index::delete() method.
It supports one or more Document objects - or one or more Document ID strings - or a mixture of objects and ID strings!
$obj_index = new \Search\Index('library');
$obj_index->delete('some-document-id');
$obj_index->delete([$obj_doc1, $obj_doc2]);
$obj_index->delete([$obj_doc3, 'another-document-id']);
The Search API is supported locally, because it's included to support the Python, Java and Go App Engine runtimes.
Like most App Engine services, search is free... up to a point!
And some best practice that is most certainly worth a read
I've had to include 2 files from Google to make this work - they are the Protocol Buffer implementations for the Search API. You will find them in the /libs folder.
They are also available directly from the following repository: https://github.com/GoogleCloudPlatform/appengine-php-sdk
These 2 files are Copyright 2007 Google Inc.
As and when they make it into the actual live PHP runtime, I will remove them from here.
Thank you to @sjlangley for the assist.
If you've enjoyed this, you might be interested in my Google Cloud Datastore Library for PHP, PHP-GDS