eArc-data-elasticsearch

Elasticsearch bridge for the earc/data persistence handler.

installation

Install the earc/data-elasticsearch library via composer.

$ composer require earc/data-elasticsearch

basic usage

bootstrap

Initialize the earc/data package.

use eArc\Data\Initializer;

Initializer::init();

If your elasticsearch server is not located at localhost:9200 or requires authentication, you have to configure the hosts.

use eArc\DataElasticsearch\ParameterInterface;

$hosts = ['https://user:pass@elasticsearch.my-server.com:32775'];

di_set_param(ParameterInterface::CLIENT_HOSTS, $hosts);

Then register the earc/data-elasticsearch bridge.

use eArc\Data\ParameterInterface;
use eArc\DataElasticsearch\ElasticsearchDataBridge;

di_tag(ParameterInterface::TAG_ON_PERSIST, ElasticsearchDataBridge::class);
di_tag(ParameterInterface::TAG_ON_REMOVE, ElasticsearchDataBridge::class);
di_tag(ParameterInterface::TAG_ON_FIND, ElasticsearchDataBridge::class);

From now on your entities are added to the index when they are persisted and dropped from it when they are removed. You are ready to search your earc/data entities via elasticsearch.

initialize index

If you have persisted entities before installing the earc/data-elasticsearch bridge, you can use IndexService::rebuildIndex() to index the entities that are in your data-store already.

use eArc\DataElasticsearch\IndexService;

di_get(IndexService::class)->rebuildIndex([
    // list of your entity classes
]);

This has to be done only once. New entities and updates are indexed automatically.

search

Searching is straightforward.

$primaryKeys = data_find(MyUserEntity::class, ['name' => ['Max', 'Moritz'], 'age' => 21]);
$userEntities = data_load_batch(MyUserEntity::class, $primaryKeys);

This finds all entities of the MyUserEntity class with a name property of Max or Moritz and an age property of 21.

If you don't need the primary keys, data_find_entities is shorter.

$userEntities = data_find_entities(MyUserEntity::class, ['name' => ['Max', 'Moritz'], 'age' => 21]);

To find all existing primary keys, pass an empty array.

$allUserPrimaryKeys = data_find(MyUserEntity::class, []);
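
The matching rules described above can be pictured with a small in-memory sketch. It only illustrates the semantics (all keys have to match, a list of values matches if any value is equal, an empty array matches everything). The function matches_criteria and the sample records are hypothetical and not part of earc/data, which delegates the real search to elasticsearch.

```php
<?php
declare(strict_types=1);

/**
 * Illustration only: checks one record against data_find style criteria.
 * Every key has to match (AND); a list of values matches if any value is equal (OR).
 */
function matches_criteria(array $record, array $criteria): bool
{
    foreach ($criteria as $property => $expected) {
        if (!array_key_exists($property, $record)) {
            return false;
        }
        // A scalar is treated like a list with a single allowed value.
        $allowed = is_array($expected) ? $expected : [$expected];
        if (!in_array($record[$property], $allowed, true)) {
            return false;
        }
    }

    // No criteria left unmatched; an empty criteria array matches every record.
    return true;
}

// Hypothetical sample data, keyed by primary key.
$users = [
    1 => ['name' => 'Max', 'age' => 21],
    2 => ['name' => 'Moritz', 'age' => 30],
    3 => ['name' => 'Lena', 'age' => 21],
];
$criteria = ['name' => ['Max', 'Moritz'], 'age' => 21];

$primaryKeys = array_keys(
    array_filter($users, fn (array $user): bool => matches_criteria($user, $criteria))
);
```

Only the record with the primary key 1 satisfies both conditions here.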

enhanced syntax

earc/data-elasticsearch supports more than the complete earc/data data_find syntax: it ranges from support for ranges and full text search to every possible elasticsearch query. This one-size-fits-all approach is simple to start with, but as hard to master as the elasticsearch dsl.

range

Range queries use the ..range postfix.

data_find(MyUserEntity::class, ['age..range' => ['>' => 18]]);

This gives an open range above 18.

Closed ranges are possible too.

data_find(MyUserEntity::class, ['lastLogin..range' => ['>=' => '2021-01-01', '<=' => '2021-01-31']]);
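
Elasticsearch itself expresses range bounds with the keys gt, gte, lt and lte. The following is a minimal sketch of how the comparison operators could be translated into such a clause. The helper range_clause is hypothetical, the bridge's real implementation may differ, and the < operator is assumed by analogy with the three operators shown above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: translates comparison operators into an
 * elasticsearch range clause for one field.
 */
function range_clause(string $field, array $bounds): array
{
    // '<' is an assumption; '>', '>=' and '<=' appear in the examples above.
    $operators = ['>' => 'gt', '>=' => 'gte', '<' => 'lt', '<=' => 'lte'];

    $range = [];
    foreach ($bounds as $operator => $value) {
        if (!isset($operators[$operator])) {
            throw new InvalidArgumentException("Unsupported range operator: $operator");
        }
        $range[$operators[$operator]] = $value;
    }

    return ['range' => [$field => $range]];
}

$clause = range_clause('lastLogin', ['>=' => '2021-01-01', '<=' => '2021-01-31']);
```

Seeing the target structure may help when a range has to be moved into a raw query later on.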

match

To perform a full text search (the elasticsearch match query against a text field), use the ..match postfix.

data_find(MyUserEntity::class, ['city..match' => 'Münster']);

text

The ..text postfix gives a keyword (term) search against a text field. This comes in handy if you search for a single word but don't know whether your target is lowercase, uppercase or capitalized.

data_find(Price::class, ['currency..text' => 'eur']);

exists

Elasticsearch does not index null values. Instead of an IS NOT NULL check you can test whether a property exists, which is equivalent in most cases.

data_find(Price::class, ['currency..exists' => null]);

Or check if a property does not exist, which is similar to IS NULL.

data_find(Price::class, ['currency..exists_not' => null]);
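
In the elasticsearch dsl these two checks correspond to an exists query and to an exists query wrapped in bool/must_not. The sketch below shows the shape of both clauses as PHP arrays. The helper functions are hypothetical, and whether the bridge builds exactly these clauses is an assumption.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: clause that matches documents having a value for $field. */
function exists_clause(string $field): array
{
    return ['exists' => ['field' => $field]];
}

/** Hypothetical helper: clause that matches documents without a value for $field. */
function exists_not_clause(string $field): array
{
    // Elasticsearch has no "missing" query; the exists clause is negated instead.
    return ['bool' => ['must_not' => [exists_clause($field)]]];
}

$hasCurrency = exists_clause('currency');
$hasNoCurrency = exists_not_clause('currency');
```

Elasticsearch also treats a field that holds an empty array as non-existent, which is one of the cases where the check differs from IS NOT NULL.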

_id

The getter getPrimaryKey() is used as elasticsearch document id. Thus, you can use the property _id to test against one or more primary keys.

data_find(MyUserEntity::class, ['_id' => ['1', '2', '392']]);

embedded entity

To query embedded entities you can use the dot syntax.

data_find(MyUserEntity::class, [
    'login.email' => 'kai@email.com',
    'login.password' => 'l9TFoW5549',
]);

embedded entity collection (nested)

Embedded entity collections have two properties, _entityName and _items. Querying _items invokes a nested query.

data_find(MyUserEntity::class, [
    'group._entityName' => Permission::class,
    'group._items' => [
        'name' => ['admin', 'moderator'],
        'active' => true,
    ],
]);
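
For readers who know the elasticsearch dsl, a nested query consists of a path and an inner query whose fields are addressed by their full path. The sketch below builds such a clause for the _items criteria above. The helper nested_clause is hypothetical, and the use of term and terms clauses is an assumption about how the bridge translates the criteria.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: wraps the criteria for the items of a collection into
 * an elasticsearch nested query. Lists become terms clauses, scalars term clauses.
 */
function nested_clause(string $path, array $criteria): array
{
    $must = [];
    foreach ($criteria as $property => $value) {
        $type = is_array($value) ? 'terms' : 'term';
        // Fields inside a nested query are addressed by their full path.
        $must[] = [$type => [$path . '.' . $property => $value]];
    }

    return ['nested' => ['path' => $path, 'query' => ['bool' => ['must' => $must]]]];
}

$clause = nested_clause('group._items', [
    'name' => ['admin', 'moderator'],
    'active' => true,
]);
```

With a nested query all conditions have to hold for the same item of the collection, not for different items of it.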

joins

Elasticsearch does not support joins, but they can be emulated with two separate queries.

$colorCategories = data_find(AttributeCategory::class, ['name..match' => ['colour', 'color']]);
$colors = data_find(Attribute::class, [
    'attributeCategory' => $colorCategories,
    'name..match' => ['blue', 'green', 'violet'],
]);

This works for one-to-one and many-to-one relations.

If your joined entity uses a collection, i.e. the join represents a one-to-many or a many-to-many relation, you have to use the .items embedded syntax.

$colors = data_find(Attribute::class, ['name..match' => ['white']]);
$colorCategories = data_find(AttributeCategory::class, ['attributes.items' => $colors]);

raw query

You can always call upon the raw power of the elasticsearch dsl via the .raw postfix.

data_find(Price::class, [
    'offerStartDate.raw' => ['bool' => ['must' => [
        ['exists' => ['field' => 'offerStartDate']],
        ['range' => ['offerStartDate' => ['lte' => 'now']]],
    ]]],
]);

raw search body

To write the complete search in the elasticsearch dsl use the .raw_body key.

data_find(Price::class, [
    '.raw_body' => [
        'query' => [
            'constant_score' => [
                'filter' => [
                    'bool' => [
                        'must' => [
                            ['exists' => ['field' => 'offerStartDate']],
                            ['range' => ['offerStartDate' => ['lte' => 'now']]],
                        ],
                    ],
                ],
            ],
        ],
    ],
]);
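
The elasticsearch documentation writes queries as JSON, so it can help to check what a PHP array looks like once it is encoded. The sketch below prints the search body from the example above as JSON with plain json_encode. It makes no assumption about the bridge beyond the array itself, and the match_all body is an additional illustration that is not taken from this library.

```php
<?php
declare(strict_types=1);

$body = [
    'query' => [
        'constant_score' => [
            'filter' => [
                'bool' => [
                    'must' => [
                        ['exists' => ['field' => 'offerStartDate']],
                        ['range' => ['offerStartDate' => ['lte' => 'now']]],
                    ],
                ],
            ],
        ],
    ],
];

// Associative arrays become JSON objects, list arrays become JSON arrays.
$json = json_encode($body, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR);
echo $json, PHP_EOL;

// An empty PHP array would be encoded as [], so an empty JSON object
// (as needed by match_all) has to be written as stdClass.
$matchAll = json_encode(['query' => ['match_all' => new stdClass()]], JSON_THROW_ON_ERROR);
```

Comparing the printed JSON with an example from the elasticsearch documentation is a quick way to spot a misplaced bracket in the PHP array.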

advanced usage

index name

The indices of the entities are named earc-data- plus the lowercase version of the fully qualified class name, where each backslash \ is replaced by a minus sign -. The earc-data prefix can be configured.

use eArc\DataElasticsearch\ParameterInterface;

di_set_param(ParameterInterface::INDEX_PREFIX, 'my-index-prefix');
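
To look up an index by hand, for example with curl or Kibana, the naming rule can be reproduced in a few lines. This is a sketch under the rule stated above. The function index_name is hypothetical and not part of the library, and it assumes that a configured prefix is joined to the class part with a minus sign in the same way as the default one.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derives the index name from a fully qualified class name.
 */
function index_name(string $fqcn, string $prefix = 'earc-data'): string
{
    // Replace every namespace separator by a minus sign, then lowercase the result.
    return $prefix . '-' . strtolower(str_replace('\\', '-', $fqcn));
}

$default = index_name('eArc\DataElasticsearchTests\Entities\BlacklistedEntity');
$custom = index_name('eArc\DataElasticsearchTests\Entities\BlacklistedEntity', 'my-index-prefix');
```

Under these assumptions $default is earc-data-earc-dataelasticsearchtests-entities-blacklistedentity.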

entity whitelist and blacklist

All entities are indexed by default. This can be changed via whitelisting or blacklisting.

use eArc\DataElasticsearch\ParameterInterface;
use eArc\DataElasticsearchTests\Entities\BlacklistedEntity;

di_set_param(ParameterInterface::WHITELIST, [
    // list of entity class names
    BlacklistedEntity::class => true,
    // ...
]);

Only the entities on the whitelist will be indexed.

use eArc\DataElasticsearch\ParameterInterface;

di_set_param(ParameterInterface::BLACKLIST, [
    // list of entity class names
    Attribute::class => true,
    // ...
]);

All but the entities on the blacklist will be indexed.

If both a blacklist and a whitelist are configured, only the whitelist is used.
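
The precedence of the two lists can be summarized in a small decision function. It is a sketch of the rules stated above, not the library's code. The function is_indexed and the class names are hypothetical, and null stands for a list that has not been configured.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decides whether an entity class gets indexed.
 * Both lists are keyed by class name, as in the configuration above.
 */
function is_indexed(string $class, ?array $whitelist = null, ?array $blacklist = null): bool
{
    if ($whitelist !== null) {
        // A configured whitelist wins; the blacklist is ignored.
        return isset($whitelist[$class]);
    }
    if ($blacklist !== null) {
        return !isset($blacklist[$class]);
    }

    // Without any list all entities are indexed.
    return true;
}

$whitelist = ['App\User' => true];
$blacklist = ['App\User' => true, 'App\Log' => true];

$both = is_indexed('App\User', $whitelist, $blacklist);
$blacklistOnly = is_indexed('App\Log', null, $blacklist);
```

In this example App\User is indexed although it is on the blacklist, because the whitelist takes precedence.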

extend the elasticsearch bridge

To extend the elasticsearch bridge just decorate one of its classes.

use eArc\DataElasticsearch\DocumentFactory;

di_decorate(DocumentFactory::class, MyDocumentFactory::class);

Since there are only three classes (DocumentFactory, ElasticsearchDataBridge and IndexService), copying them into your project is a reasonable option, too.

releases

release 0.0

  • the first official release
  • php ^8.0 support
  • elasticsearch ^7.0 support