A user-friendly high-level Elasticsearch client wrapper
You can define an index as a class (although it's technically a dataclass) and define fields that you would like to have in your index.
    from typing import Optional

    from pylastic.indexes import ElasticIndex

    class Comment(ElasticIndex):
        timestamp: int
        author: str
        text: str
        rating: Optional[float]
Now the Comment class is a dataclass, and you can also perform CRUD operations with it. Before covering those, let's review the available field definitions.
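To make "the class is a dataclass" concrete, here is a minimal sketch that uses a plain standard-library dataclass as a stand-in for an ElasticIndex subclass. CommentSketch is a hypothetical name, the default on rating is added only for this illustration, and the real base class may add behavior that is not shown here.

```python
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class CommentSketch:
    """Stand-in for an ElasticIndex subclass (assumption: a plain dataclass)."""

    timestamp: int
    author: str
    text: str
    rating: Optional[float] = None  # default added here so the field can be omitted


comment = CommentSketch(timestamp=1700000000, author="alice", text="Nice post")

# Dataclasses generate __init__, __repr__ and __eq__, and expose field metadata.
field_names = [f.name for f in fields(CommentSketch)]

# asdict() gives a plain dict, similar in shape to a document sent to ES.
document = asdict(comment)
```

The field list and the dict form are the two pieces a wrapper would plausibly need: one to build a mapping, the other to serialize a document.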
Specify a field along with its type using annotations:
    string_field: str
    integer_field: int
You can use Pythonic types, but pylastic also provides a pylastic.types package with all the types ES supports.
These types will be used to create a mapping so that ES can correctly process fields in your index.
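As a rough illustration of how annotations could be turned into a mapping, the sketch below walks a class's type hints and emits the "properties" part of an ES mapping. The PYTHON_TO_ES table and the build_properties helper are assumptions made for this example; the conversions pylastic actually performs may differ.

```python
from typing import Dict, get_type_hints

# Assumed Python-to-ES type table; pylastic's actual choices may differ.
PYTHON_TO_ES = {str: "text", int: "long", float: "double", bool: "boolean"}


def build_properties(cls: type) -> Dict[str, dict]:
    """Build the "properties" part of an ES mapping from class annotations."""
    return {
        name: {"type": PYTHON_TO_ES[annotation]}
        for name, annotation in get_type_hints(cls).items()
    }


class Example:
    string_field: str
    integer_field: int


properties = build_properties(Example)
```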
- GeoPoint. Allows specifying fields of type geo_point.
- Text (represents the text and match_only_text types).
- Keyword (represents keyword, constant_keyword and wildcard).
- Date (represents date and allows storing a date or a datetime).
To mark a field as optional, use typing.Optional with the type it's supposed to have, e.g. Optional[GeoPoint].
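A library that reads annotations has to detect Optional and recover the wrapped type. The helper below is one way to do that with the standard typing module; unwrap_optional is a hypothetical name and is not taken from pylastic.

```python
from typing import Optional, Tuple, Union, get_args, get_origin


def unwrap_optional(annotation) -> Tuple[object, bool]:
    """Return (inner_type, is_optional) for an annotation."""
    if get_origin(annotation) is Union:
        args = get_args(annotation)
        non_none = [a for a in args if a is not type(None)]
        # Optional[X] is Union[X, None]: exactly two members, one of them None.
        if len(args) == 2 and len(non_none) == 1:
            return non_none[0], True
    return annotation, False
```

For example, unwrap_optional(Optional[float]) yields the pair (float, True), while a bare int comes back unchanged with False.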
To define a custom type:
- Subclass pylastic.types.base.ElasticType
- (Optional) Define a get_valid_object class method that validates the object definition (see "GeoType" for example)
- (Optional) Define Meta.type, which contains the ES type name (e.g. geo_point for GeoPoint, ...)
- (Optional) Define a get_mapping(self) -> dict method that returns the object's mapping (e.g. {'type': '...', ...}). This might be useful if your class has a custom __init__ method (the mapping definition changes based on the parameters provided)
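The steps above can be sketched roughly as follows. ElasticTypeSketch is a hypothetical stand-in for pylastic.types.base.ElasticType so that the example runs without the library, and the get_valid_object signature is an assumption; only the ES type name "ip" is a real Elasticsearch field type.

```python
import ipaddress


class ElasticTypeSketch:
    """Hypothetical stand-in for pylastic.types.base.ElasticType."""

    class Meta:
        type = None  # ES type name, e.g. "geo_point"

    def get_mapping(self) -> dict:
        # Assumed default behavior: emit the type name declared in Meta.
        return {"type": self.Meta.type}


class IpAddress(ElasticTypeSketch):
    """Example custom type for the ES "ip" field type."""

    class Meta:
        type = "ip"

    def __init__(self, value: str):
        self.value = value

    @classmethod
    def get_valid_object(cls, raw) -> "IpAddress":
        # Validate the raw value before wrapping it; signature is an assumption.
        ipaddress.ip_address(raw)  # raises ValueError on invalid input
        return cls(str(raw))


address = IpAddress.get_valid_object("10.0.0.1")
mapping = address.get_mapping()
```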
In some rare cases (e.g. the Text type) you'll need to change the field type from the code. Since the Meta class won't work, use the self._index attribute. It'll be picked up automatically by ElasticType.get_mapping().
Index metadata defines how an ElasticIndex subclass is presented in Elasticsearch. The following fields are available in the ElasticIndex.Meta class:
- index. Specify a constant index name.
- index_prefix. Index prefix to use if is_datastream is True.
- is_datastream. If True, datastream logic will be applied:
  - An index template will be created: Meta.index_prefix-*
  - NOT IMPLEMENTED YET
- id_field. Use one of the fields as _id in ES. When deserialized, it'll be replaced with the field name you specify. Defaults to _id. Note that the ID field cannot be used in aggregations and is limited to 512 bytes.
To customize index creation, override the ElasticIndex.get_index() method, which returns the index name.
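As an illustration of overriding get_index(), the sketch below derives a per-month index name from Meta.index. IndexSketch is a hypothetical stand-in for ElasticIndex, and both the classmethod form and the optional today parameter are assumptions made so the example is deterministic; check the real signature before copying this.

```python
from datetime import date
from typing import Optional


class IndexSketch:
    """Hypothetical stand-in for ElasticIndex; only name resolution is modeled."""

    class Meta:
        index = None

    @classmethod
    def get_index(cls) -> str:
        # Assumed default: return the constant name from Meta.
        return cls.Meta.index


class MonthlyComment(IndexSketch):
    class Meta:
        index = "comments"

    @classmethod
    def get_index(cls, today: Optional[date] = None) -> str:
        # Append year and month so each month gets its own index.
        today = today or date.today()
        return f"{cls.Meta.index}-{today:%Y.%m}"


name = MonthlyComment.get_index(date(2024, 3, 5))
```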
The following tips might help you reduce the size of the index:
- Decrease the number of replicas (via the Meta.replicas attribute). This will have an impact on your index's availability, but it may be a reasonable tradeoff.
- Use appropriate types for storing strings: consider keyword, match_only_text, etc.
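For reference, both tips correspond to parts of the body Elasticsearch accepts when an index is created. The dict below is that raw body written by hand; how pylastic translates Meta.replicas and field types into it is not shown in this document, so treat the connection as an assumption. The field names are illustrative.

```python
import json

index_body = {
    "settings": {
        # Fewer replicas means less disk usage but lower availability.
        "number_of_replicas": 0,
    },
    "mappings": {
        "properties": {
            # keyword: exact-match strings, not analyzed for full-text search.
            "author": {"type": "keyword"},
            # match_only_text: full-text search without scoring or positions data.
            "text": {"type": "match_only_text"},
        }
    },
}

print(json.dumps(index_body, indent=2))
```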
ElasticClient is a wrapper for the official Elasticsearch package. It exposes all the available methods but also provides convenience methods:
- create_index(index: ElasticIndex, index_name: Optional[str] = None). Creates an index.
- execute(template). Executes a RequestTemplate instance.
- save(objects). Saves one or more ElasticType instances to the index.
- refresh_index(index). Refreshes the index.
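One common way for a wrapper to expose every method of the wrapped client while adding its own helpers is attribute delegation. The sketch below shows that pattern only; ClientSketch and FakeElasticsearch are hypothetical, the save signature here is invented for the example, and none of this is taken from the ElasticClient implementation.

```python
class FakeElasticsearch:
    """Tiny stand-in for the official client so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    def ping(self):
        self.calls.append(("ping",))
        return True

    def index(self, index, document):
        self.calls.append(("index", index, document))


class ClientSketch:
    def __init__(self, es):
        self._es = es

    def __getattr__(self, name):
        # Called only when normal lookup fails, so methods defined on the
        # wrapper take precedence and everything else is forwarded.
        return getattr(self._es, name)

    def save(self, index_name, documents):
        # Convenience method: index each document, return how many were sent.
        for doc in documents:
            self._es.index(index=index_name, document=doc)
        return len(documents)


fake = FakeElasticsearch()
client = ClientSketch(fake)
alive = client.ping()  # forwarded to the wrapped client
saved = client.save("comments", [{"text": "a"}, {"text": "b"}])
```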
Various cluster configurations are available from the pylastic.configuration module. They are documented in more detail below:
Subclass the ILMPolicy class to define a policy. Use the declarative style: specify the configuration for each phase by creating a nested class with the corresponding name (Hot, Warm, Cold, Frozen, Delete):
    from pylastic.configuration.ilm import ILMPolicy

    class MyILMPolicy(ILMPolicy):
        class Hot:
            ...

        class Meta:
            name = ""  # Policy name. If not specified, the lowercase class name will be used
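To show how such a declarative class could map onto the policy body Elasticsearch expects, here is a self-contained sketch. PolicySketch is a hypothetical stand-in for ILMPolicy, and the min_age and actions attributes on the phase classes are assumptions, since this document does not list what the phase classes accept. The resulting dict follows the standard ILM policy layout, and the lowercase class name fallback mirrors the comment above.

```python
PHASES = ("Hot", "Warm", "Cold", "Frozen", "Delete")


class PolicySketch:
    """Hypothetical stand-in for ILMPolicy."""

    @classmethod
    def policy_name(cls) -> str:
        meta = getattr(cls, "Meta", None)
        # Fall back to the lowercase class name when no name is given.
        return getattr(meta, "name", "") or cls.__name__.lower()

    @classmethod
    def to_body(cls) -> dict:
        phases = {}
        for phase in PHASES:
            config = getattr(cls, phase, None)
            if config is None:
                continue  # phase not declared on this policy
            body = {"actions": dict(getattr(config, "actions", {}))}
            if hasattr(config, "min_age"):
                body["min_age"] = config.min_age
            phases[phase.lower()] = body
        return {"policy": {"phases": phases}}


class LogsPolicy(PolicySketch):
    class Hot:
        actions = {"rollover": {"max_age": "7d"}}  # assumed attribute name

    class Delete:
        min_age = "30d"  # assumed attribute name
        actions = {"delete": {}}


policy_name = LogsPolicy.policy_name()
policy_body = LogsPolicy.to_body()
```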