Basic Elasticsearch datasource connector for Loopback.
Table of Contents generated with DocToc
- Overview
- Install this connector in your loopback app
- Configuring connector
- About the example app
- How to achieve Instant search
- Troubleshooting
- Testing
- Contributing
- Frequently Asked Questions
- Release notes
- `lib` directory has the entire source code for this connector.
  - This is what gets downloaded to your `node_modules` folder when you run `npm install loopback-connector-es --save --save-exact`
- `examples` directory has a loopback app which uses this connector.
  - This is not published to NPM; it is only here for demo purposes.
    - It will not be downloaded to your `node_modules` folder!
    - Similarly, the `examples/server/datasources.json` and `examples/server/datasources.<env>.js` files are there for this demo app to use.
    - You can copy their content over to `<yourApp>/server/datasources.json` or `<yourApp>/server/datasources.<env>.js` if you want and edit it there, but don't start editing the files inside `examples/server` itself and expect changes to take place in your app!
- `test` directory has unit tests.
  - It does not reuse the loopback app from the `examples` folder. Instead, loopback and the ES datasource are built and injected programmatically.
  - This directory is not published to NPM.
    - Refer to `.npmignore` if you're still confused about what's part of the published connector and what's not.
- You will find the `datasources.json` files in this repo mention various configurations:
  - `elasticsearch-ssl`
  - `elasticsearch-plain`
  - `db`
- You don't need them all! They are just examples to help you see the various ways in which you can configure a datasource. Delete the ones you don't need and keep the one you want. For example, most people will start off with `elasticsearch-plain` and then move on to configuring the additional properties that are exemplified in `elasticsearch-ssl`. You can mix and match if you'd like to have mongo and es and memory, all three! These are basics of the "connector" framework in loopback and not something we added.
- Don't forget to edit your `model-config.json` file and point the models at the `dataSource` you want to use.
cd <yourApp>
npm install loopback-connector-es --save --save-exact
- host: Elasticsearch engine host address.
- port: Elasticsearch engine port.
- name: Connector name.
- connector: Elasticsearch driver.
- index: Search engine specific index.
- apiVersion: specify the major version of the Elasticsearch nodes you will be connecting to.
- mappings: an array of elasticsearch mappings for your various loopback models.
  - if your models are spread out across different indexes, then you can provide an additional `index` field as an override for your model
  - if you don't want to use `type:ModelName` by default, then you can provide an additional `type` field as an override for your model
- log: sets the elasticsearch client's logging; you can refer to the docs here
- defaultSize: total number of results to return per page.
- refreshOn: optional array of method names for which the refresh option should be set to true
- requestTimeout: this value is in milliseconds
- ssl: useful for setting up a secure channel
- protocol: can be `http` or `https` (`http` is the default if none is specified); it must be `https` if you're using `ssl`
- auth: useful if you have access control set up via services like `es-jetty` or `found` or `shield`
- amazonES: configuration for `http-aws-es`. NOTE: The package needs to be installed in your project. It's not part of this connector.
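
The rule that `protocol` must be `https` whenever `ssl` is configured is easy to overlook. As a rough sketch (not part of the connector), a small pre-flight check could catch it before the app boots. The function name `checkEsSettings` and the shape of its return value are hypothetical; the setting names follow the list above and the `hosts` layout used in this repo's sample `datasources.json`.

```javascript
// Hypothetical pre-flight check; not part of loopback-connector-es.
// Returns a list of human-readable problems found in a datasource settings object.
function checkEsSettings(settings) {
  const problems = [];
  // Hosts may be given as a `hosts` array or as top-level host/port/protocol.
  const hosts = Array.isArray(settings.hosts) ? settings.hosts : [settings];

  hosts.forEach(function (host, i) {
    // `http` is the default protocol when none is specified.
    const protocol = host.protocol || settings.protocol || 'http';
    if (protocol !== 'http' && protocol !== 'https') {
      problems.push('host #' + i + ': protocol must be "http" or "https"');
    }
    // When `ssl` is configured, the protocol must be `https`.
    if (settings.ssl && protocol !== 'https') {
      problems.push('host #' + i + ': ssl is set, so protocol must be "https"');
    }
  });

  // `requestTimeout` is expressed in milliseconds.
  if (settings.requestTimeout !== undefined &&
      !(Number.isFinite(settings.requestTimeout) && settings.requestTimeout > 0)) {
    problems.push('requestTimeout must be a positive number of milliseconds');
  }
  return problems;
}

const problems = checkEsSettings({
  connector: 'es',
  hosts: [{protocol: 'http', host: '127.0.0.1', port: 9200}],
  ssl: {rejectUnauthorized: true},
  requestTimeout: 30000
});
console.log(problems);
```

An empty array means none of these particular checks failed; it says nothing about whether the ES instance is actually reachable.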
- Edit datasources.json and set:

      "db": {
        "connector": "es",
        "name": "<name>",
        "index": "<index>",
        "hosts": [
          {
            "protocol": "http",
            "host": "127.0.0.1",
            "port": 9200,
            "auth": "username:password"
          }
        ],
        "apiVersion": "<apiVersion>",
        "refreshOn": ["save", "create", "updateOrCreate"],
        "log": "trace",
        "defaultSize": <defaultSize>,
        "requestTimeout": 30000,
        "ssl": {
          "ca": "./../cacert.pem",
          "rejectUnauthorized": true
        },
        "amazonES": {
          "region": "us-east-1",
          "accessKey": "AKID",
          "secretKey": "secret"
        },
        "mappings": [
          {
            "name": "UserModel",
            "properties": {
              "realm": {"type": "string", "index": "not_analyzed"},
              "username": {"type": "string", "index": "not_analyzed"},
              "password": {"type": "string", "index": "not_analyzed"},
              "email": {"type": "string", "analyzer": "email"}
            }
          },
          {
            "name": "CoolModel",
            "index": <useSomeOtherIndex>,
            "type": <overrideTypeName>,
            "properties": {
              "realm": {"type": "string", "index": "not_analyzed"},
              "username": {"type": "string", "index": "not_analyzed"},
              "password": {"type": "string", "index": "not_analyzed"},
              "email": {"type": "string", "analyzer": "email"}
            }
          }
        ],
        "settings": {
          "analysis": {
            "filter": {
              "email": {
                "type": "pattern_capture",
                "preserve_original": 1,
                "patterns": [
                  "([^@]+)",
                  "(\\p{L}+)",
                  "(\\d+)",
                  "@(.+)"
                ]
              }
            },
            "analyzer": {
              "email": {
                "tokenizer": "uax_url_email",
                "filter": ["email", "lowercase", "unique"]
              }
            }
          }
        }
      }

- You can peek at `/examples/server/datasources.json` for more hints.
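
To make the per-model `index` and `type` overrides in `mappings` more concrete, here is a small sketch of the documented rule: a model uses the datasource-level `index` and its own name as the `type` unless its mapping entry says otherwise. The helper `resolveIndexAndType` and the sample values are hypothetical and only mirror the description; this is not the connector's actual code.

```javascript
// Hypothetical helper mirroring the documented override rules;
// not the connector's actual implementation.
function resolveIndexAndType(settings, modelName) {
  const mapping = (settings.mappings || []).find(function (m) {
    return m.name === modelName;
  }) || {};
  return {
    index: mapping.index || settings.index, // per-model index override, if any
    type: mapping.type || modelName         // per-model type override, if any
  };
}

// Placeholder settings for illustration only.
const settings = {
  index: 'shakespeare',
  mappings: [
    {name: 'UserModel', properties: {}},
    {name: 'CoolModel', index: 'other-index', type: 'cool', properties: {}}
  ]
};
console.log(resolveIndexAndType(settings, 'UserModel'));
console.log(resolveIndexAndType(settings, 'CoolModel'));
```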
- The `examples` directory contains a loopback app which uses this connector.
- You can point this example at your own elasticsearch instance or use the quick instances provided via docker.

As a developer, you may want a short-lived ES instance that is easy to tear down when you're finished dev testing. We recommend docker to facilitate this.

Pre-requisites: You will need docker-engine and docker-compose installed on your system.
Step-1
- Set desired versions for node and Elasticsearch
- here are the valid values to use for Node
- here are the valid values to use for Elasticsearch
# combination of node v0.10.46 with elasticsearch v1
export NODE_VERSION=0.10.46
export ES_VERSION=1
echo 'NODE_VERSION' $NODE_VERSION && echo 'ES_VERSION' $ES_VERSION
# similarly feel free to try relevant combinations:
## of node v0.10.46 with elasticsearch v2
## of node v0.12 with elasticsearch v2
## of node v4 with elasticsearch v2
## of node v5 with elasticsearch v2
## elasticsearch v5 will probably not work as there isn't an `elasticsearch` client for it, as of this writing
## etc.
Step-2
- Run the setup with `docker-compose` commands.
git clone https://github.com/strongloop-community/loopback-connector-elastic-search.git myEsConnector
cd myEsConnector/examples
npm install
docker-compose up
Step-3
- Visit `localhost:3000/explorer` and you will find our example loopback app running there.
- Empty out `examples/server/datasources.json` so that it only has the following content remaining: `{}`
- Set the `NODE_ENV` environment variable on your local/host machine
  - Set the environment variable `NODE_ENV=sample-es-plain-1` if you want to use `examples/server/datasources.sample-es-plain-1.js`
  - Set the environment variable `NODE_ENV=sample-es-plain-2` if you want to use `examples/server/datasources.sample-es-plain-2.js`
  - Set the environment variable `NODE_ENV=sample-es-ssl-1` if you want to use `examples/server/datasources.sample-es-ssl-1.js`
    - a sample docker instance for this hasn't been configured yet, so it doesn't work out-of-the-box; use it only as readable (not runnable) reference material for now
- You can configure your own `datasources.json` or `datasources.<env>.js` based on what you learn from these sample files.
  - Technically, to run the example, you don't need to set `NODE_ENV` if you won't be configuring via the `.<env>.js` files; configuring everything within `datasources.json` is perfectly fine too. Just remember that you will lose the ability to have inline comments and will have to use double-quotes if you stick with `.json`
- Start elasticsearch version 1.x and 2.x using:
git clone https://github.com/strongloop-community/loopback-connector-elastic-search.git myEsConnector
cd myEsConnector
docker-compose -f docker-compose-for-tests.yml up
# in another terminal window or tab
cd myEsConnector/examples
npm install
DEBUG=boot:test:* node server/server.js
- Visit `localhost:3000/explorer` and you will find our example loopback app running there.
- Install dependencies and start the example server
git clone https://github.com/strongloop-community/loopback-connector-elastic-search.git myEsConnector
cd myEsConnector/examples
npm install
- Don't forget to create an index in your ES instance:
curl -X POST https://username:password@my.es.cluster.com/shakespeare
- If you mess up and want to delete, you can use:
curl -X DELETE https://username:password@my.es.cluster.com/shakespeare
- Don't forget to set a valid value for the `apiVersion` field in `examples/server/datasources.json` that matches the version of ES you are running.
- Set up a `cacert.pem` file for communicating securely (https) with your ES instance. Download the certificate chain for your ES server using this sample command (it will need to be edited to use your provider):
cd myEsConnector
openssl s_client -connect my.es.cluster.com:9243 -showcerts | tee cacert.pem
- The command may not self-terminate, so you may need to use `ctrl+c`
- It will be saved at the base of your cloned project
- Sometimes extra data is added to the file, you should delete everything after the following lines:
```
---
No client certificate CA names sent
---
```
- Run:
cd myEsConnector/examples
DEBUG=boot:test:* node server/server.js
- The `examples/server/boot/boot.js` file will automatically populate data for UserModels on your behalf when the server starts.
- Open this URL in your browser: http://localhost:3000/explorer
- Try fetching all the users via the rest api console
- You can dump all the data from your ES index, via cmd-line too:
curl -X POST username:password@my.es.cluster.com/shakespeare/_search -d '{"query": {"match_all": {}}}'
- To test a specific filter via GET method, use for example:
{"q" : "friends, romans, countrymen"}
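
Outside the explorer, the same filter can be sent as a URL-encoded `filter` query parameter, which is how loopback's REST API accepts JSON filters. A small sketch follows; the helper name is made up, and the `/api/UserModels` path is an assumption based on loopback's default REST root and model pluralization, so adjust it to your app.

```javascript
// Build a GET URL that carries a JSON filter in the `filter` query parameter.
// baseUrl and modelPath are assumptions for illustration.
function buildFilterUrl(baseUrl, modelPath, filter) {
  return baseUrl + modelPath + '?filter=' + encodeURIComponent(JSON.stringify(filter));
}

const url = buildFilterUrl(
  'http://localhost:3000',
  '/api/UserModels',
  {q: 'friends, romans, countrymen'}
);
console.log(url);
```

The resulting URL can be pasted into a browser or passed to `curl` while the example server is running.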
From version 1.3.4, a `refresh` option is available which supports instant search after `create` and `update`. This option is configurable, and you can activate or deactivate it according to your needs. By default `refresh` is `true`, which makes the response come back only after the documents are indexed (searchable).

To learn more about `refresh`, go through this article

Datasource file: pass a `refreshOn` array in the datasource file, listing the names of the methods for which you want refresh to be true

    "es": {
      "name": "es",
      "refreshOn": ["save", "create", "updateOrCreate"],
      .....

Model.json file: configurable at the per-model and per-operation level (`true`, `false`, `wait_for`)

    "elasticsearch": {
      "create": {
        "refresh": false
      },
      "destroy": {
        "refresh": false
      },
      "destroyAll": {
        "refresh": "wait_for"
      }
    }

NOTE:- While a refresh is useful, it still has a performance cost. A manual refresh can be useful, but avoid manual refresh every time you index a document in production; it will hurt your performance. Instead, your application needs to be aware of the near real-time nature of Elasticsearch and make allowances for it.
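
One way to make such allowances, when refresh is turned off for an operation, is to retry a read a few times instead of assuming a new document is immediately searchable. The helper below is a generic sketch and not part of the connector: `waitUntilFound`, its options, and the stand-in lookup function are all hypothetical.

```javascript
// Generic polling helper (hypothetical, not part of the connector).
// Calls `lookup` until it resolves to a non-empty array or attempts run out.
async function waitUntilFound(lookup, options) {
  const attempts = (options && options.attempts) || 5;
  const delayMs = (options && options.delayMs) || 200;
  for (let i = 0; i < attempts; i++) {
    const results = await lookup();
    if (Array.isArray(results) && results.length > 0) {
      return results;
    }
    // Wait before the next attempt; ES makes new documents searchable
    // only after the index has been refreshed.
    await new Promise(function (resolve) { setTimeout(resolve, delayMs); });
  }
  return [];
}

// Stand-in for a real query such as a promise-returning Model.find(...):
// it returns nothing on the first two calls, then one document.
let calls = 0;
function fakeLookup() {
  calls += 1;
  return Promise.resolve(calls < 3 ? [] : [{id: 1, name: 'George Harrison'}]);
}

const pending = waitUntilFound(fakeLookup, {attempts: 5, delayMs: 10});
pending.then(function (docs) {
  console.log(docs.length, 'document(s) found after', calls, 'calls');
});
```

Polling trades a little read latency for cheaper writes. Whether that is preferable to `refresh: "wait_for"` depends on how often your application reads its own writes.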
- Do you have both `elasticsearch-ssl` and `elasticsearch-plain` in your `datasources.json` file? You just need one of them (not both), based on how you've set up your ES instance.
- Did you forget to set `model-config.json` to point at the datasource you configured? Maybe you are using a different or misspelled name than what you thought you had!
- Did you forget to set a valid value for the `apiVersion` field in `datasources.json` that matches the version of ES you are running?
- Maybe the version of ES you are using isn't supported by the client that this project uses. Try removing the `elasticsearch` sub-dependency from the `<yourApp>/node_modules/loopback-connector-es/node_modules` folder and then install the latest client:
  - `cd <yourApp>/node_modules/loopback-connector-es/node_modules`
  - then remove the `elasticsearch` folder
    - unix/mac quickie: `rm -rf elasticsearch`
  - `npm install --save --save-exact https://github.com/elastic/elasticsearch-js.git`
  - to "academically" prove to yourself that this will work with the new install:
    - on unix/mac you can quickly dump the supported versions to your terminal with: `cat elasticsearch/package.json | grep -A 5 supported_es_branches`
    - on other platforms, look into `elasticsearch/package.json` and search for the `supported_es_branches` json block.
  - go back to yourApp's root directory
    - unix/mac quickie: `cd <yourApp>`
  - And test that you can now use the connector without any issues!
  - These changes can easily get washed away for several reasons. So for a more permanent fix that adds the version you want to work on into a release of this connector, please look into Contributing.
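
On platforms without `cat` and `grep`, the `supported_es_branches` check can be done with a few lines of Node. This is a sketch: the helper names are made up, and because this README does not say where in `package.json` the field lives, the code searches the whole file for the key rather than assuming a location.

```javascript
const fs = require('fs');

// Recursively look for the first occurrence of `key` in a parsed JSON value.
function findKey(value, key) {
  if (value === null || typeof value !== 'object') {
    return undefined;
  }
  if (Object.prototype.hasOwnProperty.call(value, key)) {
    return value[key];
  }
  for (const child of Object.values(value)) {
    const found = findKey(child, key);
    if (found !== undefined) {
      return found;
    }
  }
  return undefined;
}

// Hypothetical helper: read a package.json and return its supported ES branches.
function supportedEsBranches(packageJsonPath) {
  const pkg = JSON.parse(fs.readFileSync(packageJsonPath, 'utf8'));
  return findKey(pkg, 'supported_es_branches');
}

// Usage, run from <yourApp>/node_modules/loopback-connector-es/node_modules:
const target = 'elasticsearch/package.json';
if (fs.existsSync(target)) {
  console.log(supportedEsBranches(target));
}
```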
- You can edit `test/resource/datasource-test.json` to point at your ES instance and then run `npm test`
- If you don't have an ES instance and want to leverage docker based ES instances then:
- Start elasticsearch version 1.x and 2.x using:
docker-compose -f docker-compose-for-tests.yml up
- Edit the code in `test/init.js` to pick which datasource you want to test against:
var settings = require('./resource/datasource-test.json'); // comment this out if you'll be using either of the following
//var settings = require('./resource/datasource-test-v1-plain.json');
//var settings = require('./resource/datasource-test-v2-plain.json');
- Then run `npm test`
- When you're finished and want to tear down the docker instances, run:
docker-compose -f docker-compose-for-tests.yml down
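
Rather than commenting lines in and out of `test/init.js`, the choice of settings file could be driven by an environment variable. This is only a sketch of the idea and is not how the repo's `test/init.js` currently works; the variable name `ES_TEST_DATASOURCE` and the helper are hypothetical, while the file names are the ones listed above.

```javascript
// Hypothetical: map a short name from an env variable to one of the settings files.
const SETTINGS_FILES = {
  'default': './resource/datasource-test.json',
  'v1-plain': './resource/datasource-test-v1-plain.json',
  'v2-plain': './resource/datasource-test-v2-plain.json'
};

function pickSettingsFile(name) {
  const key = name || 'default';
  if (!Object.prototype.hasOwnProperty.call(SETTINGS_FILES, key)) {
    throw new Error('Unknown datasource "' + key + '"; expected one of: ' +
      Object.keys(SETTINGS_FILES).join(', '));
  }
  return SETTINGS_FILES[key];
}

// In test/init.js this could become: var settings = require(pickSettingsFile(...));
const chosen = pickSettingsFile(process.env.ES_TEST_DATASOURCE);
console.log('would load', chosen);
```

With something like this in place, `ES_TEST_DATASOURCE=v2-plain npm test` would select the v2 settings without touching the code.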
- Feel free to contribute via PR, open an issue for discussion, or jump into the gitter chat room if you have ideas.
- I recommend that project contributors who are part of the team:
  - should merge `master` into `develop` if they are behind, before starting the `feature` branch
  - should create `feature` branches from the `develop` branch
  - should merge `feature` into `develop`, then create a `release` branch to:
    - update the changelog
    - close related issues and mention the release version
    - update the readme
    - fix any bugs from final testing
    - commit locally and run `npm-release x.x.x -m "<some comment>"`
    - merge `release` into both `master` and `develop`
    - push `master` and `develop` to GitHub
- For those who use forks:
  - please submit your PR against the `develop` branch, if possible
  - if you must submit your PR against the `master` branch ... I understand and I can't stop you. I only hope that there is a good reason, like `develop` not being up-to-date with `master` for the work you want to build upon.
- `npm-release <versionNumber> -m <commit message>` may be used to publish. Publishing to NPM should happen from the `master` branch. It should ideally only happen when there is something release-worthy. There's no point in publishing just because of changes to the `test` or `examples` folder or any other such entities that aren't part of the "published module" (refer to `.npmignore`) to begin with.
- How do we enable or disable the logs coming from the underlying elasticsearch client? There may be a need to debug/troubleshoot at times.
  - Use the `"log": "trace"` field in your datasources file or omit it. You can refer to the detailed docs here and here
- How do we enable or disable the logs coming from this connector?
  - By default, if you do not set the following env variable, they are disabled: `DEBUG=loopback:connector:elasticsearch`
    - For example, try running tests with and without it, to see the difference:
      - with: `DEBUG=loopback:connector:elasticsearch npm test`
      - without: `npm test`
- What are the tests about? Can you provide a brief overview?
  - Tests are prefixed with `01` or `02` etc. in order to run them in that order by leveraging default alphabetical sorting.
  - The `02.basic-querying.test.js` file uses two models to test various CRUD operations that any connector must provide, like `find(), findById(), findByIds(), updateAttributes()` etc.
    - the two models are `User` and `Customer`
    - their ES mappings are laid out in `test/resource/datasource-test.json`
    - their loopback definitions can be found in the first `before` block that performs setup in the `02.basic-querying.test.js` file; these are the equivalent of a `MyModel.json` in your real loopback app.
      - naturally, this is also where we define which property serves as the `id` for the model and if it's generated or not
- How do we get elasticsearch to take over ID generation?
  - An automatically generated id-like field that is maintained by ES is `_uid`. Without some sort of es-field-level-scripting-on-index (if that is possible at all) ... I am not sure how we could ask elasticsearch to take over auto-generating an id-like value for any arbitrary field! So the connector is set up such that adding `id: {type: String, generated: true, id: true}` will tell it to use `_uid` as the actual field backing the `id` ... you can keep using the `model.id` abstraction and in the background `_uid` values are mapped to it.
  - Will this work for any field marked with `generated: true` and `id: true`?
    - No! The connector isn't coded that way right now ... while it is an interesting idea to couple any such field with ES's `_uid` field inside this connector ... I am not sure if this is the right thing to do. If you had `objectId: {type: String, generated: true, id: true}` then you won't find a real `objectId` field in your ES documents. Would that be ok? Wouldn't that confuse developers who want to write custom queries and run 3rd party apps against their ES instance? "Don't use `objectId`, use `_uid`" would have to be common knowledge. Is that ok?
- Release `1.0.6` of this connector updates the underlying elasticsearch client version to `11.0.1`
- For this connector, you can configure an `index` name for your ES instance and the loopback model's name is conveniently/automatically mapped as the ES `type`.
- Users must set up `string` fields as `not_analyzed` by default for predictable matches, just like other loopback backends. And if more flexibility is required, multi-field mappings can be used too.

      "name" : {
        "type" : "multi_field",
        "fields" : {
          "name" : {"type" : "string", "index" : "not_analyzed"},
          "native" : {"type" : "string", "index" : "analyzed"}
        }
      }
      ...
      // this will treat 'George Harrison' as 'George Harrison' in a search
      User.find({order: 'name'}, function (err, users) { /* ... */ });
      // this will treat 'George Harrison' as two tokens: 'george' and 'harrison' in a search
      User.find({order: 'name', where: {'name.native': 'Harrison'}}, function (err, users) { /* ... */ });

- Release `1.3.4` adds support for updateAll for elasticsearch `v-2.3` and above. To make updateAll work, you will have to add the options below to your `elasticsearch.yml` config file

      script.inline: true
      script.indexed: true
      script.engine.groovy.inline.search: on
      script.engine.groovy.inline.update: on

- TBD