This repo is the source code to my solution for 2017 Zemanta challenge: Real-time Business Data Analytics.
- Elasticsearch for log storage
- Kibana for visualization
- Logstash as a processing pipeline
- Docker Compose for orchestrating the containers
Branches:
master
horizontal-scale
- demonstrating horizontal scalabilty of Elastisearchrollover-shrink
- demonstrating the Elasticsearch rollover and shrink APIs
Since I'm using Docker running the project is just two commands:
docker-compose build
docker-compose up
Before running the up
command it is good to change the ES_JAVA_OPTS
and LS_JAVA_OPTS
environment variables.
ES_JAVA_OPTS
should set the JVM heap size to half of the total memory available on the machine. LS_JAVA_OPTS
should be set to around 15% of the total
memory available.
Once all containers are running, the Elasticsearch API is available at localhost:9200
and Kibana is avilable at localhost:5601
. When running for the first time Kibana takes a couple of minutes to set up.
Switch to branch horizontal-scale
to see how to make Elasticsearch horizontally scalable. The commands slightly change since now we are booting
two additional Elasticsearch nodes.
docker-compose build
docker-compose up --scale elasticsearch-worker=2
Documents can be inserted using the bid_request_generator
available on GitHub. Once set up you can run it with the following command:
./bin/bid_request_generator 10000 1 | nc localhost 5000
We pipe the output from the generator to localhost:5000
since that is where Logstash is listening for input.
The suggested queries are available in the queries.txt file.
Each query can be copied to Kibana DevTools and ran from there (including the GET _search
header). DevTools are available on this link.
You can also query using cURL or any programming language with an HTTP library. An example of a cURL query:
curl -XGET 'localhost:9200/_search' -H 'Content-Type: application/json' -d'
{
... insert query here ...
}
'
Or using Python and requests library:
import requests
import json
query = { ... insert query here ... }
resp = requests.get("http://localhost:9200/_search", data=json.dumps(query), headers={'Content-Type': 'application/json'})
print(resp.json())