Elastic stack (ELK) on Docker for Running through Assorted Elasticsearch Tutorials

README notes are included for most of the tutorials I have completed, along with the links and data that go with them.

Tutorials followed:

ELK Stack instructions from deviantony

Based on the official Docker images from Elastic:

Requirements
- Host setup
Usage
- Bringing up the stack
- Initial setup
Configuration
Storage
- How can I persist Elasticsearch data?
Extensibility
- How can I add plugins?
- How can I enable the provided extensions?
Going further
- Docker Swarm
Wolff Notes

Requirements

Host setup

Install Docker version 17.05+
Install Docker Compose version 1.6.0+
Clone this repository

Usage

Bringing up the stack

Note: If you switched branches or updated a base image, you may need to run docker-compose build first.

Start the stack using docker-compose:

$ docker-compose up

You can also run all services in the background (detached mode) by adding the -d flag to the above command.

Give Kibana a few seconds to initialize, then access the Kibana web UI by hitting http://localhost:5601 with a web browser.
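
If the startup is scripted, a short polling loop can stand in for the manual wait. This is a minimal sketch, assuming curl is installed; the helper name wait_for_url, the retry count, and the delay are illustrative choices and not part of this repository.

```shell
# Poll a URL until it answers or the attempts run out.
# wait_for_url is a hypothetical helper, not part of this repository.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() (
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --silent hides progress output; --fail turns HTTP errors (>= 400) into a non-zero exit.
    if curl --silent --fail --output /dev/null "$url"; then
      echo "ready: $url"
      exit 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out: $url" >&2
  exit 1
)

# Example (needs the stack to be running):
#   wait_for_url http://localhost:5601 && echo "Kibana is reachable"
```

The function body runs in a subshell, so its variables do not leak into the calling shell.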

By default, the stack exposes the following ports:

5000: Logstash TCP input
9200: Elasticsearch HTTP
9300: Elasticsearch TCP transport
5601: Kibana

Now that the stack is running, you will want to load some sample data using the bulk API:

Download the data files

The complete works of William Shakespeare, suitably parsed into fields. Download shakespeare.json.
A set of fictitious accounts with randomly generated data. Download accounts.zip.
A set of randomly generated log files. Download logs.jsonl.gz.
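
The bulk import commands further down read accounts.json and logs.jsonl, so the two archives need to be unpacked first. A minimal sketch, assuming unzip and gunzip are available and the downloads sit in one directory; unpack_datasets is a hypothetical helper name.

```shell
# Unpack the downloaded archives in a directory (default: current directory).
# unpack_datasets is a hypothetical helper, not part of this repository.
unpack_datasets() (
  cd "${1:-.}" || exit 1
  # accounts.zip is expected to contain accounts.json; -o overwrites without prompting.
  if [ -f accounts.zip ]; then
    unzip -o accounts.zip
  fi
  # gunzip replaces logs.jsonl.gz with logs.jsonl; -f overwrites an existing file.
  if [ -f logs.jsonl.gz ]; then
    gunzip -f logs.jsonl.gz
  fi
)

# Example:
#   unpack_datasets ~/Downloads
```

Archives that are not present are skipped, so the helper can be run again safely.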

Create the index mappings in the Kibana console. The doc and log mapping types below follow the Elasticsearch 6.x format:

PUT /shakespeare
{
  "mappings": {
    "doc": {
      "properties": {
        "speaker": {"type": "keyword"},
        "play_name": {"type": "keyword"},
        "line_id": {"type": "integer"},
        "speech_number": {"type": "integer"}
      }
    }
  }
}

PUT /logstash-2015.05.18
{
  "mappings": {
    "log": {
      "properties": {
        "geo": {
          "properties": {
            "coordinates": {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

PUT /logstash-2015.05.19
{
  "mappings": {
    "log": {
      "properties": {
        "geo": {
          "properties": {
            "coordinates": {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

PUT /logstash-2015.05.20
{
  "mappings": {
    "log": {
      "properties": {
        "geo": {
          "properties": {
            "coordinates": {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}
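
The three Logstash index requests above differ only in the date suffix, so from a shell they can be issued in a loop instead of the Kibana console. This is a sketch, assuming Elasticsearch listens on localhost:9200 as in the port list; the function names and the ES_URL variable are illustrative.

```shell
# Print the mapping body shared by the three Logstash indices.
log_mapping_body() {
  cat <<'EOF'
{"mappings":{"log":{"properties":{"geo":{"properties":{"coordinates":{"type":"geo_point"}}}}}}}
EOF
}

# Create one index per date suffix with the geo_point mapping.
# ES_URL is an assumed variable name; set it if Elasticsearch runs elsewhere.
create_log_indices() {
  for day in "$@"; do
    curl --silent --fail -XPUT "${ES_URL:-http://localhost:9200}/logstash-${day}" \
      -H 'Content-Type: application/json' \
      -d "$(log_mapping_body)" || return 1
    echo
  done
}

# Example (needs the stack to be running):
#   create_log_indices 2015.05.18 2015.05.19 2015.05.20
```

Because of --fail, the loop stops at the first request Elasticsearch rejects, for example when an index already exists.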

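Run the bulk imports into Elasticsearch. The file names must match your unpacked downloads, so adjust them if, for example, your Shakespeare file is named shakespeare.json rather than shakespeare_6.0.json: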

curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @accounts.json
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/shakespeare/doc/_bulk?pretty' --data-binary @shakespeare_6.0.json
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/_bulk?pretty' --data-binary @logs.jsonl
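
Each file passed to the bulk endpoint is newline-delimited JSON: an action line, then the document itself, with a newline after the last line. The sketch below writes a tiny hand-made example and adds a helper that lists the indices so the imports can be checked. The index name bulk-demo, the file name sample_bulk.json, the sample documents, and the ES_URL variable are made up for illustration.

```shell
# Write a two-document bulk file; every document is preceded by an action line.
# bulk-demo and sample_bulk.json are hypothetical names used only in this sketch.
write_sample_bulk() {
  cat > "${1:-sample_bulk.json}" <<'EOF'
{"index":{"_index":"bulk-demo","_type":"doc","_id":"1"}}
{"speaker":"HAMLET","line_id":1}
{"index":{"_index":"bulk-demo","_type":"doc","_id":"2"}}
{"speaker":"OPHELIA","line_id":2}
EOF
}

# List all indices with their document counts to confirm the imports worked.
list_indices() {
  curl --silent "${ES_URL:-http://localhost:9200}/_cat/indices?v"
}

# Example (needs the stack to be running):
#   write_sample_bulk sample_bulk.json
#   curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/_bulk?pretty' --data-binary @sample_bulk.json
#   list_indices
```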

Initial setup

Default Kibana index pattern creation

When Kibana launches for the first time, it is not configured with any index pattern.

Via the Kibana web UI

NOTE: You need to inject data into Logstash before you can configure a Logstash index pattern via the Kibana web UI. Once data is present, all you have to do is hit the Create button.

Refer to Connect Kibana with Elasticsearch for detailed instructions about the index pattern configuration.

On the command line

Create an index pattern via the Kibana API. The kbn-version header should match the version of Kibana you are running (6.5.4 in this example):

$ curl -XPOST -D- 'http://localhost:5601/api/saved_objects/index-pattern' \
    -H 'Content-Type: application/json' \
    -H 'kbn-version: 6.5.4' \
    -d '{"attributes":{"title":"logstash-*","timeFieldName":"@timestamp"}}'

The created pattern will automatically be marked as the default index pattern as soon as the Kibana UI is opened for the first time.
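
The same request can be parameterised to create patterns for the other tutorial indices, which have no time field. A sketch under assumptions: the function names and the KIBANA_URL and KIBANA_VERSION variables are hypothetical, and the version default is copied from the command above.

```shell
# Build the JSON body for an index pattern; the time field is optional.
index_pattern_body() {
  if [ -n "${2:-}" ]; then
    printf '{"attributes":{"title":"%s","timeFieldName":"%s"}}' "$1" "$2"
  else
    printf '{"attributes":{"title":"%s"}}' "$1"
  fi
}

# POST the pattern to the Kibana saved objects API.
# KIBANA_URL and KIBANA_VERSION are assumed variable names with defaults from this README.
create_index_pattern() {
  curl -XPOST -D- "${KIBANA_URL:-http://localhost:5601}/api/saved_objects/index-pattern" \
    -H 'Content-Type: application/json' \
    -H "kbn-version: ${KIBANA_VERSION:-6.5.4}" \
    -d "$(index_pattern_body "$1" "${2:-}")"
}

# Example (needs the stack to be running):
#   create_index_pattern 'logstash-*' '@timestamp'
#   create_index_pattern 'shakespeare*'
#   create_index_pattern 'bank*'
```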

Configuration

NOTE: Configuration is not dynamically reloaded. You will need to restart the stack after any change to the configuration of a component.

How can I tune the Kibana configuration?

The Kibana default configuration is stored in kibana/config/kibana.yml.

It is also possible to map the entire config directory instead of a single file.

How can I tune the Logstash configuration?

The Logstash configuration is stored in logstash/config/logstash.yml.

It is also possible to map the entire config directory instead of a single file. In that case, be aware that Logstash expects a log4j2.properties file for its own logging.

How can I tune the Elasticsearch configuration?

The Elasticsearch configuration is stored in elasticsearch/config/elasticsearch.yml.

You can also specify the options you want to override directly via environment variables:

elasticsearch:

  environment:
    network.host: "_non_loopback_"
    cluster.name: "my-cluster"

How can I scale out the Elasticsearch cluster?

Follow the instructions from the Wiki: Scaling out Elasticsearch

Storage

How can I persist Elasticsearch data?

The data stored in Elasticsearch will be persisted after container reboot but not after container removal.

In order to persist Elasticsearch data even after removing the Elasticsearch container, you'll have to mount a volume on your Docker host. Update the elasticsearch service declaration to:

elasticsearch:

  volumes:
    - /path/to/storage:/usr/share/elasticsearch/data

This will store Elasticsearch data inside /path/to/storage.

NOTE: beware of these OS-specific considerations:

Linux: the unprivileged elasticsearch user is used within the Elasticsearch image, therefore the mounted data directory must be owned by the uid 1000.
macOS: the default Docker for Mac configuration allows mounting files from /Users/, /Volumes/, /private/, and /tmp exclusively. Follow the instructions from the documentation to add more locations.
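
On Linux, the ownership requirement can be checked before the stack is started. A sketch, assuming a POSIX ls and awk; owner_uid and check_data_dir are hypothetical helpers, and the suggested chown is only printed, never run.

```shell
# Print the numeric uid that owns a file or directory.
# ls -n shows numeric ids; -d lists the directory itself rather than its contents.
owner_uid() {
  ls -ldn "$1" | awk '{print $3}'
}

# Report whether a data directory is owned by uid 1000, as the Elasticsearch image expects.
check_data_dir() {
  if [ "$(owner_uid "$1")" = "1000" ]; then
    echo "ok: $1 is owned by uid 1000"
  else
    echo "fix with: sudo chown -R 1000 $1"
    return 1
  fi
}

# Example:
#   check_data_dir /path/to/storage
```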

Extensibility

How can I add plugins?

To add plugins to any ELK component you have to:

Add a RUN statement to the corresponding Dockerfile (e.g. RUN logstash-plugin install logstash-filter-json)
Add the associated plugin configuration to the service configuration (e.g. Logstash input/output)
Rebuild the images using the docker-compose build command

How can I enable the provided extensions?

A few extensions are available inside the extensions directory. These extensions provide features which are not part of the standard Elastic stack, but can be used to enrich it with extra integrations.

The documentation for these extensions is provided inside each individual subdirectory, on a per-extension basis. Some of them require manual changes to the default ELK configuration.

Going further

Plugins and integrations

See the following Wiki pages:

Docker Swarm

Experimental support for Docker Swarm is provided in the form of a docker-stack.yml file, which can be deployed in an existing Swarm cluster using the following command:

$ docker stack deploy -c docker-stack.yml elk

If all components get deployed without any error, the following command will show 3 running services:

$ docker stack services elk

NOTE: to scale Elasticsearch in Swarm mode, configure Zen discovery to use the DNS name tasks.elasticsearch instead of elasticsearch.

Wolff Notes

Using elasticdump to import files into Elasticsearch.

Import local mappings into Elasticsearch:

docker run -v "$(pwd)/data:/temp/" --network docker-elk_elk taskrabbit/elasticsearch-dump \
--input=/temp/mapping.json \
--output=http://elasticsearch:9200/objects_050219 \
--type=mapping
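
elasticdump takes the same invocation for documents, with --type=data in place of --type=mapping. The sketch below only assembles the command so it can be inspected before it is run. The file name data.json is hypothetical, the index and network names are copied from the command above, and the --rm flag is an addition here that removes the container once it exits.

```shell
# Print the elasticdump command for a given type (mapping or data) and input file.
# Nothing is executed here; review the output before running it.
# Usage: elasticdump_cmd KIND FILE [INDEX]
elasticdump_cmd() {
  kind=$1
  file=$2
  index=${3:-objects_050219}
  printf 'docker run --rm -v "%s/data:/temp/" --network docker-elk_elk taskrabbit/elasticsearch-dump --input=/temp/%s --output=http://elasticsearch:9200/%s --type=%s\n' \
    "$(pwd)" "$file" "$index" "$kind"
}

# Example: import the mapping first, then the documents.
#   elasticdump_cmd mapping mapping.json
#   elasticdump_cmd data data.json
```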

Whatapalaver/docker_elk