Slack bot that listens for commands from Slack users to interact with Elasticsearch and silence / acknowledge alerts from ElastAlert

Primary language: Python. License: MIT.

Elastabot

Slack bot that listens for commands from Slack users to:

  • Search the Elasticsearch cluster
  • Query the health status of the Elasticsearch cluster
  • Acknowledge an alert created by elastalert
  • Triage alerts or arbitrary issues

MUST READ: Classic Slack bot is required!

Slack provides a special link to create a classic bot in your community: https://api.slack.com/apps?new_classic_app=1

Elastabot will only work with classic bots. Use the link above to create the new classic bot, but do not click the button to switch to granular permissions! Switching to granular permissions will effectively upgrade the classic bot to a modern Slack bot and you'll have to start over.

Once you create the classic bot and install it into your workspace you'll see a screen presenting two tokens. Use the bottom "Bot User" token for the configuration parameter mentioned below. Do not use the OAuth Token.

Step-by-step instructions:

  1. Click the special link to create a classic bot (shown above)
  2. A popup form will appear entitled "Create a Slack App (Classic)"
  3. Enter a name, such as "Elastabot", and choose your Slack community from the drop-down and then click the "Create App" button.
  4. Click the "Bots" box.
  5. Click "Add Legacy Bot User"
  6. Enter a name for the bot, such as "Elastabot", enter a username, such as "elastabot" (lowercase), and click the "Add" button.
  7. Click "Install App" from the left-side menu.
  8. Click "Install App to Workspace".
  9. Click "Allow" on the permission confirmation screen.
  10. Copy the bottom "Bot User OAuth Access Token" for use during installation of Elastabot.

Search

Slack users can search the Elasticsearch cluster for arbitrary search criteria, using the Lucene syntax. This can be useful for maintaining a history of searches, but needs to be used with caution. Certain Slack communities with public access should not enable this feature if the Elasticsearch cluster contains sensitive data.

Examples:

  • Generic search across all indices and fields
!search hello

Response:

Found 672664 matching record(s), showing 1:
index: myindex-2018.05.19
 @timestamp: 2018-05-19T14:12:52.141Z
 message: hello
 @version: 1
  • Provide 3 most recent records that match this search
!search hello|3
  • Specific index and field match:
!search _index:"myindex-*" message:hello

Cluster Health

Display the current health state of the Elasticsearch cluster, including number of active nodes, queue wait times, and more.

Example:

  • Query the cluster health:
!health

Response:

Elasticsearch Cluster Health -> yellow
                    Name: docker-cluster
                   Nodes: 1
              Data Nodes: 1
           Active Shards: 817 (50%)
     Initializing Shards: 0
       Unassigned Shards: 795
           Pending Tasks: 0
          Inflight Tasks: 0
       Max Queue Time MS: 0
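
The response above resembles the fields returned by the Elasticsearch cluster health API. As a rough sketch, a response dictionary could be rendered into that layout as follows. The function name `format_health`, the mapping of "Inflight Tasks" to `number_of_in_flight_fetch`, and the truncation of the percentage are assumptions; the sample values are illustrative only.

```python
# Illustrative sample shaped like a cluster health API response.
SAMPLE = {
    "cluster_name": "docker-cluster",
    "status": "yellow",
    "number_of_nodes": 1,
    "number_of_data_nodes": 1,
    "active_shards": 817,
    "active_shards_percent_as_number": 50.68,
    "initializing_shards": 0,
    "unassigned_shards": 795,
    "number_of_pending_tasks": 0,
    "number_of_in_flight_fetch": 0,
    "task_max_waiting_in_queue_millis": 0,
}


def format_health(health):
    """Render a cluster health dict as right-aligned 'label: value' lines."""
    # Assumption: the percentage is truncated to a whole number.
    percent = int(health["active_shards_percent_as_number"])
    rows = [
        ("Name", health["cluster_name"]),
        ("Nodes", health["number_of_nodes"]),
        ("Data Nodes", health["number_of_data_nodes"]),
        ("Active Shards", "{} ({}%)".format(health["active_shards"], percent)),
        ("Initializing Shards", health["initializing_shards"]),
        ("Unassigned Shards", health["unassigned_shards"]),
        ("Pending Tasks", health["number_of_pending_tasks"]),
        # Assumption: "Inflight Tasks" corresponds to in-flight fetches.
        ("Inflight Tasks", health["number_of_in_flight_fetch"]),
        ("Max Queue Time MS", health["task_max_waiting_in_queue_millis"]),
    ]
    lines = ["Elasticsearch Cluster Health -> " + health["status"]]
    # Right-align each label in a 24-character column.
    lines += ["{:>24}: {}".format(label, value) for label, value in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_health(SAMPLE))
```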

Acknowledge Elastalerts

When told to ack an alert generated by Elastalert, Elastabot will look for the alert and silence it by creating a silence document in the appropriate Elasticsearch index. Additionally, if the ack command includes a question mark, ?, then the alert will be sent through the triage process. The question mark symbolizes that there are unanswered questions related to the alert and therefore the alert needs to be triaged.

Deadman Switch rules are typically used in an inverse pattern, where the rule is always alerting, to show that the alerting system is working. Therefore, any rule beginning with "Deadman" will automatically be excluded from the ack command.

NOTE: Alert names provided in the command argument are searched as-is. The only character replacement occurs on the space character, which is escaped before the query is sent to Elasticsearch. This means that acknowledging an alert for a rule name provided as My rule* will include the asterisk in the search without escaping it, and so Elasticsearch will apply the wildcard match. It is important to consider this raw behavior of the query before exposing this bot to a public-facing Slack community. No additional input filtering/cleansing is currently implemented.
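
To make the escaping behavior described in the note concrete, here is a small sketch of a query builder that escapes only spaces. The function name `build_rule_query` and the `rule_name` field are assumptions for illustration; the actual query Elastabot sends may be constructed differently.

```python
def build_rule_query(rule_name):
    """Build a Lucene query for a rule name, escaping only spaces.

    Hypothetical helper: mirrors the behavior described in the note, so
    wildcards and other Lucene operators pass through unchanged.
    """
    escaped = rule_name.replace(" ", "\\ ")
    # Assumption: alerts are matched on a field called rule_name.
    return "rule_name:" + escaped


if __name__ == "__main__":
    print(build_rule_query("IDS Offline"))  # space is escaped
    print(build_rule_query("My rule*"))     # asterisk is left as a wildcard
```

With this approach, `My rule*` becomes `rule_name:My\ rule*`, so the trailing asterisk still acts as a wildcard when Elasticsearch parses the query.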

Examples:

  • Acknowledge the most recently triggered alert and start the triage process:
!ack ?

Response:

Acknowledged alert *IDS Offline* until 2018-05-18 16:59:13.595827 UTC
Triage process has started
  • Acknowledge the most recently triggered alert for rule IDS Offline for the next 2 hours (no triage in this example):
!ack IDS Offline|120
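
The ack examples above combine three pieces of information: an optional rule name, an optional silence duration in minutes, and an optional question mark that requests triage. One way this parsing could look is sketched below. The names `parse_ack` and `silence_until` are hypothetical, and the default of 240 minutes simply reuses the `elastalert.silenceMinutes` value from the example configuration.

```python
from datetime import datetime, timedelta, timezone


def parse_ack(args, default_minutes=240):
    """Split ack arguments into (rule_name, minutes, triage).

    Hypothetical helper. An empty rule_name stands for "the most recently
    triggered alert". A rule name that itself contains '?' is not handled.
    """
    triage = "?" in args
    text = args.replace("?", "").strip()
    name, sep, minutes_text = text.rpartition("|")
    if sep and minutes_text.strip().isdigit():
        minutes = int(minutes_text)
    else:
        # No usable duration given, so fall back to the configured default.
        name, minutes = text, default_minutes
    return name.strip(), minutes, triage


def silence_until(minutes, now=None):
    """Return the UTC time at which the silence would expire."""
    now = now or datetime.now(timezone.utc)
    return now + timedelta(minutes=minutes)


if __name__ == "__main__":
    print(parse_ack("?"))
    print(parse_ack("IDS Offline|120"))
    print(silence_until(120))
```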

Triage

Elastabot supports a simple triage command. Currently, triage only sends an email over SMTP. This is useful for pushing issues into a ticketing system, such as Atlassian's JIRA, and avoids the complexities of integrating directly with such tools, for example handling downtime of the tool itself or the licensing cost of additional service users.

Triage is included with the !ack command, provided a question mark is added as an argument. However, to explicitly start the triage process for an arbitrary topic, without an associated alert, use the !triage command.

Examples:

  • Start the triage process to investigate a drop in user logins:
!triage Unexpected drop in user logins

Response:

Triage process has started
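
Since triage is described as the generation of an SMTP email, the following sketch shows how such a message might be assembled and sent with the Python standard library. The function names, the `requested_by` parameter, and the message body are hypothetical; only the `smtp` configuration keys come from the configuration section of this document.

```python
import smtplib
from email.message import EmailMessage


def build_triage_email(smtp_cfg, topic, requested_by):
    """Create the triage email from the 'smtp' configuration block."""
    msg = EmailMessage()
    msg["From"] = smtp_cfg["from"]
    msg["To"] = smtp_cfg["to"]
    msg["Subject"] = smtp_cfg.get("subjectPrefix", "") + topic
    # Assumption: the body wording is invented for this example.
    msg.set_content(
        "Triage requested by {} via Slack.\n\nTopic: {}\n".format(requested_by, topic)
    )
    return msg


def send_triage_email(smtp_cfg, msg, username=None, password=None):
    """Send the message, honoring the secure and starttls settings."""
    smtp_class = smtplib.SMTP_SSL if smtp_cfg.get("secure") else smtplib.SMTP
    timeout = smtp_cfg.get("timeoutSeconds", 4)
    with smtp_class(smtp_cfg["host"], smtp_cfg["port"], timeout=timeout) as server:
        if smtp_cfg.get("starttls"):
            server.starttls()
        if username and password:
            server.login(username, password)
        server.send_message(msg)


if __name__ == "__main__":
    cfg = {
        "from": "engineering_team@mycompany.invalid",
        "to": "jira@mycompany.atlassian.net",
        "subjectPrefix": "[mini] ",
    }
    print(build_triage_email(cfg, "Unexpected drop in user logins", "some_user"))
```

Keeping message construction separate from delivery makes the email content easy to check without a reachable SMTP server.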

Slack Commands

Users interact with Elastabot through the Slack interface. A Slack community admin will need to register a bot for Elastabot and provide the bot token that Elastabot needs to connect to Slack. Once Elastabot connects to Slack, users can invite the bot into one or more channels, or send direct messages to interact with Elastabot.

Commands

To see a list of available commands, users can type:

!help

Or, to get detailed help on a specific command, users can type:

!ack help
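
As an illustration of how a Slack message might be recognized as a command, the sketch below strips the command prefix and separates the command name from its arguments. The function name `parse_command` is hypothetical and the real dispatch logic may differ; the `!` default matches the `commandPrefix` value shown in the example configuration.

```python
def parse_command(message, prefix="!"):
    """Return (command, args) for a prefixed message, or None otherwise."""
    if not message.startswith(prefix):
        # Ordinary chat message, not addressed to the bot.
        return None
    parts = message[len(prefix):].split(None, 1)
    if not parts:
        # The message was only the prefix.
        return None
    command = parts[0].lower()
    args = parts[1].strip() if len(parts) > 1 else ""
    return command, args


if __name__ == "__main__":
    print(parse_command("!help"))
    print(parse_command("!ack help"))
    print(parse_command("just chatting"))
```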

Configuration

Elastabot expects two groups of configuration inputs.

  1. JSON configuration file with non-sensitive values
  2. Environment variables with sensitive values

JSON Configuration

An example configuration file is shown below, followed by descriptions of each setting.

{
  "elasticsearch": {
    "host": "elasticsearch",
    "port": 9200,
    "sslEnabled": false,
    "sslStrictEnabled": false,
    "timeoutSeconds": 10,
    "urlPrefix":""
  },
  "elastalert": {
    "index": "elastalert_status",
    "silenceMinutes": 240,
    "recentMinutes": 4320
  },
  "smtp": {
    "host": "email-smtp.us-east-1.amazonaws.com",
    "port": 587,
    "secure": false,
    "starttls": true,
    "timeoutSeconds": 4,
    "to": "jira@mycompany.atlassian.net",
    "from": "engineering_team@mycompany.invalid",
    "subjectPrefix": "[mini] ",
    "debug": false
  },
  "commandPrefix": "!",
  "triageTarget": "smtp",
  "searchEnabled": true,
  "debug": false
}

| setting | description |
| --- | --- |
| elasticsearch.host | Hostname for the Elasticsearch server |
| elasticsearch.port | Port for the Elasticsearch server |
| elasticsearch.sslEnabled | If true, uses SSL/TLS to connect to Elasticsearch |
| elasticsearch.sslStrictEnabled | If true, the SSL/TLS certificates will be validated against known certificate authorities |
| elasticsearch.timeoutSeconds | Number of seconds to wait for an Elasticsearch response |
| elasticsearch.urlPrefix | URL prefix for Elasticsearch, typically an empty string |
| elastalert.index | The index prefix used by Elastalert within Elasticsearch, typically elastalert or elastalert_status |
| elastalert.silenceMinutes | Number of minutes to silence an acknowledged alert if a silence duration is not explicitly given with the ack command |
| elastalert.recentMinutes | Number of minutes to look back in history for a fired alert in the Elasticsearch index |
| smtp.host | Hostname for the SMTP server |
| smtp.port | Port for the SMTP server |
| smtp.secure | If true, will connect to the SMTP host over SSL/TLS |
| smtp.starttls | If true, will send the starttls command (typically not used with smtp.secure=true) |
| smtp.timeoutSeconds | Number of seconds to wait for the SMTP server to respond |
| smtp.to | Email address that will receive the triage email |
| smtp.from | Sender email address |
| smtp.subjectPrefix | If a non-empty string, will be prepended to each email subject |
| smtp.debug | If true, the SMTP connectivity details will be logged to stdout |
| commandPrefix | Special character or phrase to trigger the bot, typically an exclamation point, !. Ex: !ack |
| triageTarget | How to initiate the triage process. Currently only smtp is supported. |
| searchEnabled | Allows generic, arbitrary searching of the Elasticsearch cluster. Should not be enabled if sensitive information is stored in the cluster. |
| debug | If true, will output debug logging to help troubleshoot connectivity problems |
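
A configuration file like the one above can be read with the standard `json` module. The sketch below is one possible approach, not the project's actual loader: the function names and the fallback values in `DEFAULTS` are assumptions, and search is disabled by default here purely as a conservative choice for the example.

```python
import json

# Assumed fallbacks for top-level keys that are missing from the file.
DEFAULTS = {
    "commandPrefix": "!",
    "triageTarget": "smtp",
    "searchEnabled": False,
    "debug": False,
}


def apply_defaults(config):
    """Fill in missing top-level keys and validate triageTarget."""
    for key, value in DEFAULTS.items():
        config.setdefault(key, value)
    if config["triageTarget"] != "smtp":
        raise ValueError("only the smtp triage target is supported")
    return config


def load_config(path):
    """Read the JSON configuration file from disk."""
    with open(path, encoding="utf-8") as handle:
        return apply_defaults(json.load(handle))


def elasticsearch_url(es_cfg):
    """Build a base URL from the 'elasticsearch' configuration block."""
    scheme = "https" if es_cfg.get("sslEnabled") else "http"
    prefix = es_cfg.get("urlPrefix", "").strip("/")
    base = "{}://{}:{}".format(scheme, es_cfg["host"], es_cfg["port"])
    return base + ("/" + prefix if prefix else "")


if __name__ == "__main__":
    cfg = load_config("/opt/elastabot/elastabot.json")
    print(elasticsearch_url(cfg["elasticsearch"]))
```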

Environment Variables

The following environment variables are used as inputs for sensitive information.

| variable | required | description |
| --- | --- | --- |
| SLACK_BOT_TOKEN | true | The Slack-generated bot token, provided by slack.com |
| ELASTICSEARCH_USERNAME | false | Optional Elasticsearch username, provided by your ES admin |
| ELASTICSEARCH_PASSWORD | false | Optional Elasticsearch password, provided by your ES admin |
| SMTP_USERNAME | false | Optional SMTP username, provided by your SMTP admin |
| SMTP_PASSWORD | false | Optional SMTP password, provided by your SMTP admin |
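
For illustration, the environment variables above could be collected as follows, failing early when the required token is absent. The function names and the returned dictionary keys are hypothetical; only the variable names come from the table.

```python
import os


def _pair(environ, user_key, password_key):
    """Return (username, password) only when both values are set."""
    user, password = environ.get(user_key), environ.get(password_key)
    return (user, password) if user and password else None


def read_secrets(environ=os.environ):
    """Collect sensitive settings from environment variables."""
    token = environ.get("SLACK_BOT_TOKEN")
    if not token:
        raise RuntimeError("SLACK_BOT_TOKEN is required")
    return {
        "slack_bot_token": token,
        "es_auth": _pair(environ, "ELASTICSEARCH_USERNAME", "ELASTICSEARCH_PASSWORD"),
        "smtp_auth": _pair(environ, "SMTP_USERNAME", "SMTP_PASSWORD"),
    }


if __name__ == "__main__":
    secrets = read_secrets()
    print("Elasticsearch auth configured:", secrets["es_auth"] is not None)
```

Passing the mapping as a parameter keeps the function easy to exercise with a plain dictionary instead of the real process environment.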

Docker

A Dockerfile is provided for Elastabot, and a Docker image will auto-build at hub.docker.com/jertel/elastabot.

The image will expect a configuration file to exist in the /opt/elastabot/elastabot.json location, so the recommended way to configure Elastabot is to use a file-based volume mount override from the host to this location.

Ex:

docker run --rm -v /host/path/elastabot.json:/opt/elastabot/elastabot.json jertel/elastabot

Kubernetes

See the Helm chart README.md for information on installing this application into an existing Kubernetes cluster.