/elasticsearch_exporter

Elasticsearch stats exporter for Prometheus

Primary LanguageGoApache License 2.0Apache-2.0

Elasticsearch Exporter Build Status

Docker Pulls Go Report Card

Prometheus exporter for various metrics about ElasticSearch, written in Go.

Installation

For pre-built binaries please take a look at the releases. https://github.com/justwatchcom/elasticsearch_exporter/releases

Docker

docker pull justwatch/elasticsearch_exporter:1.1.0
docker run --rm -p 9114:9114 justwatch/elasticsearch_exporter:1.1.0

Example docker-compose.yml:

elasticsearch_exporter:
    image: justwatch/elasticsearch_exporter:1.1.0
    command:
     - '--es.uri=http://elasticsearch:9200'
    restart: always
    ports:
    - "127.0.0.1:9114:9114"

Kubernetes

You can find a helm chart in the stable charts repository at https://github.com/kubernetes/charts/tree/master/stable/elasticsearch-exporter.

Configuration

NOTE: The exporter fetches information from an ElasticSearch cluster on every scrape, therefore having a too short scrape interval can impose load on ES master nodes, particularly if you run with --es.all and --es.indices. We suggest you measure how long fetching /_nodes/stats and /_all/_stats takes for your ES cluster to determine whether your scraping interval is too short. As a last resort, you can scrape this exporter using a dedicated job with its own scraping interval.

Below is the command line options summary:

elasticsearch_exporter --help
Argument Introduced in Version Description Default
es.uri 1.0.2 Address (host and port) of the Elasticsearch node we should connect to. This could be a local node (localhost:9200, for instance), or the address of a remote Elasticsearch server. When basic auth is needed, specify as: <proto>://<user>:<password>@<host>:<port>. E.G., http://admin:pass@localhost:9200. Special characters in the user credentials need to be URL-encoded. http://localhost:9200
es.all 1.0.2 If true, query stats for all nodes in the cluster, rather than just the node we connect to. false
es.cluster_settings 1.1.0rc1 If true, query stats for cluster settings. false
es.indices 1.0.2 If true, query stats for all indices in the cluster. false
es.indices_settings 1.0.4rc1 If true, query settings stats for all indices in the cluster. false
es.shards 1.0.3rc1 If true, query stats for all indices in the cluster, including shard-level stats (implies es.indices=true). false
es.snapshots 1.0.4rc1 If true, query stats for the cluster snapshots. false
es.timeout 1.0.2 Timeout for trying to get stats from Elasticsearch. (ex: 20s) 5s
es.ca 1.0.2 Path to PEM file that contains trusted Certificate Authorities for the Elasticsearch connection.
es.client-private-key 1.0.2 Path to PEM file that contains the private key for client auth when connecting to Elasticsearch.
es.client-cert 1.0.2 Path to PEM file that contains the corresponding cert for the private key to connect to Elasticsearch.
es.clusterinfo.interval 1.1.0rc1 Cluster info update interval for the cluster label 5m
es.ssl-skip-verify 1.0.4rc1 Skip SSL verification when connecting to Elasticsearch. false
web.listen-address 1.0.2 Address to listen on for web interface and telemetry. :9114
web.telemetry-path 1.0.2 Path under which to expose metrics. /metrics
version 1.0.2 Show version info on stdout and exit.

Commandline parameters start with a single - for versions less than 1.1.0rc1. For versions greater than 1.1.0rc1, commandline parameters are specified with --. Also, all commandline parameters can be provided as environment variables. The environment variable name is derived from the parameter name by replacing . and - with _ and upper-casing the parameter name.

Elasticsearch 7.x security privileges

ES 7.x supports RBACs. The following security privileges are required for the elasticsearch_exporter.

Setting Privilege Required Description
exporter defaults cluster monitor All cluster read-only operations, like cluster health and state, hot threads, node info, node and cluster stats, and pending cluster tasks.
es.cluster_settings cluster monitor
es.indices indices monitor (per index or *) All actions that are required for monitoring (recovery, segments info, index stats and status)
es.indices_settings indices monitor (per index or *)
es.shards not sure if indices or cluster monitor or both
es.snapshots cluster:admin/snapshot/status and cluster:admin/repository/get ES Forum Post

Further Information

Metrics

Name Type Cardinality Help
elasticsearch_breakers_estimated_size_bytes gauge 4 Estimated size in bytes of breaker
elasticsearch_breakers_limit_size_bytes gauge 4 Limit size in bytes for breaker
elasticsearch_breakers_tripped counter 4 tripped for breaker
elasticsearch_cluster_health_active_primary_shards gauge 1 The number of primary shards in your cluster. This is an aggregate total across all indices.
elasticsearch_cluster_health_active_shards gauge 1 Aggregate total of all shards across all indices, which includes replica shards.
elasticsearch_cluster_health_delayed_unassigned_shards gauge 1 Shards delayed to reduce reallocation overhead
elasticsearch_cluster_health_initializing_shards gauge 1 Count of shards that are being freshly created.
elasticsearch_cluster_health_number_of_data_nodes gauge 1 Number of data nodes in the cluster.
elasticsearch_cluster_health_number_of_in_flight_fetch gauge 1 The number of ongoing shard info requests.
elasticsearch_cluster_health_number_of_nodes gauge 1 Number of nodes in the cluster.
elasticsearch_cluster_health_number_of_pending_tasks gauge 1 Cluster level changes which have not yet been executed
elasticsearch_cluster_health_task_max_waiting_in_queue_millis gauge 1 Max time in millis that a task is waiting in queue.
elasticsearch_cluster_health_relocating_shards gauge 1 The number of shards that are currently moving from one node to another node.
elasticsearch_cluster_health_status gauge 3 Whether all primary and replica shards are allocated.
elasticsearch_cluster_health_timed_out gauge 1 Number of cluster health checks timed out
elasticsearch_cluster_health_unassigned_shards gauge 1 The number of shards that exist in the cluster state, but cannot be found in the cluster itself.
elasticsearch_filesystem_data_available_bytes gauge 1 Available space on block device in bytes
elasticsearch_filesystem_data_free_bytes gauge 1 Free space on block device in bytes
elasticsearch_filesystem_data_size_bytes gauge 1 Size of block device in bytes
elasticsearch_filesystem_io_stats_device_operations_count gauge 1 Count of disk operations
elasticsearch_filesystem_io_stats_device_read_operations_count gauge 1 Count of disk read operations
elasticsearch_filesystem_io_stats_device_write_operations_count gauge 1 Count of disk write operations
elasticsearch_filesystem_io_stats_device_read_size_kilobytes_sum gauge 1 Total kilobytes read from disk
elasticsearch_filesystem_io_stats_device_write_size_kilobytes_sum gauge 1 Total kilobytes written to disk
elasticsearch_indices_docs gauge 1 Count of documents on this node
elasticsearch_indices_docs_deleted gauge 1 Count of deleted documents on this node
elasticsearch_indices_docs_primary gauge Count of documents with only primary shards on all nodes
elasticsearch_indices_fielddata_evictions counter 1 Evictions from field data
elasticsearch_indices_fielddata_memory_size_bytes gauge 1 Field data cache memory usage in bytes
elasticsearch_indices_filter_cache_evictions counter 1 Evictions from filter cache
elasticsearch_indices_filter_cache_memory_size_bytes gauge 1 Filter cache memory usage in bytes
elasticsearch_indices_flush_time_seconds counter 1 Cumulative flush time in seconds
elasticsearch_indices_flush_total counter 1 Total flushes
elasticsearch_indices_get_exists_time_seconds counter 1 Total time get exists in seconds
elasticsearch_indices_get_exists_total counter 1 Total get exists operations
elasticsearch_indices_get_missing_time_seconds counter 1 Total time of get missing in seconds
elasticsearch_indices_get_missing_total counter 1 Total get missing
elasticsearch_indices_get_time_seconds counter 1 Total get time in seconds
elasticsearch_indices_get_total counter 1 Total get
elasticsearch_indices_indexing_delete_time_seconds_total counter 1 Total time indexing delete in seconds
elasticsearch_indices_indexing_delete_total counter 1 Total indexing deletes
elasticsearch_indices_indexing_index_time_seconds_total counter 1 Cumulative index time in seconds
elasticsearch_indices_indexing_index_total counter 1 Total index calls
elasticsearch_indices_merges_docs_total counter 1 Cumulative docs merged
elasticsearch_indices_merges_total counter 1 Total merges
elasticsearch_indices_merges_total_size_bytes_total counter 1 Total merge size in bytes
elasticsearch_indices_merges_total_time_seconds_total counter 1 Total time spent merging in seconds
elasticsearch_indices_query_cache_cache_total counter 1 Count of query cache
elasticsearch_indices_query_cache_cache_size gauge 1 Size of query cache
elasticsearch_indices_query_cache_count counter 2 Count of query cache hit/miss
elasticsearch_indices_query_cache_evictions counter 1 Evictions from query cache
elasticsearch_indices_query_cache_memory_size_bytes gauge 1 Query cache memory usage in bytes
elasticsearch_indices_query_cache_total counter 1 Size of query cache total
elasticsearch_indices_refresh_time_seconds_total counter 1 Total time spent refreshing in seconds
elasticsearch_indices_refresh_total counter 1 Total refreshes
elasticsearch_indices_request_cache_count counter 2 Count of request cache hit/miss
elasticsearch_indices_request_cache_evictions counter 1 Evictions from request cache
elasticsearch_indices_request_cache_memory_size_bytes gauge 1 Request cache memory usage in bytes
elasticsearch_indices_search_fetch_time_seconds counter 1 Total search fetch time in seconds
elasticsearch_indices_search_fetch_total counter 1 Total number of fetches
elasticsearch_indices_search_query_time_seconds counter 1 Total search query time in seconds
elasticsearch_indices_search_query_total counter 1 Total number of queries
elasticsearch_indices_segments_count gauge 1 Count of index segments on this node
elasticsearch_indices_segments_memory_bytes gauge 1 Current memory size of segments in bytes
elasticsearch_indices_settings_stats_read_only_indices gauge 1 Count of indices that have read_only_allow_delete=true
elasticsearch_indices_shards_docs gauge 3 Count of documents on this shard
elasticsearch_indices_shards_docs_deleted gauge 3 Count of deleted documents on each shard
elasticsearch_indices_store_size_bytes gauge 1 Current size of stored index data in bytes
elasticsearch_indices_store_size_bytes_primary gauge Current size of stored index data in bytes with only primary shards on all nodes
elasticsearch_indices_store_size_bytes_total gauge Current size of stored index data in bytes with all shards on all nodes
elasticsearch_indices_store_throttle_time_seconds_total counter 1 Throttle time for index store in seconds
elasticsearch_indices_translog_operations counter 1 Total translog operations
elasticsearch_indices_translog_size_in_bytes counter 1 Total translog size in bytes
elasticsearch_indices_warmer_time_seconds_total counter 1 Total warmer time in seconds
elasticsearch_indices_warmer_total counter 1 Total warmer count
elasticsearch_jvm_gc_collection_seconds_count counter 2 Count of JVM GC runs
elasticsearch_jvm_gc_collection_seconds_sum counter 2 GC run time in seconds
elasticsearch_jvm_memory_committed_bytes gauge 2 JVM memory currently committed by area
elasticsearch_jvm_memory_max_bytes gauge 1 JVM memory max
elasticsearch_jvm_memory_used_bytes gauge 2 JVM memory currently used by area
elasticsearch_jvm_memory_pool_used_bytes gauge 3 JVM memory currently used by pool
elasticsearch_jvm_memory_pool_max_bytes counter 3 JVM memory max by pool
elasticsearch_jvm_memory_pool_peak_used_bytes counter 3 JVM memory peak used by pool
elasticsearch_jvm_memory_pool_peak_max_bytes counter 3 JVM memory peak max by pool
elasticsearch_os_cpu_percent gauge 1 Percent CPU used by the OS
elasticsearch_os_load1 gauge 1 Shortterm load average
elasticsearch_os_load5 gauge 1 Midterm load average
elasticsearch_os_load15 gauge 1 Longterm load average
elasticsearch_process_cpu_percent gauge 1 Percent CPU used by process
elasticsearch_process_cpu_time_seconds_sum counter 3 Process CPU time in seconds
elasticsearch_process_mem_resident_size_bytes gauge 1 Resident memory in use by process in bytes
elasticsearch_process_mem_share_size_bytes gauge 1 Shared memory in use by process in bytes
elasticsearch_process_mem_virtual_size_bytes gauge 1 Total virtual memory used in bytes
elasticsearch_process_open_files_count gauge 1 Open file descriptors
elasticsearch_snapshot_stats_number_of_snapshots gauge 1 Total number of snapshots
elasticsearch_snapshot_stats_oldest_snapshot_timestamp gauge 1 Oldest snapshot timestamp
elasticsearch_snapshot_stats_snapshot_start_time_timestamp gauge 1 Last snapshot start timestamp
elasticsearch_snapshot_stats_snapshot_end_time_timestamp gauge 1 Last snapshot end timestamp
elasticsearch_snapshot_stats_snapshot_number_of_failures gauge 1 Last snapshot number of failures
elasticsearch_snapshot_stats_snapshot_number_of_indices gauge 1 Last snapshot number of indices
elasticsearch_snapshot_stats_snapshot_failed_shards gauge 1 Last snapshot failed shards
elasticsearch_snapshot_stats_snapshot_successful_shards gauge 1 Last snapshot successful shards
elasticsearch_snapshot_stats_snapshot_total_shards gauge 1 Last snapshot total shard
elasticsearch_thread_pool_active_count gauge 14 Thread Pool threads active
elasticsearch_thread_pool_completed_count counter 14 Thread Pool operations completed
elasticsearch_thread_pool_largest_count gauge 14 Thread Pool largest threads count
elasticsearch_thread_pool_queue_count gauge 14 Thread Pool operations queued
elasticsearch_thread_pool_rejected_count counter 14 Thread Pool operations rejected
elasticsearch_thread_pool_threads_count gauge 14 Thread Pool current threads count
elasticsearch_transport_rx_packets_total counter 1 Count of packets received
elasticsearch_transport_rx_size_bytes_total counter 1 Total number of bytes received
elasticsearch_transport_tx_packets_total counter 1 Count of packets sent
elasticsearch_transport_tx_size_bytes_total counter 1 Total number of bytes sent
elasticsearch_clusterinfo_last_retrieval_success_ts gauge 1 Timestamp of the last successful cluster info retrieval
elasticsearch_clusterinfo_up gauge 1 Up metric for the cluster info collector
elasticsearch_clusterinfo_version_info gauge 6 Constant metric with ES version information as labels

Alerts & Recording Rules

We provide examples for Prometheus alerts and recording rules as well as an Grafana Dashboard and a Kubernetes Deployment.

The example dashboard needs the node_exporter installed. In order to select the nodes that belong to the ElasticSearch cluster, we rely on a label cluster. Depending on your setup, it can derived from the platform metadata:

For example on GCE

- source_labels: [__meta_gce_metadata_Cluster]
  separator: ;
  regex: (.*)
  target_label: cluster
  replacement: ${1}
  action: replace

Please refer to the Prometheus SD documentation to see which metadata labels can be used to create the cluster label.

Credit & License

elasticsearch_exporter is maintained by the nice folks from JustWatch and licensed under the terms of the Apache license.

This package was originally created and maintained by Eric Richardson, who transferred this repository to us in January 2017.

Maintainers of this repository:

Please refer to the Git commit log for a complete list of contributors.

Contributing

We welcome any contributions. Please fork the project on GitHub and open Pull Requests for any proposed changes.

Please note that we will not merge any changes that encourage insecure behaviour. If in doubt please open an Issue first to discuss your proposal.