/prometheus-nginxlog-exporter

Export metrics from Nginx access log files to Prometheus

Primary LanguageGoApache License 2.0Apache-2.0

NGINX-to-Prometheus log file exporter

Usage

You can either use a simple configuration, using command-line flags, or create a configuration file with a more advanced configuration.

Use the command-line:

$ ./prometheus-nginxlog-exporter \
  -format="<FORMAT>" \
  -listen-port=4040 \
  -namespace=nginx \
  [PATHS-TO-LOGFILES...]

Use the configuration file:

$ ./prometheus-nginxlog-exporter -config-file /path/to/config.hcl

You can verify your config file before deployment, which will exit with shell status indicating success:

$ ./prometheus-nginxlog-exporter -config-file /path/to/config.hcl -verify-config

Installation

There are multiple ways to install this exporter.

Docker

Docker images for this exporter are available at the quay.io and pkg.github.com registries:

  • quay.io/martinhelmich/prometheus-nginxlog-exporter:v1

  • docker.pkg.github.com/martin-helmich/prometheus-nginxlog-exporter/exporter:v1

Have a look at the releases page to see the available versions and how to pull their images. In general, I would recommend using the v1 tag instead of latest.

Run the exporter as follows (adjust paths like /path/to/logs and /path/to/config to your own needs):

$ docker run \
    --name nginx-exporter \
    -v /path/to/logs:/mnt/nginxlogs \
    -p 4040:4040 \
    quay.io/martinhelmich/prometheus-nginxlog-exporter \
    mnt/nginxlogs/access.log

Command-line flags and arguments can simply be appended to the docker run command, for example to use a configuration file:

$ docker run \
    --name nginx-exporter \
    -p 4040:4040 \
    -v /path/to/logs:/mnt/nginxlogs \
    -v /path/to/config.hcl:/etc/prometheus-nginxlog-exporter.hcl \
    quay.io/martinhelmich/prometheus-nginxlog-exporter \
    -config-file /etc/prometheus-nginxlog-exporter.hcl

DEB and RPM packages

Each release from 1.5.1 or newer provides both DEB and RPM packages.

$ wget https://github.com/martin-helmich/prometheus-nginxlog-exporter/releases/download/v1.8.0/prometheus-nginxlog-exporter_1.8.0_linux_amd64.deb
$ apt install ./prometheus-nginxlog-exporter_1.8.0_linux_amd64.deb

The package come with a dependency on systemd and configure the exporter to be running automatically:

$ systemctl status prometheus-nginxlog-exporter
$ # systemctl disable prometheus-nginxlog-exporter
$ # systemctl enable prometheus-nginxlog-exporter

The packages drop a configuration file to /etc/prometheus-nginxlog-exporter.hcl which you can adjust to your own needs.

Manual installation, with systemd

If you do not want to use one of the pre-built packages, you can download the binary itself and manually configure systemd to start it. You can find an example unit file for this service in this repository. Simply copy the unit file to /etc/systemd/system:

$ wget -O /etc/systemd/system/prometheus-nginxlog-exporter.service https://raw.githubusercontent.com/martin-helmich/prometheus-nginxlog-exporter/master/res/package/prometheus-nginxlog-exporter.service
$ systemctl enable prometheus-nginxlog-exporter
$ systemctl start prometheus-nginxlog-exporter

The shipped unit file expects the binary to be located in /usr/sbin/prometheus-nginxlog-exporter (if you sideload the exporter without using your package manager, you might want to put it to /usr/local, instead) and the configuration file in /etc/prometheus-nginxlog-exporter.hcl. Adjust to your own needs.

Kubernetes

If you run a logfile-generating service (be it NGINX, or anything that generates similar access log files) in Kubernetes, you can run the exporter as a sidecar along your "main" container within the same pod.

The following example shows you how to deploy the exporter as a sidecar, accepting logs from the main container via syslog:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-example
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "4040"
spec:
  containers:
    - name: web
      image: nginx
      # ...
    - name: exporter
      image: docker.pkg.github.com/martin-helmich/prometheus-nginxlog-exporter/exporter:v1
      args: ["-config-file", "/etc/prometheus-nginxlog-exporter/config.hcl"]
      volumeMounts:
      - name: exporter-config
        mountPath: /etc/prometheus-nginxlog-exporter
  volumes:
    - name: exporter-config
      configMap:
        name: exporter-config

In this example, the configuration file is passed via the exporter-config ConfigMap. This might look like follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: exporter-config
data:
  config.hcl: |
    listen {
      port = 4040
    }

    namespace "nginx" {
      source = {
        syslog {
          listen_address = "udp://127.0.0.1:5531"
          format = "rfc3164"
        }
      }

      format = "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\""

      labels {
        app = "default"
      }
    }

The config file instructs the exporter to accept log input via syslog. To forward logs to the exporter, just instruct your main container to send its access logs via syslog to 127.0.0.1:5531 (which works, since the main container and the sidecar share their network namespace).

Build from source

To build the exporter from source, simply build it with go get:

$ go get github.com/martin-helmich/prometheus-nginxlog-exporter

Alternatively, clone this repository and just run go build:

$ git clone git://github.com/martin-helmich/prometheus-nginxlog-exporter
$ cd prometheus-nginxlog-exporter
$ go build

Collected metrics

This exporter collects the following metrics. This collector can listen on multiple log files at once and publish metrics in different namespaces. Each metric uses the labels method (containing the HTTP request method) and status (containing the HTTP status code).

Keep in mind that some of these metrics will require certain values to be present in your access log format (for example, the http_upstream_time_seconds metric will require your access to contain the variable $upstream_response_time.

Metrics are exported at the /metrics path.

These metrics are exported:

<namespace>_http_response_count_total

The total amount of processed HTTP requests/responses.

<namespace>_http_response_size_bytes

The total amount of transferred content in bytes.

<namespace>_http_request_size_bytes

The total amount of received traffic in bytes. This metrics requires the $request_length variable in the log format.

<namespace>_http_upstream_time_seconds

A summary vector of the upstream response times in seconds. Logging these needs to be specifically enabled in NGINX using the $upstream_response_time variable in the log format.

<namespace>_http_upstream_time_seconds_hist

Same as <namespace>_http_upstream_time_seconds, but as a histogram vector. Also requires the $upstream_response_time variable in the log format.

<namespace>_http_response_time_seconds

A summary vector of the total response times in seconds. Logging these needs to be specifically enabled in NGINX using the $request_time variable in the log format.

<namespace>_http_response_time_seconds_hist

Same as <namespace>_http_response_time_seconds, but as a histogram vector. Also requires the $request_time variable in the log format.

Additional labels can be configured in the configuration file (see below).

<namespace> can be omitted or overridden - see [Namespace-as-labels] for more information.

Configuration file

You can specify a configuration file to read at startup. The configuration file is expected to be either in HCL or YAML format. Here’s an example file:

listen {
  port = 4040
  address = "10.1.2.3"
  metrics_endpoint = "/metrics"
}

consul {
  enable = true
  address = "localhost:8500"
  datacenter = "dc1"
  scheme = "http"
  token = ""
  service {
    id = "nginx-exporter"
    name = "nginx-exporter"
    address = "192.168.3.1"
    tags = ["foo", "bar"]
  }
}

namespace "app1" {
  format = "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\""
  source {
    files = [
      "/var/log/nginx/app1/access.log"
    ]
  }

  # log can be printed to std out, e.g. for debugging purposes (disabled by default)
  print_log = false

  # metrics_override = { prefix = "myprefix" }
  # namespace_label = "vhost"

  labels {
    app = "application-one"
    environment = "production"
    foo = "bar"
  }

  histogram_buckets = [.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10]
}

namespace "app2" {
  format = "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\" $upstream_response_time"
  source {
    files = [
      "/var/log/nginx/app2/access.log"
    ]
  }
}

The same file as YAML file:

listen:
  port: 4040
  address: "10.1.2.3"
  metrics_endpoint: "/metrics"

consul:
  enable: true
  address: "localhost:8500"
  datacenter: dc1
  scheme: http
  token: ""
  service:
    id: "nginx-exporter"
    name: "nginx-exporter"
    address = "192.168.3.1"
    tags: ["foo", "bar"]

namespaces:
  - name: app1
    format: "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\""
    source:
      files:
        - /var/log/nginx/app1/access.log
    # metrics_override:
    #   prefix: "myprefix"
    # namespace_label: "vhost"
    labels:
      app: "application-one"
      environment: "production"
      foo: "bar"
    histogram_buckets: [.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10]
  - name: app2
    format: "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\" $upstream_response_time"
    source:
      files:
        - /var/log/nginx/app2/access.log

Advanced features

Namespace as labels

For historic reasons, this exporter exports separate metrics for different namespaces (because the namespace is part of the metric name). However, in many (most) cases, it’s more convenient to have the same metric name across different namespaces (with different log formats and names).

This can be done in two steps:

  1. Override Prometheus metrics namespace to some common prefix (metrics_override)

  2. Set label name for nginxlog-exporter’s config namespace (namespace_label)

namespace "app1" {
  ...
  metrics_override = { prefix = "myprefix" }
  namespace_label = "vhost"
  ...
}

namespace "app2" {
  ...
  metrics_override = { prefix = "myprefix" }
  namespace_label = "vhost"
  ...
}

Exported metrics will have the following format:

myprefix_http_response_count_total{vhost="app1", ...}
myprefix_http_response_count_total{vhost="app2", ...}
...
  • prefix can be set to "", resulting metrics like http_response_count_total{…​}

  • namespace_label can be omitted - so you have full control on metric format

Some details and history on this can be found in issue #13.

Custom labels pass-through

Partial case of [Dynamic-re-labeling]:

namespace "app1" {
  format = "$remote_addr - $remote_user [$time_local] ... \"$geoip_country_code\" $upstream_addr"
  ...
  relabel "upstream_addr" { from = "upstream_addr" }
  relabel "country" { from = "geoip_country_code" }
  ...
}

Exported metrics will have upstream_addr and country labels.

Log sources

Currently, the exporter supports reading log data from

  1. files

  2. syslog

All log sources can be configured on a per-namespace basis using the source property.

Reading from files

When reading from log files, all that is needed is a files property:

namespace "test" {
  source {
    files = ["/var/log/nginx/access.log"]
    // ...
  }
}

Reading from syslog

The exporter can also open and listen on a Syslog port and read logs from there. Configuration works as follows:

namespace "test" {
  source {
    syslog {
      listen_address = "udp://127.0.0.1:8514" (1)
      format = "rfc3164" (2)
      tags = ["nginx"]
    }

    // ...
  }
}
  1. The listen_address might be either a TCP or UDP address. UNIX sockets are not supported (yet — pull requests are welcome)

  2. The format may be one of rfc3164, rfc5424, rfc6587 or auto. If omitted, it will default to auto.

Have a look at the respective section of the NGINX documentation on how to set up NGINX to log into syslog.

Dynamic re-labeling

Re-labeling lets you add arbitrary fields from the parsed log line as labels to your metrics. To add a dynamic label, add a relabel statement to your configuration file:

namespace "app-1" {
  // ...

  relabel "host" {
    from = "server_name"
    whitelist = [ (1)
      "host-a.com",
      "host-b.de"
    ]
  }
}
  1. The whitelist property is optional; if set, only the supplied values will be added as label. All other values will be subsumed under the "other" label value. See #16 for a more detailed discussion around the reasoning.

Dynamic relabeling also allows you to aggregate your metrics by request path (which replaces the experimental feature originally introduced in #23). The following example splits the content of the request variable at every space (using split) and return the second element (index 1) of the resulting list which is the base for the regex):

namespace "app1" {
  // ...

  relabel "request_uri" {
    from = "request"
    split = 2
    separator = " " // (1)

    // if enabled, only include label in response count metric (default is false)
    only_counter = false

    match "^/users/[0-9]+" {
      replacement = "/users/:id"
    }

    match "^/profile" {
      replacement = "/profile"
    }
  }
}
  1. The separator property is optional; if omitted, the space character (" ") will be assumed as separator.

If a match is found, the replacement replaces each occurrence of the corresponding match in the original value. Otherwise the processing continues to check the following match statements.

The YAML configuration for relabelings works similar to the HCL configuration:

namespaces:
- name: app1
  relabel_configs:
  - target_label: request_uri
    from: request
    split: 2
    separator: ' '
    matches:
    - regexp: "^/users/[0-9]+"
      replacement: "/users/:id"

If your regular expression contains groups, you can also use the matched values of those in the replacement value:

relabel "request_uri" {
  from = "request"
  split = 2

  match "^/(users|profiles)/[0-9]+" {
    replacement = "/$1/:id"
  }
}

File Globs

You can specify one or more wildcards in the source file names, in which case the wildcards will be resolved to the corresponding list of files at startup of the exporter.

Be aware that the list of matches is only evaluated at the start of the program. If a new file is added with a match of one glob filter, you’ll have to restart the program for it to be monitored.

Given a config like this:

---
namespace "test" {
  source {
    files = ["/var/log/nginx/*_access.log"]
    // ...
  }
}
---

And a folder containing these files:

# /var/log/nginx
main_access.log
main_error.log
virtualhost1_access.log
virtualhost1_error.log

The list of files monitored by this namespace will be /var/log/nginx/main_access.log,/var/log/nginx/virtualhost1_access.log.

JSON log_format

You can use the JSON parser by setting the --parser command line flag or parser config file property to json.

Frequently Asked Questions

I have started the exporter, but it is not exporting any application-specific metrics!

This may have several issues:

  1. Make sure that the access log files that your exporter is listening on are present. The exporter will exit with an error code if a file is present but cannot be opened (for example, due to bad permissions), but will wait for a file if it does not yet exist.

  2. Make sure that the exporter can parse the lines from your access log files. Pay attention to the <namespace>_parse_errors_total metric, which will indicate how many log lines could not be parsed.

The exporter exports the <namespace>_http_response_count_total metric, but not [other metric that is mentioned in the README]!

Most metrics require certain values to be present in the access log files that are not present in the NGINX default configuration. Especially, make sure that the access log contains the $upstream_response_time, $request_time and/or $body_bytes_sent variables. These need to be enabled in the NGINX configuration (more precisely, the log_format setting) and then added to the format specified for the exporter.

How can I configure NGINX to export these variables?

Have a look at NGINX’s Logging and Monitoring guide. It contains some good examples that contain the $request_time and $upstream_response_time:

log_format upstream_time '$remote_addr - $remote_user [$time_local] '
                         '"$request" $status $body_bytes_sent '
                         '"$http_referer" "$http_user_agent"'
                         'rt=$request_time uct="$upstream_connect_time" uht="$upstream_header_time" urt="$upstream_response_time"';

Credits