albertodonato/query-exporter

Error: error while attempting to bind on address ('::1', 9560, 0, 0): cannot assign requested address

connorourke opened this issue · 9 comments

Describe the bug

I'm running query-exporter within a Docker container. When I try to start it with the example config.yaml I get the following error:

unhandled exception during asyncio.run() shutdown
task: <Task finished name='Task-1' coro=<_run_app() done, defined at /query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py:289> exception=OSError(99, "error while attempting to bind on address ('::1', 9560, 0, 0): cannot assign requested address")>
Traceback (most recent call last):
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py", line 516, in run_app
    loop.run_until_complete(main_task)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py", line 415, in _run_app
    await site.start()
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web_runner.py", line 121, in start
    self._server = await loop.create_server(
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1519, in create_server
    raise OSError(err.errno, 'error while attempting '
OSError: [Errno 99] error while attempting to bind on address ('::1', 9560, 0, 0): cannot assign requested address
Traceback (most recent call last):
  File "/query_exporter/.venv/bin/query-exporter", line 8, in <module>
    sys.exit(script())
  File "/query_exporter/.venv/lib/python3.10/site-packages/toolrack/script.py", line 110, in __call__
    return self.main(parsed_args) or 0
  File "/query_exporter/.venv/lib/python3.10/site-packages/prometheus_aioexporter/script.py", line 143, in main
    exporter.run()
  File "/query_exporter/.venv/lib/python3.10/site-packages/prometheus_aioexporter/web.py", line 71, in run
    run_app(
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py", line 516, in run_app
    loop.run_until_complete(main_task)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py", line 415, in _run_app
    await site.start()
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web_runner.py", line 121, in start
    self._server = await loop.create_server(
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1519, in create_server
    raise OSError(err.errno, 'error while attempting '
OSError: [Errno 99] error while attempting to bind on address ('::1', 9560, 0, 0): cannot assign requested address

Both sqlalchemy_aio worker threads (Thread-1 and Thread-2) then fail with the same secondary error during shutdown; their output was interleaved with the traceback above:

Exception in thread Thread-1 (thread_fn):
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/query_exporter/.venv/lib/python3.10/site-packages/sqlalchemy_aio/asyncio.py", line 53, in thread_fn
    self._loop.call_soon_threadsafe(request.set_finished)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 798, in call_soon_threadsafe
    self._check_closed()
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
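For context on the error itself: '::1' is the IPv6 loopback address, and the bind fails when the container has no such address. A minimal sketch (my own reproduction attempt, not query-exporter code) that should fail the same way inside the container:

# Try to bind the IPv6 loopback directly; in a container without IPv6
# this should raise the same "[Errno 99] Cannot assign requested address".
from socket import AF_INET6, SOCK_STREAM, socket

sock = socket(AF_INET6, SOCK_STREAM)
try:
    sock.bind(("::1", 9560))
except OSError as exc:
    print(exc)
finally:
    sock.close()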

Port 9560 is exposed; I am running query-exporter in the slurmctld container. My compose file:

version: "3.3"

services:
  prometheus:
    container_name: prometheus
    image: prom/prometheus
    restart: always
    volumes:
      - ./prometheus:/etc/prometheus/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - 9090:9090
    networks:
      - prom_app_net

  grafana:
    container_name: grafana
    image: grafana/grafana
    user: '472'
    restart: always
    environment:
      GF_INSTALL_PLUGINS: 'grafana-clock-panel,grafana-simple-json-datasource'
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning/:/etc/grafana/provisioning/
      - './grafana/grafana.ini:/etc/grafana/grafana.ini'
    env_file:
      - ./grafana/.env_grafana
    ports:
      - 3000:3000
    depends_on:
      - prometheus
    networks:
      - prom_app_net


  mysql:
    image: mariadb:10.10
    hostname: mysql
    container_name: mysql
    environment:
      MYSQL_RANDOM_ROOT_PASSWORD: "yes"
      MYSQL_DATABASE: slurm_acct_db
      MYSQL_USER: slurm
      MYSQL_PASSWORD: password
    volumes:
      - var_lib_mysql:/var/lib/mysql

  slurmdbd:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    build:
      context: .
      args:
        SLURM_TAG: ${SLURM_TAG:-slurm-21-08-6-1}
    command: ["slurmdbd"]
    container_name: slurmdbd
    hostname: slurmdbd
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6819"
    ports:
      - "6819:6819"
    depends_on:
      - mysql
    privileged: true
    cgroup: host

  slurmctld:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmctld"] 
    container_name: slurmctld
    hostname: slurmctld
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - etc_prometheus:/etc/prometheus
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    expose:
      - "6817"
      - "8080"
      - "8081"
      - "9560"
    ports:
      - 8080:8080
      - 8081:8081
      - 9560:9560
    depends_on:
      - "slurmdbd"
    privileged: true
    cgroup: host

  c1:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmd"]
    hostname: c1
    container_name: c1
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6818"
    depends_on:
      - "slurmctld"
    privileged: true
    cgroup: host  
    

  c2:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmd"]
    hostname: c2
    container_name: c2
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6818"
      - "22"
    depends_on:
      - "slurmctld"
    privileged: true
    cgroup: host




volumes:
  etc_munge:
  etc_slurm:
  slurm_jobdir:
  var_lib_mysql:
  var_log_slurm:
  grafana_data:
  prometheus_data:
  cgroups: 
  etc_prometheus:

networks:
  prom_app_net:
    driver: bridge

Installation details

  • operating system: Rocky Linux 8 container
  • query-exporter installation type:
    • pip:
Package                   Version
------------------------- -----------
aiohttp                   3.8.4
aiosignal                 1.3.1
argcomplete               3.1.1
async-timeout             4.0.2
attrs                     23.1.0
charset-normalizer        3.1.0
croniter                  1.4.1
exceptiongroup            1.1.2
frozenlist                1.3.3
idna                      3.4
iniconfig                 2.0.0
jsonschema                4.18.0
jsonschema-specifications 2023.6.1
multidict                 6.0.4
outcome                   1.2.0
packaging                 23.1
pip                       23.0.1
pluggy                    1.2.0
prometheus-aioexporter    1.7.0
prometheus-client         0.17.0
pytest                    7.4.0
python-dateutil           2.8.2
PyYAML                    6.0
query-exporter            2.8.3
referencing               0.29.1
Represent                 1.6.0.post0
rpds-py                   0.8.8
setuptools                65.5.0
six                       1.16.0
SQLAlchemy                1.3.24
sqlalchemy-aio            0.17.0
tomli                     2.0.1
toolrack                  4.0.0
yarl                      1.9.2
  • docker image: the Dockerfile is based on eniocarboni/docker-rockylinux-systemd:8, with Python 3.10 installed in the image:
   RUN set -ex \
    && yum makecache \
    && yum -y update \
    && yum -y install dnf-plugins-core \
    && yum config-manager --set-enabled powertools \
    && yum -y install \
       wget \
       bzip2 \
       perl \
       gcc \
       gcc-c++ \
       git \
       gnupg \
       make \
       munge \
       munge-devel \
       python3-devel \
       python3-pip \
       python3 \
       libffi-devel \
       sqlite-devel \
       mariadb-server \
       mariadb-devel \
       psmisc \
       bash-completion \
       vim-enhanced \
       http-parser-devel \
       json-c-devel \
       golang \
    && yum clean all \
    && rm -rf /var/cache/yum

RUN wget https://www.python.org/ftp/python/3.10.12/Python-3.10.12.tgz \
    && tar -xzvf Python-3.10.12.tgz \
    && pushd Python-3.10.12 \
    && ./configure --enable-optimizations --prefix=/usr/local \
    && make \
    && make altinstall

To Reproduce

If possible, please provide detailed steps to reproduce the behavior:

  1. Config file content:
databases:
  db1:
    dsn: sqlite://
    connect-sql:
      - PRAGMA application_id = 123
      - PRAGMA auto_vacuum = 1
    labels:
      region: us1
      app: app1
  db2:
    dsn: sqlite://
    keep-connected: false
    labels:
      region: us2
      app: app1

metrics:
  metric1:
    type: gauge
    description: A sample gauge
  metric2:
    type: summary
    description: A sample summary
    labels: [l1, l2]
    expiration: 24h
  metric3:
    type: histogram
    description: A sample histogram
    buckets: [10, 20, 50, 100, 1000]
  metric4:
    type: enum
    description: A sample enum
    states: [foo, bar, baz]

queries:
  query1:
    interval: 5
    databases: [db1]
    metrics: [metric1]
    sql: SELECT random() / 1000000000000000 AS metric1
  query2:
    interval: 20
    timeout: 0.5
    databases: [db1, db2]
    metrics: [metric2, metric3]
    sql: |
      SELECT abs(random() / 1000000000000000) AS metric2,
             abs(random() / 10000000000000000) AS metric3,
             "value1" AS l1,
             "value2" AS l2
  query3:
    schedule: "*/5 * * * *"
    databases: [db2]
    metrics: [metric4]
    sql: |
      SELECT value FROM (
        SELECT "foo" AS metric4 UNION
        SELECT "bar" AS metric3 UNION
        SELECT "baz" AS metric4
      )
      ORDER BY random()
      LIMIT 1
  2. Ran query-exporter with the following command line:
    query-exporter config.yaml

  3. Got the error when running the command above.

If I expose and pass through a different port with query-exporter -p <port_no_here> config.yaml, I get the same thing.

The port is definitely exposed, though. Running the following test server in the container:

from aiohttp import web


async def handle(request):
    name = request.match_info.get('name', "World!")
    text = "Hello, " + name
    print('received request, replying with "{}".'.format(text))
    return web.Response(text=text)


app = web.Application()
app.router.add_get('/', handle)
app.router.add_get('/{name}', handle)

web.run_app(app, port=8082)

Works fine and serves up the hello world page.
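Note that aiohttp's web.run_app binds all interfaces when no host argument is given, so this test never touches ::1. Presumably pointing the same app at the IPv6 loopback would reproduce the failure (a sketch, not something I have from the original report):

# Reusing the app object from the test server above; expected to raise
# OSError(99) in a container that has no IPv6 loopback.
web.run_app(app, host="::1", port=8082)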

What's the exact command line being used for query-exporter? It seems it's trying to bind IPv6, which is possibly not enabled in Docker.

query-exporter config.yaml -p 8082

(when I am using port 8082 rather than 9560; the error is the same either way)

By default, that will try to bind IPv6 as well, but that's likely not enabled in Docker.

For this reason the Docker image only binds IPv4:

ENTRYPOINT ["query-exporter", "/config.yaml", "-H", "0.0.0.0"]
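A pip-based install should be able to mirror that entrypoint by passing the host flag explicitly on the command line (a sketch; the config path here is assumed):

query-exporter /path/to/config.yaml -H 0.0.0.0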

OK, thanks. I'm not using the query-exporter image; I'm installing query-exporter in the container with pip. Is there a way to enable it to bind IPv6 then?

I have enabled IPv6, following the instructions, by putting:

{
  "experimental": true,
  "ip6tables": true
}

into my docker/daemon.json and restarting Docker, but the exporter still doesn't work. My compose file with the IPv6 network looks like:

version: "3.3"

services:
  prometheus:
    container_name: prometheus
    image: prom/prometheus
    restart: always
    volumes:
      - ./prometheus:/etc/prometheus/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - 9090:9090
    networks:
      - prom_app_net
    

  grafana:
    container_name: grafana
    image: grafana/grafana
    user: '472'
    restart: always
    environment:
      GF_INSTALL_PLUGINS: 'grafana-clock-panel,grafana-simple-json-datasource'
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning/:/etc/grafana/provisioning/
      - './grafana/grafana.ini:/etc/grafana/grafana.ini'
    env_file:
      - ./grafana/.env_grafana
    ports:
      - 3000:3000
    depends_on:
      - prometheus
    networks:
      - prom_app_net


  mysql:
    image: mariadb:10.10
    hostname: mysql
    container_name: mysql
    environment:
      MYSQL_RANDOM_ROOT_PASSWORD: "yes"
      MYSQL_DATABASE: slurm_acct_db
      MYSQL_USER: slurm
      MYSQL_PASSWORD: password
    volumes:
      - var_lib_mysql:/var/lib/mysql
    networks:
      - slurm
#    network_mode: host


  slurmdbd:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    build:
      context: .
      args:
        SLURM_TAG: ${SLURM_TAG:-slurm-21-08-6-1}
    command: ["slurmdbd"]
    container_name: slurmdbd
    hostname: slurmdbd
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6819"
    ports:
      - "6819:6819"
    depends_on:
      - mysql
    privileged: true
    cgroup: host
    networks:
      - slurm
    #network_mode: host

  slurmctld:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmctld"] 
    container_name: slurmctld
    hostname: slurmctld
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - etc_prometheus:/etc/prometheus
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    expose:
      - "6817"
      - "8080"
      - "8081"
      - "8082/tcp"
    ports:
      - 8080:8080
      - 8081:8081
      - 8082:8082/tcp
    depends_on:
      - "slurmdbd"
    privileged: true
    cgroup: host

    #network_mode: host
    networks:
      - slurm

  c1:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmd"]
    hostname: c1
    container_name: c1
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6818"
    depends_on:
      - "slurmctld"
    privileged: true
    cgroup: host 
    #network_mode: host
    networks:
      - slurm
    

  c2:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmd"]
    hostname: c2
    container_name: c2
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6818"
      - "22"
    depends_on:
      - "slurmctld"
    privileged: true
    cgroup: host
    networks:
      - slurm
    #network_mode: host




volumes:
  etc_munge:
  etc_slurm:
  slurm_jobdir:
  var_lib_mysql:
  var_log_slurm:
  grafana_data:
  prometheus_data:
  cgroups: 
  etc_prometheus:

networks:
  prom_app_net:
  slurm:
    enable_ipv6: true
    ipam:
      config: 
        - subnet: 2001:0DB8::/112
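As a sanity check (a suggestion, not something from the original thread), it is worth confirming that the container actually received an IPv6 loopback after this change:

docker exec slurmctld ip -6 addr show dev lo

If no "inet6 ::1/128" line appears, binding ::1 will keep failing regardless of the compose network settings.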

When I run query-exporter as query-exporter config.yaml -p 8082 with the following config file:

databases:
  db1:
    dsn: sqlite:////test.db
    connect-sql:
      - PRAGMA application_id = 123
      - PRAGMA auto_vacuum = 1
    labels:
      region: us1
      app: app1


metrics:
  metric1:
    type: gauge
    description: A sample gauge


queries:
  query1:
    interval: 5
    databases: [db1]
    metrics: [metric1]
    sql: SELECT random() / 1000000000000000 AS metric1

It doesn't work:

[Screenshot 2023-07-10 at 10:19:47]

But if I run the following simple exporter on the same port:

from prometheus_client import start_http_server, Summary
import random
import time

# Create a metric to track time spent and requests made.
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

# Decorate function with metric.
@REQUEST_TIME.time()
def process_request(t):
    """A dummy function that takes some time."""
    time.sleep(t)

if __name__ == '__main__':
    # Start up the server to expose the metrics.
    start_http_server(8082)
    # Generate some requests.
    while True:
        process_request(random.random())

It works fine:

[Screenshot 2023-07-10 at 10:22:55]
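This outcome is consistent with a bind-address difference rather than the IPv6 network setup: prometheus_client's start_http_server listens on all IPv4 interfaces by default (its addr parameter defaults to "0.0.0.0"), so it is reachable through the published port:

# Equivalent to start_http_server(8082): the default addr is "0.0.0.0".
start_http_server(8082, addr="0.0.0.0")

query-exporter, by contrast, appears to bind localhost by default, and a loopback-only listener is unreachable through a Docker published port even when the bind succeeds.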

Is this a bug, or is there some other step that I am missing?

Thanks!

Ah, OK. If I start it with query-exporter config.yaml -p 8082 -H 0.0.0.0, it works.
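For the record, -H 0.0.0.0 makes the exporter listen on all IPv4 interfaces, which is what lets the 8082:8082 port mapping reach it. It should then be scrapeable from the host, e.g. (assuming the default /metrics path):

curl http://localhost:8082/metrics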