pambrose/prometheus-proxy

First node works, second does not...


I have two nodes configured, using wmi_exporter for both. The first node is up and I can see all the metrics; everything looks great. But when I added the second node, for some reason it just will not come up, and I can't make much sense of it.

I am getting these errors in the proxy logs:

15:46:30.350 ERROR [ScrapeRequestManager.kt:46] - Missing ScrapeRequestWrapper for scrape_id: 91 [grpc-default-executor-1]
15:46:40.123 ERROR [ScrapeRequestManager.kt:46] - Missing ScrapeRequestWrapper for scrape_id: 93 [grpc-default-executor-1]
15:46:49.936 ERROR [ScrapeRequestManager.kt:46] - Missing ScrapeRequestWrapper for scrape_id: 95 [grpc-default-executor-1]
15:46:59.862 ERROR [ScrapeRequestManager.kt:46] - Missing ScrapeRequestWrapper for scrape_id: 97 [grpc-default-executor-1]
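
Reading that error, my guess (and it is only a guess at the mechanism, not the actual prometheus-proxy source) is that the proxy keeps a wrapper per outstanding scrape_id and logs this when a reply comes back for an id that has already been dropped, e.g. after a timeout. Something along these lines:

import java.util.concurrent.ConcurrentHashMap

// Only a sketch of the pattern I imagine, NOT the actual prometheus-proxy code:
// the proxy files an entry per scrape_id and expects the agent to reply
// before the entry is evicted.
class ScrapeBook(private val timeoutMillis: Long = 5_000) {
    private val outstanding = ConcurrentHashMap<Long, Long>()  // scrape_id -> created-at millis

    fun register(scrapeId: Long) {
        outstanding[scrapeId] = System.currentTimeMillis()
    }

    // Called when the agent's reply (possibly late) arrives.
    fun complete(scrapeId: Long) {
        if (outstanding.remove(scrapeId) == null) {
            // The shape of the log line in question: the id was already evicted
            // (timed out) or was never registered at all.
            System.err.println("Missing ScrapeRequestWrapper for scrape_id: $scrapeId")
        }
    }

    // Periodic sweep that drops entries older than the timeout.
    fun evictStale() {
        val now = System.currentTimeMillis()
        outstanding.entries.removeIf { now - it.value > timeoutMillis }
    }
}

fun main() {
    val book = ScrapeBook(timeoutMillis = 10)
    book.register(91)
    Thread.sleep(50)   // agent takes too long
    book.evictStale()  // proxy gives up on scrape_id 91
    book.complete(91)  // the late reply now triggers the "missing" error
}

If that reading is right, the error is a symptom of the agent never delivering the scrape result in time, rather than a problem in the proxy itself.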

Agent logs look good:

15:44:12.266 INFO  [GenericServiceListener.kt:30] - Running AdminService{port=8093, paths=[/ping, /version, /healthcheck, /threaddump]} [AdminService STARTING]
15:44:12.266 INFO  [GenericService.kt:136] - All Agent services healthy [AdminService STARTING]
15:44:12.573 INFO  [AgentGrpcService.kt:144] - Connected to proxy at [IP Removed]:50051 using plaintext [Agent Unnamed-prometheus-agent]
15:44:12.697 INFO  [AgentPathManager.kt:65] - Registered http://10.100.61.63:9182/metrics as /bgr-rds02_metrics [Agent Unnamed-prometheus-agent]
15:44:12.723 INFO  [AgentPathManager.kt:65] - Registered http://10.100.61.61:9182/metrics as /bgr-rds01_metrics [Agent Unnamed-prometheus-agent]
15:44:12.767 INFO  [Agent.kt:194] - Heartbeat scheduled to fire after 5.00s of inactivity [DefaultDispatcher-worker-1]

prometheus.yml

  - job_name: 'bgr-rds02'
    metrics_path: '/bgr-rds02_metrics'
    static_configs:
      - targets: ['prometheus-proxy:8080']

  - job_name: 'bgr-rds01'
    metrics_path: '/bgr-rds01_metrics'
    static_configs:
      - targets: ['prometheus-proxy:8080']
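
(A quick way to see what Prometheus sees, assuming the proxy is reachable under the same name and port as above, is to curl the two proxy paths directly:

curl http://prometheus-proxy:8080/bgr-rds01_metrics
curl http://prometheus-proxy:8080/bgr-rds02_metrics

If only one of them fails or times out, the problem is between the proxy and the agent for that path rather than in this scrape config.)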

prom-agent.conf

proxy {
  admin.enabled: true
  metrics.enabled: true
}

agent {
  proxy.hostname = ${HOSTNAME}
  admin.enabled: true
  metrics.enabled: true

  pathConfigs: [
    {
      name: "bgr-rds02"
      path: bgr-rds02_metrics
      url: "http://10.100.61.63:9182/metrics"
    }
    {
      name: "bgr-rds01"
      path: bgr-rds01_metrics
      url: "http://10.100.61.61:9182/metrics"
    }
  ]
}
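
(The agent's own health can also be checked via its admin port from the startup log above, e.g. curl http://localhost:8093/healthcheck run on the agent box itself; whether that tells us anything here is a guess on my part, since both paths are registered by the same agent and one of them works.)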

Hmmm. I am not sure what is going on with that.

Can you try a couple of experiments:

  • Alter the order of adding the nodes and see what happens.
  • Add a 3rd node (duplicating one of the 2) and see what happens with that.
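
For the duplicate-node experiment, the extra entry would just be another pathConfigs block pointing at the same exporter under a new path (the name and path below are made up), plus a matching job in prometheus.yml with metrics_path: '/bgr-rds02_dup_metrics':

    {
      name: "bgr-rds02-dup"
      path: bgr-rds02_dup_metrics
      url: "http://10.100.61.63:9182/metrics"
    }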

I just tried changing the order, adding another node entry for the same box, and changing the name. I can curl the metrics without any problem from the agent box.
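
(The curl check was just the exporter endpoints straight from the agent box, using the same URLs as in pathConfigs, e.g. curl http://10.100.61.61:9182/metrics.)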

I then added an entirely different node, and that one works. It is very odd that this one particular node seems to be having issues. I am going to try adding a few more nodes tomorrow and will let you know how it goes. It could just be an odd anomaly with that particular node.

Interesting. If you can hit it with curl, the agent should be able to hit it as well. Even if it is something on your end, I should be producing a better error message than that. Please let me know what you see tomorrow.