influxdata/helm-charts

Problem with enterprise chart: error parsing index

korcky opened this issue · 2 comments

I'm trying to deploy a rather simple enterprise cluster for test purposes using the influxdb-enterprise chart with 1 meta node and 3 data nodes.

The deployment itself goes smoothly (influxd-ctl -bind-tls -k show shows a running cluster), but any API request to the meta node (a GET on /health, or simple queries such as SHOW STATS, SHOW DIAGNOSTICS, SHOW DATABASES) ends with a 400 Bad Request response and the message: error parsing index. The only exception is the /ping request, which returns 200.

The main problem is that I can't find any description of this error that could lead me to the source of the problem. Am I doing something wrong, or am I missing something entirely?

values.yaml

license:
  key: not-real-key

meta:
  replicas: 1
  persistence:
    enabled: true
    storageClass: gp2
    accessMode: ReadWriteOnce
    size: 32Gi

data:
  replicas: 3
  persistence:
    enabled: true
    storageClass: gp2
    accessMode: ReadWriteOnce
    size: 256Gi

meta node logs

ts=2022-06-23T12:58:33.816745Z lvl=info msg="InfluxDB Meta starting" log_id=0bGA7sr0000 version=1.9.6-c1.9.6 branch=1.9 commit=facf255c969710c2973276b7116e4e84418cd986 tags=unknown
ts=2022-06-23T12:58:33.816781Z lvl=info msg="Go runtime" log_id=0bGA7sr0000 version=go1.17.2 maxprocs=2
ts=2022-06-23T12:58:33.816809Z lvl=info msg="Loading configuration file" log_id=0bGA7sr0000 path=/etc/influxdb/influxdb-meta.conf

8888888           .d888 888                   888b     d888
  888            d88P"  888                   8888b   d8888
  888            888    888                   88888b.d88888
  888   88888b.  888888 888 888  888 888  888 888Y88888P888
  888   888 "88b 888    888 888  888  Y8bd8P' 888 Y888P 888
  888   888  888 888    888 888  888   X88K   888  Y8P  888
  888   888  888 888    888 Y88b 888 .d8""8b. 888   "   888
8888888 888  888 888    888  "Y88888 888  888 888       888

ts=2022-06-23T12:58:33.821644Z lvl=info msg="Meta client error" log_id=0bGA7srG002 error="Post \"https://localhost:8091/join\": dial tcp 127.0.0.1:8091: connect: connection refused" retrying_after=20.000ms
ts=2022-06-23T12:58:33.839281Z lvl=info msg="Password hashing configuration: bcrypt;cost=10" log_id=0bGA7srG000 service=metastore
ts=2022-06-23T12:58:33.839370Z lvl=info msg="Password hashing is FIPS-ready: false" log_id=0bGA7srG000 service=metastore
ts=2022-06-23T12:58:33.845488Z lvl=info msg="Listening on TCP: [::]:8089" log_id=0bGA7srG000 service=metastore
ts=2022-06-23T12:58:33.845532Z lvl=info msg="Starting meta service" log_id=0bGA7srG000 service=meta
ts=2022-06-23T12:58:33.845769Z lvl=info msg="Listening on HTTP" log_id=0bGA7srG000 service=meta addr=[::]:8091 https=true
ts=2022-06-23T12:58:33.856733Z lvl=info msg="Using data dir" log_id=0bGA7srG000 service=meta path=/var/lib/influxdb/meta
ts=2022-06-23T12:58:33.857309Z lvl=info msg="Fetching and validating license" log_id=0bGA7srG000 service=licensing license=af7a14a1-e1b8-4a3f-b2c9-be51e6b0e4a5
ts=2022-06-23T12:58:33.875819Z lvl=info msg="Node at influxdb-test-influxdb-enterprise-meta-0.influxdb-test-influxdb-enterprise-meta.telemetry.svc.cluster.local:8089 [Follower]" log_id=0bGA7srG000 service=meta
ts=2022-06-23T12:58:33.898616Z lvl=info msg="No leader, so restarting raft state to run single node bootstrap" log_id=0bGA7srG000 service=meta
ts=2022-06-23T12:58:33.903627Z lvl=info msg="Node at influxdb-test-influxdb-enterprise-meta-0.influxdb-test-influxdb-enterprise-meta.telemetry.svc.cluster.local:8089 [Follower]" log_id=0bGA7srG000 service=meta
ts=2022-06-23T12:58:34.211141Z lvl=info msg="Saving license locally" log_id=0bGA7srG000 service=licensing path=/tmp/influx-enterprise.key.json
ts=2022-06-23T12:58:35.292408Z lvl=info msg="Node at influxdb-test-influxdb-enterprise-meta-0.influxdb-test-influxdb-enterprise-meta.telemetry.svc.cluster.local:8089 [Leader]" log_id=0bGA7srG000 service=meta peers=
Added meta node 1 at influxdb-test-influxdb-enterprise-meta-0.influxdb-test-influxdb-enterprise-meta.telemetry.svc.cluster.local:8091
ts=2022-06-23T12:58:35.903544Z lvl=info msg="Sending anonymous usage statistics to https://usage.influxdata.com" log_id=0bGA7srG000 service=metastore
ts=2022-06-23T12:58:35.903976Z lvl=info msg="Listening for signals" log_id=0bGA7srG000
2022/06/23 13:01:00 [INFO] snapshot: Creating new snapshot at /var/lib/influxdb/meta/snapshots/1-17-1655989260521.tmp
[tcp] 2022/06/23 13:04:32 tcp.Mux: handler not registered: 71. Connection from 172.19.130.138:41026 closed

data node logs

REGISTER WITH META SERVICE


time="2022-06-23T12:58:32Z" level=info msg="reset OCSP cache file. /root/.cache/snowflake/ocsp_response_cache.json" func="gosnowflake.(*defaultLogger).Infof" file="log.go:104"
time="2022-06-23T12:58:32Z" level=info msg="reading OCSP Response cache file. /root/.cache/snowflake/ocsp_response_cache.json\n" func="gosnowflake.(*defaultLogger).Infof" file="log.go:104"
time="2022-06-23T12:58:32Z" level=error msg="failed to open. Ignored. open /root/.cache/snowflake/ocsp_response_cache.json: no such file or directory\n" func="gosnowflake.(*defaultLogger).Errorf" file="log.go:120"

 8888888           .d888 888                   8888888b.  888888b.
   888            d88P"  888                   888  "Y88b 888  "88b
   888            888    888                   888    888 888  .88P
   888   88888b.  888888 888 888  888 888  888 888    888 8888888K.
   888   888 "88b 888    888 888  888  Y8bd8P' 888    888 888  "Y88b
   888   888  888 888    888 888  888   X88K   888    888 888    888
   888   888  888 888    888 Y88b 888 .d8""8b. 888  .d88P 888   d88P
 8888888 888  888 888    888  "Y88888 888  888 8888888P"  8888888P"

ts=2022-06-23T12:58:32.923764Z lvl=info msg="InfluxDB starting" log_id=0bGA7pMl000 version=1.9.6-c1.9.6 branch=1.9 commit=facf255c969710c2973276b7116e4e84418cd986
ts=2022-06-23T12:58:32.923955Z lvl=info msg="Go runtime" log_id=0bGA7pMl000 version=go1.17.2 maxprocs=2
ts=2022-06-23T12:58:32.924151Z lvl=info msg="Loading configuration file" log_id=0bGA7pMl000 path=/etc/influxdb/influxdb.conf
ts=2022-06-23T12:58:33.019947Z lvl=info msg="Using data dir" log_id=0bGA7pNG000 service=store path=/var/lib/influxdb/data
ts=2022-06-23T12:58:33.020308Z lvl=info msg="Compaction settings" log_id=0bGA7pNG000 service=store max_concurrent_compactions=1 throughput_bytes_per_second=50331648 throughput_bytes_per_second_burst=50331648
ts=2022-06-23T12:58:33.020438Z lvl=info msg="Open store (start)" log_id=0bGA7pNG000 service=store trace_id=0bGA7pk0000 op_name=tsdb_open op_event=start
ts=2022-06-23T12:58:33.020679Z lvl=info msg="Open store (end)" log_id=0bGA7pNG000 service=store trace_id=0bGA7pk0000 op_name=tsdb_open op_event=end op_elapsed=0.244ms
ts=2022-06-23T12:58:33.021364Z lvl=info msg="Password hashing configuration: bcrypt;cost=10" log_id=0bGA7pNG000
ts=2022-06-23T12:58:33.021392Z lvl=info msg="Password hashing is FIPS-ready: false" log_id=0bGA7pNG000
ts=2022-06-23T12:58:33.021505Z lvl=info msg="Connecting to meta service" log_id=0bGA7pNG000
ts=2022-06-23T12:58:33.021575Z lvl=warn msg="Could not connect to meta service" log_id=0bGA7pNG000 error="error while opening /var/lib/influxdb/meta/client.json"
ts=2022-06-23T12:58:33.021596Z lvl=info msg="Connected to meta service successfully" log_id=0bGA7pNG000
ts=2022-06-23T12:58:33.021879Z lvl=info msg="Starting query controller" log_id=0bGA7pNG000 service=flux-controller concurrency_quota=0 initial_memory_bytes_quota_per_query=9223372036854775807 memory_bytes_quota_per_query=9223372036854775807 max_memory_bytes=0 queue_size=0
ts=2022-06-23T12:58:33.022139Z lvl=info msg="Starting hinted handoff service" log_id=0bGA7pNG000 service=handoff
ts=2022-06-23T12:58:33.022164Z lvl=info msg="Registered diagnostics client" log_id=0bGA7pNG000 service=monitor name=hh
ts=2022-06-23T12:58:33.022170Z lvl=info msg="Using data dir" log_id=0bGA7pNG000 service=handoff path=/var/lib/influxdb/hh
ts=2022-06-23T12:58:33.022342Z lvl=info msg="Opened service" log_id=0bGA7pNG000 service=subscriber
ts=2022-06-23T12:58:33.022393Z lvl=info msg="Starting monitor service" log_id=0bGA7pNG000 service=monitor
ts=2022-06-23T12:58:33.022407Z lvl=info msg="Registered diagnostics client" log_id=0bGA7pNG000 service=monitor name=build
ts=2022-06-23T12:58:33.022413Z lvl=info msg="Registered diagnostics client" log_id=0bGA7pNG000 service=monitor name=runtime
ts=2022-06-23T12:58:33.022417Z lvl=info msg="Registered diagnostics client" log_id=0bGA7pNG000 service=monitor name=network
ts=2022-06-23T12:58:33.022422Z lvl=info msg="Registered diagnostics client" log_id=0bGA7pNG000 service=monitor name=system
ts=2022-06-23T12:58:33.022439Z lvl=info msg="Starting cluster service" log_id=0bGA7pNG000 service=cluster
ts=2022-06-23T12:58:33.022487Z lvl=info msg="Starting precreation service" log_id=0bGA7pNG000 service=shard-precreation check_interval=10m advance_period=30m
ts=2022-06-23T12:58:33.022499Z lvl=info msg="Starting snapshot service" log_id=0bGA7pNG000 service=snapshot
ts=2022-06-23T12:58:33.022505Z lvl=info msg="Starting continuous query service" log_id=0bGA7pNG000 service=continuous_querier
ts=2022-06-23T12:58:33.022524Z lvl=info msg="Starting HTTP service" log_id=0bGA7pNG000 service=httpd authentication=true
ts=2022-06-23T12:58:33.022530Z lvl=info msg="opened HTTP access log" log_id=0bGA7pNG000 service=httpd path=stderr
ts=2022-06-23T12:58:33.022534Z lvl=info msg="Auth is enabled but shared-secret is blank. BearerAuthentication is disabled." log_id=0bGA7pNG000 service=httpd
ts=2022-06-23T12:58:33.022886Z lvl=info msg="Listening on HTTP" log_id=0bGA7pNG000 service=httpd addr=[::]:8086 https=true
ts=2022-06-23T12:58:33.022932Z lvl=info msg="Starting retention policy enforcement service" log_id=0bGA7pNG000 service=retention check_interval=30m
ts=2022-06-23T12:58:33.022945Z lvl=info msg="Service is disabled." log_id=0bGA7pNG000 service=ae
ts=2022-06-23T12:58:33.022955Z lvl=info msg="Using index" log_id=0bGA7pNG000 version=inmem
ts=2022-06-23T12:58:33.022885Z lvl=info msg="Sending anonymous usage statistics to https://usage.influxdata.com" log_id=0bGA7pNG000
ts=2022-06-23T12:58:33.023423Z lvl=info msg="Listening for signals" log_id=0bGA7pNG000
ts=2022-06-23T12:58:33.023486Z lvl=info msg="Fetching and validating license" log_id=0bGA7pNG000 service=licensing license=af7a14a1-e1b8-4a3f-b2c9-be51e6b0e4a5
ts=2022-06-23T12:58:33.023099Z lvl=info msg="Storing statistics" log_id=0bGA7pNG000 service=monitor db_instance=_internal db_rp=monitor interval=10s
ts=2022-06-23T12:58:33.378487Z lvl=info msg="Saving license locally" log_id=0bGA7pNG000 service=licensing path=/tmp/influx-enterprise.key.json
ts=2022-06-23T12:58:36.023647Z lvl=info msg="node not found for host" log_id=0bGA7pNG000 service=metaclient tcpAddr=influxdb-test-influxdb-enterprise-data-0.influxdb-test-influxdb-enterprise-data.telemetry.svc.cluster.local:8088
[httpd] 172.19.129.248 - - [23/Jun/2022:12:58:37 +0000] "GET /ping HTTP/1.1" 204 0 "-" "kube-probe/1.19" 32b46ac1-f2f4-11ec-8001-4ea059d74f37 27
ts=2022-06-23T12:58:39.023463Z lvl=info msg="node not found for host" log_id=0bGA7pNG000 service=metaclient tcpAddr=influxdb-test-influxdb-enterprise-data-0.influxdb-test-influxdb-enterprise-data.telemetry.svc.cluster.local:8088
ts=2022-06-23T12:58:39.977268Z lvl=info msg="Using client state dir" log_id=0bGA7pNG000 service=metaclient path=/var/lib/influxdb/meta
ts=2022-06-23T12:58:39.991181Z lvl=info msg="Opened meta client" log_id=0bGA7pNG000 service=metaclient
{"id":3,"version":"","tcpAddr":"influxdb-test-influxdb-enterprise-data-0.influxdb-test-influxdb-enterprise-data.telemetry.svc.cluster.local:8088","httpAddr":"","httpScheme":"","labels":null}
ts=2022-06-23T12:58:43.024114Z lvl=info msg="Error writing count stats" log_id=0bGA7pNG000 service=stats error="database not found: _internal"
[httpd] 172.19.129.248 - - [23/Jun/2022:12:58:47 +0000] "GET /ping HTTP/1.1" 204 0 "-" "kube-probe/1.19" 38aaf1db-f2f4-11ec-8002-4ea059d74f37 24
[httpd] 172.19.129.248 - - [23/Jun/2022:12:58:57 +0000] "GET /ping HTTP/1.1" 204 0 "-" "kube-probe/1.19" 3ea083c3-f2f4-11ec-8003-4ea059d74f37 22
[httpd] 172.19.129.248 - - [23/Jun/2022:12:59:02 +0000] "GET /ping HTTP/1.1" 204 0 "-" "kube-probe/1.19" 416ebfbb-f2f4-11ec-8004-4ea059d74f37 25
[httpd] 172.19.129.248 - - [23/Jun/2022:12:59:07 +0000] "GET /ping HTTP/1.1" 204 0 "-" "kube-probe/1.19" 44962152-f2f4-11ec-8005-4ea059d74f37 20
[httpd] 172.19.129.248 - - [23/Jun/2022:12:59:12 +0000] "GET /ping HTTP/1.1" 204 0 "-" "kube-probe/1.19" 476442c7-f2f4-11ec-8006-4ea059d74f37 27
[httpd] 172.19.129.248 - - [23/Jun/2022:12:59:17 +0000] "GET /ping HTTP/1.1" 204 0 "-" "kube-probe/1.19" 4a8c10fa-f2f4-11ec-8007-4ea059d74f37 25
...

I do not think the meta node implements the InfluxDB API. As far as I know, its API is neither public nor documented, and it's only used by influxd-ctl commands.

You need to use the data nodes for writing and querying; meta nodes serve cluster tasks and administration only.
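
To make the split concrete, here is a minimal Python sketch that builds a query URL aimed at a data node rather than a meta node. The ports and the https scheme come from the "Listening on HTTP" lines in the logs above (8086 on the data node, 8091 on the meta node), and /query with a q parameter is the InfluxDB 1.x query endpoint. The helper name data_node_query_url is hypothetical, and the sketch only builds the URL: since the data node log shows authentication=true, a real request would also need credentials.

```python
from urllib.parse import urlencode, urlunsplit

# Ports taken from the "Listening on HTTP" log lines in this issue.
DATA_HTTP_PORT = 8086  # data node: InfluxDB HTTP API (writes and queries)
META_HTTP_PORT = 8091  # meta node: cluster administration, used by influxd-ctl


def data_node_query_url(host: str, influxql: str, port: int = DATA_HTTP_PORT) -> str:
    """Build an HTTPS /query URL that targets a data node (hypothetical helper)."""
    query = urlencode({"q": influxql})
    return urlunsplit(("https", f"{host}:{port}", "/query", query, ""))


if __name__ == "__main__":
    # Data node hostname as it appears in the data node logs above.
    host = (
        "influxdb-test-influxdb-enterprise-data-0."
        "influxdb-test-influxdb-enterprise-data.telemetry.svc.cluster.local"
    )
    print(data_node_query_url(host, "SHOW DATABASES"))
```

Sending the same request to port 8091 on a meta node is what produced the 400 response described in the issue.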

Thank you, @alespour!
You are correct, querying works perfectly on the data nodes!
It is a shame that it wasn't obvious to me from the official docs 😞