Similar issue to #122 - Deployment Issue/System only stays up for minutes.
DylanDKnight opened this issue · 21 comments
I seem to be having a similar issue to #122; the node shows up in Netdata Cloud for around a minute and lets me view the Netdata logs before dropping off.
I am on agent v1.25.0 and running on GCP Kubernetes, v1.16.13-gke.1
Then it will switch to UNREACHABLE:
When it comes back up, it shows a gap in the metrics, and the amount of uptime between drops is almost exactly the same each time.
I have attached the parent logs here: https://gist.github.com/DylanDKnight/7c9d73459eb64211d4298a65d14d11a2
I have attached the child logs here: https://gist.github.com/DylanDKnight/a6a6d45541ca729ccf2df79cc622e2f9
This is the Helm install command I am using; it gets me a few minutes of uptime.
helm install \
--set parent.resources.limits.cpu=1 \
--set parent.resources.requests.cpu=1 \
--set parent.resources.limits.memory=1Gi \
--set parent.resources.requests.memory=1Gi \
--set child.resources.limits.cpu=1 \
--set child.resources.requests.cpu=1 \
--set child.resources.limits.memory=1Gi \
--set child.resources.requests.memory=1Gi \
--set parent.database.persistence=true \
--set parent.alarms.persistence=true \
--set parent.claiming.enabled=true \
--set service.port=19998 \
--set parent.claiming.token="TOKEN" \
--set parent.claiming.rooms="ROOM" \
netdata ./netdata-helmchart/charts/netdata
I seem to have to use ./netdata-helmchart/charts/netdata as the chart path. If I use
helm install netdata ./netdata-helmchart
I get
Error: validation: chart.metadata is required
Any help would be appreciated, let me know if you need me to grab anything else.
Hey @DylanDKnight,
Welcome to our community! I am so sorry that you are experiencing this issue, but we will get to the bottom of this! 🙇♂️
Thank you for providing such detailed bug details, it will greatly speed up the triaging :)
cc @cakrit because you seem to have insight on issue #122, cc @prologic because ✌️😅
This line from the logs https://gist.github.com/DylanDKnight/7c9d73459eb64211d4298a65d14d11a2#file-gistfile1-txt-L333 looks to be related to whatever the root cause is.
@underhood @netdata/agent can you help with this? What could cause the entry above in the logs?
The entry is normal. There is no negotiation at this point so the default (fallback) version is 2. The agent shuts down after receiving a signal at
2020-10-04 03:26:51: netdata INFO : MAIN : SIGNAL: Received SIGTERM. Cleaning up to exit...
@DylanDKnight it could be the case that kubernetes kills the parent's pod because the liveness/readiness probes do not succeed after 90 seconds.
The default liveness/readiness probe thresholds are 90 seconds as seen here: https://github.com/netdata/helmchart/blob/master/charts/netdata/values.yaml#L76
The timestamps in your logs almost match: the pod initializes around 2020-10-04 04:25:26.192 BST https://gist.github.com/DylanDKnight/7c9d73459eb64211d4298a65d14d11a2#file-gistfile1-txt-L2 and gets a SIGTERM around 2020-10-04 04:26:51.167 BST https://gist.github.com/DylanDKnight/7c9d73459eb64211d4298a65d14d11a2#file-gistfile1-txt-L334, that is, about 85 seconds later.
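For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses only the two timestamps quoted above and the 90-second probe window mentioned in this comment; the variable names are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the parent log lines cited above (both in BST).
FMT = "%Y-%m-%d %H:%M:%S.%f"
pod_started = datetime.strptime("2020-10-04 04:25:26.192", FMT)
sigterm_received = datetime.strptime("2020-10-04 04:26:51.167", FMT)

# Seconds between pod initialization and the SIGTERM.
elapsed = (sigterm_received - pod_started).total_seconds()

# 90 seconds is the default probe window quoted in this comment.
PROBE_WINDOW_SECONDS = 90

print(f"elapsed before SIGTERM: {elapsed:.3f}s")
print(f"margin to the probe window: {PROBE_WINDOW_SECONDS - elapsed:.3f}s")
```

The result lands just under the probe window, which is why the probes look like a plausible cause rather than a confirmed one.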
Could you try increasing these values with the following flags?
--set parent.livenessProbe.failureThreshold=5
--set parent.readinessProbe.failureThreshold=5
Thanks
I increased the liveness/readiness probe.
The pods do stay up longer now, but still suffer the same issue.
Parent Logs: https://gist.github.com/DylanDKnight/02e1f3e9317306daf56f8a701f69682a
I have now been able to dig into the pod events (previously it would drop before I could query them).
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 18m (x3 over 18m) default-scheduler pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
Normal Scheduled 18m default-scheduler Successfully assigned default/netdata-parent-6f64dd8f64-jjbtd to gke-binance-futures--btc-usdt-market--d0f476b6-0xps
Normal SuccessfulAttachVolume 18m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-120e24aa-3aed-43a8-82a6-7959eb7eea7b"
Normal SuccessfulAttachVolume 18m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-ec78679b-ffc7-486a-8c06-9e1f8b6a021b"
Normal Killing 15m kubelet Container netdata failed liveness probe, will be restarted
Normal Pulling 15m (x2 over 18m) kubelet Pulling image "netdata/netdata:v1.25.0"
Normal Created 15m (x2 over 18m) kubelet Created container netdata
Normal Started 15m (x2 over 18m) kubelet Started container netdata
Warning Unhealthy 14m (x7 over 17m) kubelet Readiness probe failed: Get http://10.16.3.27:19998/api/v1/info: dial tcp 10.16.3.27:19998: connect: connection refused
Warning Unhealthy 13m (x9 over 17m) kubelet Liveness probe failed: Get http://10.16.3.27:19998/api/v1/info: dial tcp 10.16.3.27:19998: connect: connection refused
Normal Pulled 3m2s (x7 over 18m) kubelet Successfully pulled image "netdata/netdata:v1.25.0"
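The "connection refused" messages in the events above suggest that nothing was accepting connections on the port the probes were dialing. A rough way to test that from inside the cluster is a plain TCP connect, sketched below in Python. This is only an approximation: the real probes issue an HTTP GET to /api/v1/info, while this sketch stops at the TCP dial, which is the step that fails in the events. The function name is hypothetical.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # The host is an assumption; substitute the pod IP from the events.
    for candidate_port in (19998, 19999):
        state = "open" if port_accepts_connections("127.0.0.1", candidate_port) else "refused/unreachable"
        print(f"port {candidate_port}: {state}")
```

If one port reports open and the other does not, the probe and the agent are most likely configured with different ports.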
I have managed to get it to stay up. I changed --set service.port=19998 back to --set service.port=19999, and the liveness/readiness probes are now succeeding.
Now it appears to have the same issue as before:
Error
2020-10-05 14:46:50.492 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:50: netdata ERROR : ACLK_Query_0 : ACLK version negotiation failed. No reply to "hello" with "version" from cloud in time of 3s. Reverting to default ACLK version of 2.
Error
2020-10-05 14:46:59.847 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : WEB_SERVER[static1] : clients wants to STREAM metrics.
Error
2020-10-05 14:46:59.847 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : thread created with task id 210
Error
2020-10-05 14:46:59.847 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : set name of thread 210 to STREAM_RECEIVER
Error
2020-10-05 14:46:59.847 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : STREAM gke-binance-futures-mark-default-pool-162627ff-p8bd [10.146.15.228]:39684: receive thread created (task id 210)
Error
2020-10-05 14:46:59.849 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : WEB_SERVER[static6] : clients wants to STREAM metrics.
Error
2020-10-05 14:46:59.859 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata ERROR : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : HEALTH [gke-binance-futures-mark-default-pool-162627ff-p8bd]: cannot open health file: /var/lib/netdata/04759223-9d4c-46db-abd5-395d1f1ebe04/health/health-log.db.old (errno 2, No such file or directory)
Error
2020-10-05 14:46:59.865 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : WEB_SERVER[static4] : clients wants to STREAM metrics.
Error
2020-10-05 14:46:59.868 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : Host 'gke-binance-futures-mark-default-pool-162627ff-p8bd' (at registry as 'gke-binance-futures-mark-default-pool-162627ff-p8bd') with guid '04759223-9d4c-46db-abd5-395d1f1ebe04' initialized, os 'linux', timezone 'UTC', tags '', program_name 'netdata', program_version 'v1.25.0', update every 1, memory mode save, history entries 3996, streaming disabled (to '' with api key ''), health enabled, cache_dir '/var/cache/netdata/04759223-9d4c-46db-abd5-395d1f1ebe04', varlib_dir '/var/lib/netdata/04759223-9d4c-46db-abd5-395d1f1ebe04', health_log '/var/lib/netdata/04759223-9d4c-46db-abd5-395d1f1ebe04/health/health-log.db', alarms default handler '/usr/libexec/netdata/plugins.d/alarm-notify.sh', alarms default recipient 'root'
Error
2020-10-05 14:46:59.868 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : STREAM gke-binance-futures-mark-default-pool-162627ff-p8bd [receive from [10.146.15.228]:39684]: initializing communication...
Error
2020-10-05 14:46:59.868 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : STREAM gke-binance-futures-mark-default-pool-162627ff-p8bd [receive from [10.146.15.228]:39684]: Netdata is using the stream version 3.
Error
2020-10-05 14:46:59.868 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : Postponing health checks for 60 seconds, on host 'gke-binance-futures-mark-default-pool-162627ff-p8bd', because it was just connected.
Error
2020-10-05 14:46:59.868 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-p8bd,[10.146.15.228]:39684] : STREAM gke-binance-futures-mark-default-pool-162627ff-p8bd [receive from [10.146.15.228]:39684]: receiving metrics...
Error
2020-10-05 14:46:59.868 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : thread created with task id 211
Error
2020-10-05 14:46:59.868 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : set name of thread 211 to STREAM_RECEIVER
Error
2020-10-05 14:46:59.868 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : STREAM gke-binance-futures--btc-usdt-market--d0f476b6-0xps [10.16.3.1]:55206: receive thread created (task id 211)
Error
2020-10-05 14:46:59.869 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-82gw,[10.146.15.234]:59215] : thread created with task id 212
Error
2020-10-05 14:46:59.869 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-82gw,[10.146.15.234]:59215] : set name of thread 212 to STREAM_RECEIVER
Error
2020-10-05 14:46:59.869 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures-mark-default-pool-162627ff-82gw,[10.146.15.234]:59215] : STREAM gke-binance-futures-mark-default-pool-162627ff-82gw [10.146.15.234]:59215: receive thread created (task id 212)
Error
2020-10-05 14:46:59.885 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata ERROR : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : HEALTH [gke-binance-futures--btc-usdt-market--d0f476b6-0xps]: cannot open health file: /var/lib/netdata/19655a23-1800-4959-8b97-f9ffe13b214a/health/health-log.db.old (errno 2, No such file or directory)
Error
2020-10-05 14:46:59.889 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : Host 'gke-binance-futures--btc-usdt-market--d0f476b6-0xps' (at registry as 'gke-binance-futures--btc-usdt-market--d0f476b6-0xps') with guid '19655a23-1800-4959-8b97-f9ffe13b214a' initialized, os 'linux', timezone 'UTC', tags '', program_name 'netdata', program_version 'v1.25.0', update every 1, memory mode save, history entries 3996, streaming disabled (to '' with api key ''), health enabled, cache_dir '/var/cache/netdata/19655a23-1800-4959-8b97-f9ffe13b214a', varlib_dir '/var/lib/netdata/19655a23-1800-4959-8b97-f9ffe13b214a', health_log '/var/lib/netdata/19655a23-1800-4959-8b97-f9ffe13b214a/health/health-log.db', alarms default handler '/usr/libexec/netdata/plugins.d/alarm-notify.sh', alarms default recipient 'root'
Error
2020-10-05 14:46:59.889 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : STREAM gke-binance-futures--btc-usdt-market--d0f476b6-0xps [receive from [10.16.3.1]:55206]: initializing communication...
Error
2020-10-05 14:46:59.889 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : STREAM gke-binance-futures--btc-usdt-market--d0f476b6-0xps [receive from [10.16.3.1]:55206]: Netdata is using the stream version 3.
Error
2020-10-05 14:46:59.889 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : Postponing health checks for 60 seconds, on host 'gke-binance-futures--btc-usdt-market--d0f476b6-0xps', because it was just connected.
Error
2020-10-05 14:46:59.889 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : STREAM_RECEIVER[gke-binance-futures--btc-usdt-market--d0f476b6-0xps,[10.16.3.1]:55206] : STREAM gke-binance-futures--btc-usdt-market--d0f476b6-0xps [receive from [10.16.3.1]:55206]: receiving metrics...
Error
2020-10-05 14:46:59.895 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata INFO : WEB_SERVER[static5] : clients wants to STREAM metrics.
Error
2020-10-05 14:46:59.906 BST
netdatanetdata-parent-686bdf57f9-jcvz92020-10-05 13:46:59: netdata LOG FLOOD PROTECTION too many logs (201 logs in 18 seconds, threshold is set to 200 logs in 1200 seconds). Preventing more logs from process 'netdata' for 1182 seconds.
So digging in a bit further to the logs.
The first log on startup is:
netdata ERROR : MAIN : Ignoring host prefix '/host': path '/host' failed to stat() (errno 2, No such file or directory)
The last log before a crash is:
netdata LOG FLOOD PROTECTION too many logs (201 logs in 30 seconds, threshold is set to 200 logs in 1200 seconds). Preventing more logs from process 'netdata' for 1170 seconds.
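A small Python sketch for pulling the numbers out of a LOG FLOOD PROTECTION line. In the messages posted in this thread, the suppression time equals the threshold period minus the elapsed time (1200 - 30 = 1170 here, and 1200 - 18 = 1182 earlier). That relationship is inferred from the log text only, not from the Netdata source, and the function and key names are made up for illustration.

```python
import re

# Pattern matching the flood-protection message format seen in this thread.
FLOOD_RE = re.compile(
    r"too many logs \((\d+) logs in (\d+) seconds, "
    r"threshold is set to (\d+) logs in (\d+) seconds\)\. "
    r"Preventing more logs from process '([^']+)' for (\d+) seconds\."
)


def parse_flood_message(line: str) -> dict:
    """Extract the counters from a LOG FLOOD PROTECTION log line."""
    match = FLOOD_RE.search(line)
    if match is None:
        raise ValueError("not a LOG FLOOD PROTECTION message")
    logged, elapsed, threshold, period, process, muted = match.groups()
    return {
        "logged": int(logged),
        "elapsed_seconds": int(elapsed),
        "threshold_logs": int(threshold),
        "period_seconds": int(period),
        "process": process,
        "muted_seconds": int(muted),
    }


# The last log line before the crash, copied from above.
message = (
    "netdata LOG FLOOD PROTECTION too many logs (201 logs in 30 seconds, "
    "threshold is set to 200 logs in 1200 seconds). "
    "Preventing more logs from process 'netdata' for 1170 seconds."
)
info = parse_flood_message(message)

# Observed relationship in this thread: muted = period - elapsed.
print(info["muted_seconds"], info["period_seconds"] - info["elapsed_seconds"])
```

The practical consequence is that whatever is logged after this line, including the actual reason for the crash, is suppressed for the rest of the 1200-second period.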
Another thing to add to the issue.
When I only had MongoDB, Redis & PGBouncer deployed, Netdata ran without crashing. As soon as we deployed some of our in-house applications, it crashed.
@DylanDKnight I was able to reproduce the issue by setting service.port to 19998. I am trying to find a fix.
@knatsakis Nice,
I switched the port back to 19999, and that fixed that issue for me.
Although, I still see this in the logs:
2020-10-07 22:04:08: netdata ERROR : MAIN : LISTENER: Invalid listen port 0 given. Defaulting to 19999. (errno 22, Invalid argument)
The current issue I am having is:
2020-10-07 22:04:19: netdata LOG FLOOD PROTECTION too many logs (201 logs in 10 seconds, threshold is set to 200 logs in 1200 seconds). Preventing more logs from process 'netdata' for 1190 seconds.
That is the last message before it crashes.
Also when the system is up, I only see Netdata metrics within Netdata cloud, no CPU or even memory stats and no ability to add anything else other than Netdata stats to a dashboard.
I am also seeing these errors on the child:
Error
2020-10-07 23:04:31.365 BST
2020-10-07 22:04:31: netdata LOG FLOOD PROTECTION too many logs (201 logs in 50 seconds, threshold is set to 200 logs in 1200 seconds). Preventing more logs from process 'netdata' for 1150 seconds.
Error
2020-10-07 23:04:31.365 BST
2020-10-07 22:04:31: netdata ERROR : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: restart stream because socket reports errors (POLLERR) - 313263 bytes transmitted.
Error
2020-10-07 23:04:31.365 BST
2020-10-07 22:04:31: netdata ERROR : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: failed to send metrics - closing connection - we have sent 313263 bytes on this connection. (errno 9, Bad file descriptor)
Error
2020-10-07 23:04:31.365 BST
2020-10-07 22:04:31: netdata ERROR : PLUGIN[proc] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send]: not ready - discarding collected metrics.
Error
2020-10-07 23:04:31.365 BST
2020-10-07 22:04:31: netdata ERROR : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: error during read (-1). Restarting connection (errno 104, Connection reset by peer)
Error
2020-10-07 23:04:30.388 BST
2020-10-07 22:04:30: netdata INFO : PLUGINSD[apps] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send]: sending metrics...
Error
2020-10-07 23:04:30.372 BST
2020-10-07 22:04:30: netdata INFO : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: established communication with a parent using protocol version 3 - ready to send metrics...
Error
2020-10-07 23:04:30.372 BST
2020-10-07 22:04:30: netdata INFO : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: waiting response from remote netdata...
Error
2020-10-07 23:04:30.372 BST
2020-10-07 22:04:30: netdata INFO : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: initializing communication...
Error
2020-10-07 23:04:30.371 BST
2020-10-07 22:04:30: netdata INFO : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: connecting...
Error
2020-10-07 23:04:30.371 BST
2020-10-07 22:04:30: netdata ERROR : PLUGIN[cgroups] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send]: not ready - discarding collected metrics. (errno 22, Invalid argument)
Error
2020-10-07 23:04:30.371 BST
2020-10-07 22:04:30: netdata ERROR : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: error during read (-1). Restarting connection (errno 104, Connection reset by peer)
Error
2020-10-07 23:04:30.371 BST
2020-10-07 22:04:30: netdata INFO : PLUGIN[cgroups] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send]: sending metrics...
Error
2020-10-07 23:04:30.368 BST
2020-10-07 22:04:30: netdata INFO : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: established communication with a parent using protocol version 3 - ready to send metrics...
Error
2020-10-07 23:04:30.368 BST
2020-10-07 22:04:30: netdata INFO : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: waiting response from remote netdata...
Error
2020-10-07 23:04:30.368 BST
2020-10-07 22:04:30: netdata INFO : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: initializing communication...
Error
2020-10-07 23:04:30.367 BST
2020-10-07 22:04:30: netdata INFO : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: connecting...
Error
2020-10-07 23:04:30.367 BST
2020-10-07 22:04:30: netdata ERROR : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: restart stream because socket reports errors (POLLERR) - 386120 bytes transmitted.
Error
2020-10-07 23:04:30.367 BST
2020-10-07 22:04:30: netdata ERROR : STREAM_SENDER[gke-binance-futures-m-binance-futures-3f39fac9-6678] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send to netdata:19999]: failed to send metrics - closing connection - we have sent 386120 bytes on this connection. (errno 9, Bad file descriptor)
Error
2020-10-07 23:04:30.367 BST
2020-10-07 22:04:30: netdata ERROR : PLUGIN[proc] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [send]: not ready - discarding collected metrics. (errno 22, Invalid argument)
So I changed this setting to 2000000:
errors to trigger flood protection = 200
Judging by the logs, it now looks like it is trying to collect metrics/charts.
Error
2020-10-08 00:59:55.517 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu43_softirqs/HRTIMER.db.
Error
2020-10-08 00:59:55.517 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu43_softirqs/SCHED.db.
Error
2020-10-08 00:59:55.517 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu43_softirqs/TASKLET.db.
Error
2020-10-08 00:59:55.517 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu43_softirqs/NET_RX.db.
Error
2020-10-08 00:59:55.517 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu43_softirqs/NET_TX.db.
Error
2020-10-08 00:59:55.516 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu43_softirqs/TIMER.db.
Error
2020-10-08 00:59:55.516 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu43_softirqs/main.db.
Error
2020-10-08 00:59:55.516 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu42_softirqs/RCU.db.
Error
2020-10-08 00:59:55.516 BST
2020-10-07 23:59:55: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:37984] : Initializing file /var/cache/netdata/190dd876-0f7c-4008-8f06-cfef4aab4e69/cpu.cpu42_softirqs/SCHED.db.
but there are a lot of these kinds of logs:
2020-10-08 00:59:56.355 BST
2020-10-07 23:59:56: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:2497] : RRDSET: chart name 'netdata.aclk_write_q' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-08 00:59:56.355 BST
2020-10-07 23:59:56: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:2497] : RRDSET: chart name 'netdata.aclk_query_per_second' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-08 00:59:56.355 BST
2020-10-07 23:59:56: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:2497] : RRDSET: chart name 'netdata.aclk_status' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
and it crashes with:
Error
2020-10-08 00:59:56.397 BST
Shutting down spawn server loop complete.
Error
2020-10-08 00:59:56.397 BST
Shutting down spawn server event loop.
Error
2020-10-08 00:59:56.397 BST
EOF found in spawn pipe.
Hi @DylanDKnight, let me try to address all the issues that you mentioned, one by one.
The
2020-10-07 22:04:08: netdata ERROR : MAIN : LISTENER: Invalid listen port 0 given. Defaulting to 19999. (errno 22, Invalid argument)
error, although harmless, should be resolved by netdata/netdata#10045 above.
I am still working on the rest of the issues.
@knatsakis Not a problem, sorry for adding a chunk to your backlog! aha.
If you need me to do any digging on my end, let me know.
Hey @DylanDKnight,
v2.0.11 of the helm chart (with appVersion v1.26.0) should contain all the relevant fixes.
Could you try it and let me know?
Thanks
I did a clean clone, and install.
Log-wise, there are fewer errors.
But it still appears to fall over every few minutes. I have included below the only logs I can find that contain any errors.
I also still don't see any charts apart from Netdata charts in netdata.cloud.
If there is anything in particular I should look for or grab, let me know.
Error
2020-10-15 20:43:47.372 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : thread with task id 277 finished
Error
2020-10-15 20:43:47.372 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:5275]: receive thread ended (task id 277)
Error
2020-10-15 20:43:47.372 BST
netdata2020-10-15 19:43:47: netdata ERROR : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:5275]: disconnected (completed 0 updates). (errno 22, Invalid argument)
Error
2020-10-15 20:43:47.372 BST
netdata2020-10-15 19:43:47: netdata ERROR : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : requested a CHART, without a type.id, on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678'. Disabling it. (errno 22, Invalid argument)
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:5275]: receiving metrics...
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : Postponing health checks for 60 seconds, on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678', because it was just connected.
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:5275]: Netdata is using the stream version 3.
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:5275]: initializing communication...
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [10.16.0.1]:5275: receive thread created (task id 277)
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : set name of thread 277 to STREAM_RECEIVER
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:5275] : thread created with task id 277
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : WEB_SERVER[static1] : clients wants to STREAM metrics.
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : thread with task id 276 finished
Error
2020-10-15 20:43:47.370 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:14248]: receive thread ended (task id 276)
Error
2020-10-15 20:43:47.369 BST
netdata2020-10-15 19:43:47: netdata ERROR : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:14248]: disconnected (completed 0 updates). (errno 22, Invalid argument)
Error
2020-10-15 20:43:47.369 BST
netdata2020-10-15 19:43:47: netdata ERROR : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : requested a CHART, without a type.id, on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678'. Disabling it. (errno 22, Invalid argument)
Error
2020-10-15 20:43:47.367 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:14248]: receiving metrics...
Error
2020-10-15 20:43:47.367 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : Postponing health checks for 60 seconds, on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678', because it was just connected.
Error
2020-10-15 20:43:47.367 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:14248]: Netdata is using the stream version 3.
Error
2020-10-15 20:43:47.367 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:14248]: initializing communication...
Error
2020-10-15 20:43:47.367 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [10.16.0.1]:14248: receive thread created (task id 276)
Error
2020-10-15 20:43:47.367 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : set name of thread 276 to STREAM_RECEIVER
Error
2020-10-15 20:43:47.367 BST
netdata2020-10-15 19:43:47: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:14248] : thread created with task id 276
Error
2020-10-15 20:46:37.379 BST
netdata2020-10-15 19:46:37: netdata INFO : WEB_SERVER[static3] : clients wants to STREAM metrics.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : thread with task id 345 finished
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:23380]: receive thread ended (task id 345)
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata ERROR : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:23380]: disconnected (completed 23 updates). (errno 22, Invalid argument)
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata ERROR : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : requested a CHART, without a type.id, on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678'. Disabling it. (errno 22, Invalid argument)
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.mem_usage_limit' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.mem_usage' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.pgfaults' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.mem_activity' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.writeback' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.mem' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.cpu_per_core' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.cpu_limit' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_kube_dns_autoscaler_645f7d66cf_s4w4j_543e5bb8_3ff1_44af_8552_62186282cf6d_autoscaler.cpu' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.mem_usage_limit' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.mem_usage' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.pgfaults' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.mem_activity' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.writeback' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.378 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.mem' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.377 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.cpu_per_core' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.377 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.cpu_limit' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.377 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_metrics_server_v0.3.6_64655c969_d4dd2_b584a363_6f1f_46df_b849_092c070338c1_metrics_server.cpu' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.377 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_fluentd_gke_wpgzk_81a9af93_0dc8_4ea9_9068_d6ebe3efd83e_fluentd_gcp.mem_usage_limit' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.377 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_fluentd_gke_wpgzk_81a9af93_0dc8_4ea9_9068_d6ebe3efd83e_fluentd_gcp.mem_usage' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.377 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_fluentd_gke_wpgzk_81a9af93_0dc8_4ea9_9068_d6ebe3efd83e_fluentd_gcp.pgfaults' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.377 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_fluentd_gke_wpgzk_81a9af93_0dc8_4ea9_9068_d6ebe3efd83e_fluentd_gcp.mem_activity' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.377 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : RRDSET: chart name 'cgroup_k8s_kube_system_fluentd_gke_wpgzk_81a9af93_0dc8_4ea9_9068_d6ebe3efd83e_fluentd_gcp.writeback' on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678' already exists.
Error
2020-10-15 20:46:37.376 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:23380]: receiving metrics...
Error
2020-10-15 20:46:37.376 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : Postponing health checks for 60 seconds, on host 'gke-binance-futures-m-binance-futures-3f39fac9-6678', because it was just connected.
Error
2020-10-15 20:46:37.376 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:23380]: Netdata is using the stream version 3.
Error
2020-10-15 20:46:37.376 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [receive from [10.16.0.1]:23380]: initializing communication...
Error
2020-10-15 20:46:37.376 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : STREAM gke-binance-futures-m-binance-futures-3f39fac9-6678 [10.16.0.1]:23380: receive thread created (task id 345)
Error
2020-10-15 20:46:37.376 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : set name of thread 345 to STREAM_RECEIVER
Error
2020-10-15 20:46:37.376 BST
netdata2020-10-15 19:46:37: netdata INFO : STREAM_RECEIVER[gke-binance-futures-m-binance-futures-3f39fac9-6678,[10.16.0.1]:23380] : thread created with task id 345
Error
2020-10-15 20:46:37.376 BST
netdata2020-10-15 19:46:37: netdata INFO : WEB_SERVER[static1] : clients wants to STREAM metrics.
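For anyone triaging a dump like the one above, here is a minimal Python sketch that pulls out only the ERROR entries. It assumes the line layout seen in this excerpt (timestamp, `netdata`, level, thread, message, with ` : ` between the last three fields) and skips the bare "Error" and "BST" timestamp lines added by the log viewer. The names `LINE_RE`, `parse_lines`, and `error_entries` are illustrative only and are not part of Netdata.

```python
import re

# Layout assumed from the excerpt above:
#   <date time>: netdata <LEVEL> : <thread> : <message>
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): netdata "
    r"(?P<level>[A-Z]+)\s*: (?P<thread>.+?) : (?P<msg>.*)"
)


def parse_lines(lines):
    """Yield one dict per line that matches the assumed netdata log layout."""
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            yield match.groupdict()


def error_entries(lines):
    """Return only the entries logged at ERROR level."""
    return [entry for entry in parse_lines(lines) if entry["level"] == "ERROR"]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python filter_errors.py < parent.log
    for entry in error_entries(sys.stdin):
        print(entry["ts"], entry["thread"], entry["msg"], sep=" | ")
```

Run against the excerpt above, this should leave only the two ERROR lines from 19:46:37: the `disconnected (completed 23 updates)` message and the `requested a CHART, without a type.id` message.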
Unfortunately, I am not able to reproduce the issue. I have used a freshly cloned repo and installed netdata with:
helm install \
--set parent.resources.limits.cpu=1 \
--set parent.resources.requests.cpu=1 \
--set parent.resources.limits.memory=1Gi \
--set parent.resources.requests.memory=1Gi \
--set child.resources.limits.cpu=1 \
--set child.resources.requests.cpu=1 \
--set child.resources.limits.memory=1Gi \
--set child.resources.requests.memory=1Gi \
--set parent.database.persistence=true \
--set parent.alarms.persistence=true \
--set parent.claiming.enabled=true \
--set service.port=19998 \
--set parent.claiming.token="TOKEN" \
--set parent.claiming.rooms="ROOM" \
netdata ./charts/netdata
It stays up after that.
Could you upload the full netdata parent logs somewhere, preferably from start to finish?
Also output from
kubectl describe pod netdata-parent-xxxxxx
may show k8s events that may be relevant.
Thanks!
I pulled fresh and did a clean install.
Logs CSV: I just pulled a large batch, as it's hard to see where it is falling over.
Netdata Parent Describe
Name: netdata-parent-bbd65d4fd-cmcwh
Namespace: default
Priority: 1000
Priority Class Name: low-priority
Node: gke-binance-futures-m-binance-futures-3f39fac9-6678/10.146.15.211
Start Time: Wed, 21 Oct 2020 14:23:39 +0000
Labels: app=netdata
pod-template-hash=bbd65d4fd
release=netdata
role=parent
Annotations: checksum/config: 2abecb8f6dbe6015e7f499b85f4f1473da705653a59f43acd9a1273b4999d4d4
Status: Running
IP: 10.16.0.107
IPs:
IP: 10.16.0.107
Controlled By: ReplicaSet/netdata-parent-bbd65d4fd
Containers:
netdata:
Container ID: docker://a58ca1b53bc58da02e445e2d119fe6df3689f45fe3ec5c9d2a59d657be2197f9
Image: netdata/netdata:v1.26.0
Image ID: docker-pullable://netdata/netdata@sha256:784cf58204a686ec461bd716d6697e4a842b7edbdeccf0ae4c4d0e8cd5186fc4
Port: 19998/TCP
Host Port: 0/TCP
Command:
sh
-c
exec /usr/sbin/run.sh -W set2 cloud global enabled true -W set2 cloud global "cloud base url" "https://app.netdata.cloud" -W "claim -token=FEYTdWgT5kR5e7b7_nTkxI8J-2ZzMtSZzlkgqgx3-pXGID2byoTc5E4G7P2EsRD4_v2K0Cvw9Zyfs_ej2mFKU0xDMQcr_tImLX9WPIoxLJKoDzEHyrzKxtLhawGJGqaSPTDgECU -rooms=c5cce931-f9d3-4f2d-917d-642a811542f9 -url=https://app.netdata.cloud"
State: Running
Started: Wed, 21 Oct 2020 14:24:01 +0000
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 1
memory: 1Gi
Liveness: http-get http://:http/api/v1/info delay=0s timeout=1s period=30s #success=1 #failure=3
Readiness: http-get http://:http/api/v1/info delay=0s timeout=1s period=30s #success=1 #failure=3
Environment:
MY_POD_NAME: netdata-parent-bbd65d4fd-cmcwh (v1:metadata.name)
MY_POD_NAMESPACE: default (v1:metadata.namespace)
NETDATA_LISTENER_PORT: 19998
Mounts:
/etc/netdata/health_alarm_notify.conf from config (rw,path="health")
/etc/netdata/netdata.conf from config (rw,path="netdata")
/etc/netdata/stream.conf from config (rw,path="stream")
/var/cache/netdata from database (rw)
/var/lib/netdata from alarms (rw)
/var/run/secrets/kubernetes.io/serviceaccount from netdata-token-ckq2f (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: netdata-conf-parent
Optional: false
database:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: netdata-parent-database
ReadOnly: false
alarms:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: netdata-parent-alarms
ReadOnly: false
netdata-token-ckq2f:
Type: Secret (a volume populated by a Secret)
SecretName: netdata-token-ckq2f
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 87s (x2 over 88s) default-scheduler pod has unbound immediate PersistentVolumeClaims
Normal Scheduled 84s default-scheduler Successfully assigned default/netdata-parent-bbd65d4fd-cmcwh to gke-binance-futures-m-binance-futures-3f39fac9-6678
Normal SuccessfulAttachVolume 77s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-d37da175-16cd-4b27-ab64-2529d0d3eaf0"
Normal SuccessfulAttachVolume 74s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-97dbec77-d085-4b53-a675-e199993ddc8d"
Normal Pulling 66s kubelet Pulling image "netdata/netdata:v1.26.0"
Normal Pulled 62s kubelet Successfully pulled image "netdata/netdata:v1.26.0"
Normal Created 62s kubelet Created container netdata
Normal Started 62s kubelet Started container netdata
Interestingly, I see almost the same symptoms when I have:
[web]
mode = none
in the following config (netdata-values.yaml):
parent:
  claiming:
    enabled: true
    token: XXX
    rooms: YYY
child:
  claiming:
    enabled: true
    token: XXX
    rooms: YYY
  configs:
    netdata:
      data: |
        [global]
          memory mode = ram
          history = 3600
          access log = none
          update every = 5
        [health]
          enabled = no
        [web]
          mode = none
ingress:
  enabled: false
which I use to update values via helm upgrade -f netdata-values.yaml netdata netdata/netdata
@mbuczko @DylanDKnight are you guys still having problems?
pod has unbound immediate PersistentVolumeClaims
is usually the error you get when the PVC points to a storage class that does not exist.
Closing due to lack of response.