netdata/helmchart

Netdata parent pod keeps running into error

mamutuberalles opened this issue · 9 comments

Hello All!

I'm submitting this issue because people on the Discord channel told me to.

The issue I've been encountering is that, for some unknown reason, the Netdata parent keeps crashing and failing to start, even though it was fine a few days ago with the same version number (3.7.65).

The most notable line in the error message is:
eof found in spawn pipe

I've read that increasing the RAM, either in the YAML config or on the machine that hosts Kubernetes, may be the key, but neither of those seems to work.
(I haven't actually increased the RAM on the machines, as none of the pods seem to have any issue with RAM, and the Netdata child pods show RAM usage of at most 70%.)

I'll attach the logs.
netdata.log

The pod exits with exit code 139, which indicates a segmentation fault. As you can see from the logs, it has issues with the python.d plugins/modules; maybe that is the root cause?
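For readers unfamiliar with the convention: a container exit code above 128 usually means the process was terminated by a signal, with the signal number being the code minus 128. The sketch below decodes that under this assumption; `describe_exit_code` is a hypothetical helper name, not part of Netdata or Kubernetes.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain an exit code, assuming the common 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


# 139 - 128 = 11, which is SIGSEGV (segmentation fault).
print(describe_exit_code(139))
```

An out-of-memory kill typically shows up as exit code 137 (SIGKILL) instead, so 139 points at a crash inside the process rather than at a memory limit.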

Update: the issue occurs when memory mode is set to dbengine. Setting it to ram or save fixes the issue, but that is only a hotfix, as it is not recommended for parent-child setups.

Thanks for putting this into a ticket, @mamutuberalles. A question: was the memory mode setting part of the installation, or did you change it?
I'm asking because we saw an issue reported by a user on the community forums here, related to confusion between stream configuration settings and dbengine settings.

Update: setting the default memory mode in stream.conf to dbengine, while the memory mode of both the parent and the child is dbengine, fixes the issue.

> child is dbengine fixes the issue

You can probably set the child to ram or whatever better suits your needs.

ilyam8 commented

@mamutuberalles what combination of memory modes in netdata.conf/stream.conf for parent/child is causing the problem?

> @mamutuberalles what combination of memory modes in netdata.conf/stream.conf for parent/child is causing the problem?

The problem occurred when the parent had dbengine, the child had ram, and the default in stream.conf was set to save.

When I changed the parent to save, the error went away, but that's suboptimal, so I set each one to dbengine and the problem went away.

ilyam8 commented

I think the correct way is

  • parent dbengine
  • child ram

And dbengine in stream.conf (which is the default). It is better to have child with ram.

> the correct way

Which is the default if you don't change memory mode settings.
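The three settings discussed above can be compared against the suggested combination with a small script. This is only a sketch under assumptions: it assumes the option is `memory mode` under `[global]` in netdata.conf and `default memory mode` under the API key section in the parent's stream.conf (newer Netdata releases may name these differently), and the API key, the config snippets, and the `memory_mode` and `mismatches` helpers are all hypothetical.

```python
import configparser

# Hypothetical API key; a real stream.conf uses the key configured for the children.
API_KEY = "11111111-2222-3333-4444-555555555555"

# Hypothetical snippets mirroring the combination reported as failing in this thread.
PARENT_NETDATA_CONF = "[global]\nmemory mode = dbengine\n"
CHILD_NETDATA_CONF = "[global]\nmemory mode = ram\n"
PARENT_STREAM_CONF = f"[{API_KEY}]\nenabled = yes\ndefault memory mode = save\n"

# Combination suggested in the thread: parent dbengine, child ram, stream.conf dbengine.
RECOMMENDED = {"parent": "dbengine", "child": "ram", "stream": "dbengine"}


def memory_mode(conf_text, section, option):
    """Return the option's value from INI-style text, or None if it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.get(section, option, fallback=None)


def mismatches(parent_conf, child_conf, stream_conf, api_key):
    """Return explicitly set values that differ from the suggested combination."""
    found = {
        "parent": memory_mode(parent_conf, "global", "memory mode"),
        "child": memory_mode(child_conf, "global", "memory mode"),
        "stream": memory_mode(stream_conf, api_key, "default memory mode"),
    }
    # Unset options are skipped: per the thread, the defaults already match.
    return {k: v for k, v in found.items() if v is not None and v != RECOMMENDED[k]}


print(mismatches(PARENT_NETDATA_CONF, CHILD_NETDATA_CONF, PARENT_STREAM_CONF, API_KEY))
```

With the snippets above, only the stream.conf default is flagged, which matches the setting that was changed to resolve the crash.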

ilyam8 commented

Closing in favor of netdata/netdata#15714

@mamutuberalles use parent/dbengine child/ram.