livenessProbe.sh not returning error despite WebLogic Server process not found
jkramplify opened this issue · 3 comments
Hi,
WebLogic Kubernetes Operator version: 3.2.3
We noticed that the weblogic-server container in one of our pods was not running. On checking, we found that the WebLogic Server process had been terminated because of a memory issue.
<Dec 13, 2022 4:06:29 PM HKT> <SEVERE> <domain> <m3> <Unexpected error while monitoring server>
java.io.IOException: Cannot allocate memory
We expected the container to restart because of this, but it didn't. On further checking, we found that the liveness probe is not returning any error, which is why the container was not restarted.
[oracle@domain-m3 scripts]$ bash livenessProbe.sh
[oracle@domain-m3 scripts]$ $?
bash: 0: command not found
[oracle@domain-m3 scripts]$
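For reference, the `bash: 0: command not found` line above means the script exited with status 0: the shell expanded `$?` to `0` and then tried to run `0` as a command. A clearer way to read the exit status is to capture or echo it right after the command. The sketch below uses two hypothetical stand-in functions (`probe_ok`, `probe_fail`) in place of the real script, purely for illustration:

```shell
#!/bin/bash
# Hypothetical stand-ins for a probe script; not the operator's livenessProbe.sh.
probe_ok()   { return 0; }
probe_fail() { return 1; }

# Capture $? immediately: the very next command overwrites it.
probe_ok
ok_status=$?

# The "|| var=$?" form also works when the shell runs with "set -e".
fail_status=0
probe_fail || fail_status=$?

echo "probe_ok exit status: $ok_status"
echo "probe_fail exit status: $fail_status"
```

Against the real script, the equivalent would be `bash livenessProbe.sh; echo $?`.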
Can someone explain why the liveness probe is behaving this way?
There is a Node Manager process that monitors the health of the WebLogic Server instance. When the server terminated unexpectedly, Node Manager should have updated the state file corresponding to the server. livenessProbe.sh expects the state file to exist ($DOMAIN_HOME/servers//data/nodemanager/.state). It's hard to say exactly why the liveness probe is giving a false result without more information, such as the contents of the state file and any log output from the WebLogic Server and Node Manager.
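As a rough illustration of the kind of state-file check described above, here is a minimal sketch. It is not the operator's actual livenessProbe.sh: `check_state_file` is a hypothetical helper, the `FAILED` pattern and the one-word file contents are assumptions made for the demo, and treating a missing file as a failure is a design choice of this sketch rather than documented behavior.

```shell
#!/bin/bash
# Sketch only. check_state_file is a hypothetical helper, and the failure
# pattern below is an assumption, not taken from the operator's script.

check_state_file() {
  local state_file="$1"
  local failure_pattern="${2:-FAILED}"   # assumed marker for a failed server

  # A missing state file is reported as unhealthy rather than silently passing.
  if [ ! -f "$state_file" ]; then
    echo "state file not found: $state_file" >&2
    return 1
  fi

  # Any line matching the failure pattern marks the server as unhealthy.
  if grep -q "$failure_pattern" "$state_file"; then
    echo "server state indicates failure: $(cat "$state_file")" >&2
    return 1
  fi

  return 0
}

# Demo with throwaway files; the contents are made up for illustration.
tmp_dir=$(mktemp -d)
echo "RUNNING" > "$tmp_dir/ok.state"
echo "FAILED_NOT_RESTARTABLE" > "$tmp_dir/bad.state"

healthy=0; check_state_file "$tmp_dir/ok.state" || healthy=$?
failed=0;  check_state_file "$tmp_dir/bad.state" 2>/dev/null || failed=$?
missing=0; check_state_file "$tmp_dir/none.state" 2>/dev/null || missing=$?

echo "healthy=$healthy failed=$failed missing=$missing"
rm -rf "$tmp_dir"
```

A check like this only reflects what Node Manager last wrote. If the state file was never updated after the process died, the probe would still pass, which is why the file contents and the Node Manager logs are needed to diagnose the behavior reported here.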
@jkramplify The latest WebLogic Kubernetes Operator release is 3.4.4. You may want to see if the latest release contains a fix that resolves your issue. Our support statement indicates that we only support the latest minor release of a major line, e.g. 3.4.x.
@lennyphan Thank you for the reply. At this point we will be trying 4.0.0 in a dev environment.