saferwall/saferwall

clamav issue

Geexirooz opened this issue · 11 comments

Hello there!
I have just tried to set up your project in my environment and ran into an issue with one of the pods.
The pod is "venus-saferwall-multiav-clamav-86f56dcc74-w5889", and here is the output of the command "kubectl get pods".
Can anyone help me fix this issue?

[Screenshot from 2021-01-09 15-27-57: output of "kubectl get pods"]

Hi @XXXereXXX, perhaps your clamd process got killed due to running out of memory?

From my build of saferwall-box v2.0.0, ClamAV (multiav-pod-clamav) can use 2.3GB of memory during startup:

[vagrant@saferwall-box saferwall]$ podman stats
ID            NAME                     CPU %   MEM USAGE / LIMIT  MEM %    NET IO   BLOCK IO  PIDS
11853f225dd8  317678c3bbd8-infra       --      946.2kB / 6.226GB  0.02%    -- / --  -- / --   1
15b193537df6  multiav-pod-comodo       --      11.69MB / 367MB    3.19%    -- / --  -- / --   6
17d04ebd7c8f  4f4aa7527eb2-infra       --      819.2kB / 6.226GB  0.01%    -- / --  -- / --   1
24ab309466fd  multiav-pod-windefender  --      11.82MB / 471.9MB  2.50%    -- / --  -- / --   6
36778137056f  rootless-cni-infra       --      26.98MB / 6.226GB  0.43%    -- / --  -- / --   7
428bbd7b7ab7  aee14c322568-infra       --      856.1kB / 6.226GB  0.01%    -- / --  -- / --   1
4ffade524794  couchbase-pod-couchbase  67.03%  915.6MB / 1.153GB  79.38%   -- / --  -- / --   292
588a03f0f935  nsq-pod-nsqlookup        --      9.167MB / 20.97MB  43.71%   -- / --  -- / --   8
8450c4ec9535  saferwall-pod-backend    --      23.58MB / 314.6MB  7.50%    -- / --  -- / --   7
8c51a591ebf5  5604833ebba1-infra       --      958.5kB / 6.226GB  0.02%    -- / --  -- / --   1
8f56cae2a15b  nsq-pod-nsqadmin         --      7.758MB / 20.97MB  36.99%   -- / --  -- / --   8
b4114a27840f  multiav-pod-sophos       --      11.7MB / 367MB     3.19%    -- / --  -- / --   5
cb4f4df790d0  saferwall-pod-ui         --      9.433MB / 20.97MB  44.98%   -- / --  -- / --   6
de52bf1845e4  multiav-pod-clamav       99.99%  2.307GB / 2.307GB  100.00%  -- / --  -- / --   8
e616bdfadf26  nsq-pod-nsq              0.39%   10.48MB / 20.97MB  49.98%   -- / --  -- / --   12
e9c76dc751fd  minio-pod-minio          --      150.9MB / 524.3MB  28.78%   -- / --  -- / --   9
faf570f12223  saferwall-pod-consumer   --      5.292MB / 2.202GB  0.24%    -- / --  -- / --   8
ffd286ddfbb2  9570a508a29b-infra       --      938kB / 6.226GB    0.02%    -- / --  -- / --   1

Yes, that should be the reason as @nikAizuddin said.

You can run kubectl describe pod <name_of_clamav_pod> and check whether the termination reason is OOMKilled.
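
A minimal sketch of that check (using the same <name_of_clamav_pod> placeholder; the jsonpath assumes clamav is the only container in the pod):

kubectl describe pod <name_of_clamav_pod> | grep -A 5 'Last State'

# or query the last termination reason directly
kubectl get pod <name_of_clamav_pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'

If that prints OOMKilled, the container was killed for exceeding its memory limit.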

I actually restored the machine to a snapshot and am setting it up again now; I will check the RAM issue and give feedback. However, my VM has 8 GB of RAM, which meets the requirements.

Worth mentioning that when I ran "make kind-up", I got the error below:

"kubectl wait --namespace ingress-nginx
--for=condition=ready pod
--selector=app.kubernetes.io/component=controller
--timeout=90s
error: timed out waiting for the condition on pods/ingress-nginx-controller-55bc59c885-xjkbb
build/mk/kind.mk:15: recipe for target 'kind-deploy-ingress-nginx' failed
make[1]: *** [kind-deploy-ingress-nginx] Error 1
make[1]: Leaving directory '/root/saferwall'
build/mk/kind.mk:30: recipe for target 'kind-up' failed
make: *** [kind-up] Error 2"

And when I ran "make k8s-init-cert-manager", I got the error below:

"# Verify the installation.
kubectl wait --namespace cert-manager
--for=condition=ready pod
--selector=app.kubernetes.io/instance=cert-manager
--timeout=90s
timed out waiting for the condition on pods/cert-manager-cainjector-68b467c55f-hp25c
timed out waiting for the condition on pods/cert-manager-db4ccd57b-p4wht
timed out waiting for the condition on pods/cert-manager-webhook-79c5595878-ttfgx
build/mk/k8s.mk:142: recipe for target 'k8s-init-cert-manager' failed
make: *** [k8s-init-cert-manager] Error 1"

Could these errors be causing the problem?

@LordNoteworthy here is the output of "kubectl describe pod venus-saferwall-multiav-clamav-86f56dcc74-g8g9h" :

Name:         venus-saferwall-multiav-clamav-86f56dcc74-g8g9h
Namespace:    default
Priority:     0
Node:         saferwall-control-plane/172.18.0.2
Start Time:   Sun, 10 Jan 2021 07:37:55 +0000
Labels:       app.kubernetes.io/instance=venus
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=saferwall-multiav-clamav
              helm.sh/chart=saferwall-0.0.3
              pod-template-hash=86f56dcc74
Annotations:  <none>
Status:       Running
IP:           10.244.0.42
IPs:
  IP:           10.244.0.42
Controlled By:  ReplicaSet/venus-saferwall-multiav-clamav-86f56dcc74
Containers:
  clamav:
    Container ID:   containerd://55824f705070c433b704d7b4594eb20822732986514038e1806db505fbb6944d
    Image:          saferwall/goclamav:latest
    Image ID:       docker.io/saferwall/goclamav@sha256:a6d6ef4d733b8e9144695aa75ab1d6cb85f1d713dd2f73a8ca8263606031f678
    Port:           50051/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 12 Jan 2021 05:57:38 +0000
      Finished:     Tue, 12 Jan 2021 05:59:38 +0000
    Ready:          False
    Restart Count:  288
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:        50m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /samples from samples (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-ht6zd (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  samples:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  venus-saferwall-samples
    ReadOnly:   false
  default-token-ht6zd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-ht6zd
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Normal   Created  5m34s (x289 over 45h)  kubelet  Created container clamav
  Warning  BackOff  25s (x6522 over 44h)   kubelet  Back-off restarting failed container

@XXXereXXX A memory limit of 100Mi seems too small to me. Can you try increasing it to 2500Mi?

For my build, I have to set a 2200Mi limit for a 32MB max_file_size upload: https://github.com/extra2000/saferwall-box/blob/master/salt/roots/pillar/saferwall.sls.example#L119-L128

Thank you for the support. Would you tell me where I should do the same thing (change the memory limit) in my environment?
I could not find a file like the one you mentioned.
Here is the list of files and directories under the project's parent directory:

api/    CHANGELOG.md  configs/      docs/         go.mod  LICENSE      Makefile  README.md  search.py  ui/   website/
build/  cmd/           deployments/  example.env  go.sum  linux-amd64/  pkg/       scripts/    test/       web/

Thank you for the support.

You're welcome :)

I think the configuration for the MultiAV deployments can be found here: deployments/saferwall/values.yaml#L777-L783, which maps the values into deployments/saferwall/templates/multiav-deployment.yaml.

You can try setting both memory requests and limits to 2500Mi in deployments/saferwall/values.yaml#L777-L783, but note that this will apply to all the AVs.
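
As a rough illustration only (the key names below are assumptions, not the actual keys from values.yaml#L777-L783, so match them against what the chart really uses), the block should look something like this:

multiav:
  resources:
    requests:
      memory: 2500Mi   # was 50Mi per the describe output above
    limits:
      memory: 2500Mi   # was 100Mi per the describe output above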

I modified the file and replaced 100Mi with 2500Mi, but the output of the command "kubectl describe pod venus-saferwall-multiav-clamav-86f56dcc74-g8g9h" has not changed (limit=100Mi) and the container's error still exists. Do I have to restart something?

It is worth noting that the output of the command "kubectl describe pod venus-kibana-65f5794dc7-6vfvc" says CPU and RAM are insufficient, even though its limits are different from clamav's (2Gi).
Here it is:

Name:           venus-kibana-65f5794dc7-6vfvc
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=kibana
                pod-template-hash=65f5794dc7
                release=venus
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/venus-kibana-65f5794dc7
Containers:
  kibana:
    Image:      docker.elastic.co/kibana/kibana:7.9.3
    Port:       5601/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     1
      memory:  2Gi
    Requests:
      cpu:      1
      memory:   2Gi
    Readiness:  exec [sh -c #!/usr/bin/env bash -e

# Disable nss cache to avoid filling dentry cache when calling curl
# This is required with Kibana Docker using nss < 3.52
export NSS_SDB_USE_CACHE=no

http () {
    local path="${1}"
    set -- -XGET -s --fail -L

    if [ -n "${ELASTICSEARCH_USERNAME}" ] && [ -n "${ELASTICSEARCH_PASSWORD}" ]; then
      set -- "$@" -u "${ELASTICSEARCH_USERNAME}:${ELASTICSEARCH_PASSWORD}"
    fi

    STATUS=$(curl --output /dev/null --write-out "%{http_code}" -k "$@" "http://localhost:5601${path}")
    if [[ "${STATUS}" -eq 200 ]]; then
      exit 0
    fi

    echo "Error: Got HTTP code ${STATUS} but expected a 200"
    exit 1
}

http "/app/kibana"
] delay=10s timeout=5s period=10s #success=3 #failure=3
    Environment:
      ELASTICSEARCH_HOSTS:  http://elasticsearch-master:9200
      SERVER_HOST:          0.0.0.0
      NODE_OPTIONS:         --max-old-space-size=1800
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-ht6zd (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  default-token-ht6zd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-ht6zd
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  21s (x55 over 81m)  default-scheduler  0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory.
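
As a side check, this shows how much of the node's CPU and memory has already been requested (a sketch; the node name saferwall-control-plane is taken from the clamav pod's describe output above):

kubectl describe node saferwall-control-plane | grep -A 10 'Allocated resources'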

Can anyone help me, please?

Hey @XXXereXXX

Locate the file deployments/saferwall/values.yaml:

Try to reduce the number of Couchbase cluster nodes to 1 and the NSQ replicas to 1 as well, and, as @nikAizuddin was saying, keep the multiav memory limits set to 2Gi or 3Gi, like here:

https://github.com/saferwall/saferwall/pull/276/files
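
A rough sketch of the kind of values.yaml changes meant here (the key names are illustrative assumptions; check the actual keys in the file and in that PR):

couchbase:
  cluster:
    servers:
      size: 1          # single Couchbase node instead of a multi-node cluster
nsq:
  replicaCount: 1      # single NSQ replica
multiav:
  resources:
    limits:
      memory: 3Gi      # 2Gi-3Gi as suggested above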

Once you've made these changes, either redeploy the cluster (which is recommended) or run:
helm upgrade <release_name> <chart_directory>, e.g. helm upgrade venus deployments/saferwall
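
For example, a possible sequence (the pod name will change once the deployment restarts, so look it up again):

helm upgrade venus deployments/saferwall

# confirm the new limit actually reached the pod
kubectl get pods | grep clamav
kubectl describe pod <new_clamav_pod_name> | grep -A 3 'Limits'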

Solved, thanks.