clamav issue
Geexirooz opened this issue · 11 comments
Hi @XXXereXXX, perhaps your clamd process got killed due to running out of memory?
From my build of saferwall-box v2.0.0, ClamAV (multiav-pod-clamav) may use 2.3GB of memory during startup:
[vagrant@saferwall-box saferwall]$ podman stats
ID NAME CPU % MEM USAGE / LIMIT MEM % NET IO BLOCK IO PIDS
11853f225dd8 317678c3bbd8-infra -- 946.2kB / 6.226GB 0.02% -- / -- -- / -- 1
15b193537df6 multiav-pod-comodo -- 11.69MB / 367MB 3.19% -- / -- -- / -- 6
17d04ebd7c8f 4f4aa7527eb2-infra -- 819.2kB / 6.226GB 0.01% -- / -- -- / -- 1
24ab309466fd multiav-pod-windefender -- 11.82MB / 471.9MB 2.50% -- / -- -- / -- 6
36778137056f rootless-cni-infra -- 26.98MB / 6.226GB 0.43% -- / -- -- / -- 7
428bbd7b7ab7 aee14c322568-infra -- 856.1kB / 6.226GB 0.01% -- / -- -- / -- 1
4ffade524794 couchbase-pod-couchbase 67.03% 915.6MB / 1.153GB 79.38% -- / -- -- / -- 292
588a03f0f935 nsq-pod-nsqlookup -- 9.167MB / 20.97MB 43.71% -- / -- -- / -- 8
8450c4ec9535 saferwall-pod-backend -- 23.58MB / 314.6MB 7.50% -- / -- -- / -- 7
8c51a591ebf5 5604833ebba1-infra -- 958.5kB / 6.226GB 0.02% -- / -- -- / -- 1
8f56cae2a15b nsq-pod-nsqadmin -- 7.758MB / 20.97MB 36.99% -- / -- -- / -- 8
b4114a27840f multiav-pod-sophos -- 11.7MB / 367MB 3.19% -- / -- -- / -- 5
cb4f4df790d0 saferwall-pod-ui -- 9.433MB / 20.97MB 44.98% -- / -- -- / -- 6
de52bf1845e4 multiav-pod-clamav 99.99% 2.307GB / 2.307GB 100.00% -- / -- -- / -- 8
e616bdfadf26 nsq-pod-nsq 0.39% 10.48MB / 20.97MB 49.98% -- / -- -- / -- 12
e9c76dc751fd minio-pod-minio -- 150.9MB / 524.3MB 28.78% -- / -- -- / -- 9
faf570f12223 saferwall-pod-consumer -- 5.292MB / 2.202GB 0.24% -- / -- -- / -- 8
ffd286ddfbb2 9570a508a29b-infra -- 938kB / 6.226GB 0.02% -- / -- -- / -- 1
Yes, that should be the reason, as @nikAizuddin said.
You can run kubectl describe pod <name_of_clamav_pod> and check whether the error reason is OOMKilled.
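For example, a minimal check (assuming the pod runs in the default namespace; substitute your actual pod name):

# Show the last termination state of the clamav container; "OOMKilled" confirms the memory limit was hit.
kubectl describe pod <name_of_clamav_pod> | grep -A 5 "Last State"
kubectl get pod <name_of_clamav_pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'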
I actually restored the machine to a snapshot and am setting it up again now; I will check the RAM issue and give feedback. However, my VM has 8 GB of RAM, which meets the requirements.
Worth mentioning that when I ran "make kind-up", I got the error below:
"kubectl wait --namespace ingress-nginx
--for=condition=ready pod
--selector=app.kubernetes.io/component=controller
--timeout=90s
error: timed out waiting for the condition on pods/ingress-nginx-controller-55bc59c885-xjkbb
build/mk/kind.mk:15: recipe for target 'kind-deploy-ingress-nginx' failed
make[1]: *** [kind-deploy-ingress-nginx] Error 1
make[1]: Leaving directory '/root/saferwall'
build/mk/kind.mk:30: recipe for target 'kind-up' failed
make: *** [kind-up] Error 2"
And when I ran "make k8s-init-cert-manager", I got the error below:
"# Verify the installation.
kubectl wait --namespace cert-manager
--for=condition=ready pod
--selector=app.kubernetes.io/instance=cert-manager
--timeout=90s
timed out waiting for the condition on pods/cert-manager-cainjector-68b467c55f-hp25c
timed out waiting for the condition on pods/cert-manager-db4ccd57b-p4wht
timed out waiting for the condition on pods/cert-manager-webhook-79c5595878-ttfgx
build/mk/k8s.mk:142: recipe for target 'k8s-init-cert-manager' failed
make: *** [k8s-init-cert-manager] Error 1"
Might these errors be causing the problem?
@LordNoteworthy here is the output of "kubectl describe pod venus-saferwall-multiav-clamav-86f56dcc74-g8g9h":
Name: venus-saferwall-multiav-clamav-86f56dcc74-g8g9h
Namespace: default
Priority: 0
Node: saferwall-control-plane/172.18.0.2
Start Time: Sun, 10 Jan 2021 07:37:55 +0000
Labels: app.kubernetes.io/instance=venus
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=saferwall-multiav-clamav
helm.sh/chart=saferwall-0.0.3
pod-template-hash=86f56dcc74
Annotations: <none>
Status: Running
IP: 10.244.0.42
IPs:
IP: 10.244.0.42
Controlled By: ReplicaSet/venus-saferwall-multiav-clamav-86f56dcc74
Containers:
clamav:
Container ID: containerd://55824f705070c433b704d7b4594eb20822732986514038e1806db505fbb6944d
Image: saferwall/goclamav:latest
Image ID: docker.io/saferwall/goclamav@sha256:a6d6ef4d733b8e9144695aa75ab1d6cb85f1d713dd2f73a8ca8263606031f678
Port: 50051/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 12 Jan 2021 05:57:38 +0000
Finished: Tue, 12 Jan 2021 05:59:38 +0000
Ready: False
Restart Count: 288
Limits:
cpu: 100m
memory: 100Mi
Requests:
cpu: 50m
memory: 50Mi
Environment: <none>
Mounts:
/samples from samples (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ht6zd (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
samples:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: venus-saferwall-samples
ReadOnly: false
default-token-ht6zd:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-ht6zd
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 5m34s (x289 over 45h) kubelet Created container clamav
Warning BackOff 25s (x6522 over 44h) kubelet Back-off restarting failed container
@XXXereXXX A memory limit of 100Mi seems too small to me. Can you try increasing it to 2500Mi?
For my build, I had to set a 2200Mi limit to handle a 32MB max_file_size upload: https://github.com/extra2000/saferwall-box/blob/master/salt/roots/pillar/saferwall.sls.example#L119-L128
Thank you for the support. Could you tell me where I should make the same change (increasing the memory limit) in my environment? I did not find a file like the one you mentioned.
Here is the list of files and directories under the project's parent directory:
api/ CHANGELOG.md configs/ docs/ go.mod LICENSE Makefile README.md search.py ui/ website/
build/ cmd/ deployments/ example.env go.sum linux-amd64/ pkg/ scripts/ test/ web/
Thank you for the support.
You're welcome :)
I think the config for the MultiAV deployments can be found in deployments/saferwall/values.yaml#L777-L783, whose values are mapped into deployments/saferwall/templates/multiav-deployment.yaml.
You can try setting both the memory requests and limits to 2500Mi in deployments/saferwall/values.yaml#L777-L783, but note that this will be applied to all of the AVs.
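As a rough sketch, the resources block in deployments/saferwall/values.yaml might look something like the following (the key names and nesting here are assumptions for illustration; check the actual lines around #L777-L783 for the chart's real structure):

multiav:
  clamav:
    resources:
      requests:
        memory: 2500Mi   # give clamd enough room to load its signature databases
        cpu: 50m
      limits:
        memory: 2500Mi   # hypothetical layout; the chart's real keys may differ
        cpu: 100m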
I modified the file and replaced 100Mi with 2500Mi, but the output of "kubectl describe pod venus-saferwall-multiav-clamav-86f56dcc74-g8g9h" has not changed (the limit is still 100Mi) and the container's error persists. Do I have to restart something?
It is also worth noting that the output of "kubectl describe pod venus-kibana-65f5794dc7-6vfvc" says CPU and RAM are insufficient, although its limits differ from clamav's and are set to 2Gi.
Here it is:
Name: venus-kibana-65f5794dc7-6vfvc
Namespace: default
Priority: 0
Node: <none>
Labels: app=kibana
pod-template-hash=65f5794dc7
release=venus
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/venus-kibana-65f5794dc7
Containers:
kibana:
Image: docker.elastic.co/kibana/kibana:7.9.3
Port: 5601/TCP
Host Port: 0/TCP
Limits:
cpu: 1
memory: 2Gi
Requests:
cpu: 1
memory: 2Gi
Readiness: exec [sh -c #!/usr/bin/env bash -e
# Disable nss cache to avoid filling dentry cache when calling curl
# This is required with Kibana Docker using nss < 3.52
export NSS_SDB_USE_CACHE=no
http () {
local path="${1}"
set -- -XGET -s --fail -L
if [ -n "${ELASTICSEARCH_USERNAME}" ] && [ -n "${ELASTICSEARCH_PASSWORD}" ]; then
set -- "$@" -u "${ELASTICSEARCH_USERNAME}:${ELASTICSEARCH_PASSWORD}"
fi
STATUS=$(curl --output /dev/null --write-out "%{http_code}" -k "$@" "http://localhost:5601${path}")
if [[ "${STATUS}" -eq 200 ]]; then
exit 0
fi
echo "Error: Got HTTP code ${STATUS} but expected a 200"
exit 1
}
http "/app/kibana"
] delay=10s timeout=5s period=10s #success=3 #failure=3
Environment:
ELASTICSEARCH_HOSTS: http://elasticsearch-master:9200
SERVER_HOST: 0.0.0.0
NODE_OPTIONS: --max-old-space-size=1800
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ht6zd (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
default-token-ht6zd:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-ht6zd
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 21s (x55 over 81m) default-scheduler 0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory.
Can anyone help me, please?
Hey @XXXereXXX
Locate the file at deployments/saferwall/values.yaml:
Try to reduce the number of Couchbase cluster nodes to 1 and the NSQ replicas to 1 as well, and, as @nikAizuddin said, keep the multiav memory limits at 2Gi or 3Gi, like here:
https://github.com/saferwall/saferwall/pull/276/files
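Roughly, the replica-count part of that change might look like this in deployments/saferwall/values.yaml (the key names below are illustrative assumptions; the linked PR shows the actual diff, and the memory limits were sketched earlier in the thread):

couchbase:
  cluster:
    servers: 1      # a single Couchbase node instead of a multi-node cluster
nsq:
  replicaCount: 1   # a single NSQ replica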
Once you make these changes, either redeploy the cluster (recommended) or run:
helm upgrade <release_name> <chart_directory>
=> helm upgrade venus deployments/saferwall
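One way to confirm the new limits were picked up after the upgrade (using the label seen in the earlier describe output; adjust if your labels differ):

# The clamav pod gets a new name after the rollout, so select it by label instead.
kubectl describe pod -l app.kubernetes.io/name=saferwall-multiav-clamav | grep -A 2 "Limits"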
Solved, thanks.