Test Documentation Updates
Review and update test documentation in https://github.com/cnti-testcatalog/testsuite/blob/main/docs/TEST_DOCUMENTATION.md
The documentation below has been removed from the Certification repo to reduce duplication of effort. The Test Suite is where tests are documented: what each test covers, its rationale, its expected results, and any potential remediation steps.
Because this documentation no longer lives in the Certification repo, please review the text below and incorporate any valuable missing information into the existing test documentation located at https://github.com/cnti-testcatalog/testsuite/blob/main/docs/TEST_DOCUMENTATION.md
Documentation removed from CNTi Certification Repo as of July 2024:
List of Workload Tests
Compatibility, Installability, and Upgradability Category
Increase decrease capacity:
- Added to CNF Certification in v1.0
- Expectation: The number of replicas for a Pod increases, then the number of replicas for a Pod decreases
What's tested: The CNF image or release being tested is scaled up to 3 replicas. After increase_capacity raises the replica count to 3, the count is decreased back to 1.
The increase and decrease capacity tests: The HPA (Horizontal Pod Autoscaler) scales replicas up when CPU, memory, or other configured metrics increase, preventing disruption by balancing utilisation across all of the pods so that more requests can be served.
Decreasing replicas works the same way in reverse: when traffic decreases, the number of replicas is scaled down to the number of pods that can handle the requests.
You can read more about horizontal pod autoscaling to create replicas here and in the K8s scaling cheatsheet.
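The scaling behaviour described above can be illustrated with the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler. This is a minimal sketch only: the function name and the metric values are assumptions for illustration, and the real controller also applies a tolerance and min/max replica bounds that are left out here.

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Replica count from the documented HPA formula:
    ceil(current_replicas * current_metric / target_metric).
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)

# Load rises: 1 replica at 150% average CPU against a 50% target.
scaled_up = desired_replicas(1, 150.0, 50.0)
# Load falls: 3 replicas at 10% average CPU against a 50% target.
scaled_down = desired_replicas(3, 10.0, 50.0)
print(scaled_up, scaled_down)
```

With these illustrative numbers the replica count goes from 1 to 3 and back to 1, which mirrors the increase and decrease steps of the test.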
Helm chart published
- Added to CNF Certification in v1.0
- Expectation: Helm chart is published
What's tested: Checks if a Helm chart is published
Helm chart valid
- Added to CNF Certification in v1.0
- Expectation: Helm chart is valid
What's tested: This runs helm lint against the Helm chart being tested. You can read more about the helm lint command at helm.sh
Helm deploy
- Added to CNF Certification in v1.0
- Expectation: Helm deploy is successful
What's tested: This checks if the CNF was deployed using Helm
Rollback
- Added to CNF Certification in v1.0
- Expectation: CNF rollback is successful
What's tested: To check if a CNF version can be rolled back
CNI compatible
- Added to CNF Certification in v1.0
- Expectation: CNF should be compatible with multiple and different CNIs
What's tested: This installs temporary kind clusters and will test the CNF against both Calico and Cilium CNIs.
Microservice Category
Reasonable image size
- Added to CNF Certification in v1.0
- Expectation: CNF image size is under 5 GB
What's tested: Checks the size of the image used.
Reasonable startup time
- Added to CNF Certification in v1.0
- Expectation: CNF starts up under 30 seconds
What's tested: This counts how many seconds it takes for the CNF to start up.
Single process type in one container
- Added to CNF Certification in v1.0
- Expectation: CNF container has one process type
What's tested: This verifies that there is only one process type within one container. Child processes do not count against this. For example, nginx or httpd could have a parent process and 10 child processes and still pass, but if both nginx and httpd were running in the same container, this test would fail.
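The nginx/httpd example above can be sketched as follows. This is not the test suite's implementation: the `Proc` record and both helper names are hypothetical, and the sketch assumes that a "process type" can be approximated by the command name.

```python
from typing import Iterable, NamedTuple, Set

class Proc(NamedTuple):
    """Hypothetical process record: PID, parent PID, command name."""
    pid: int
    ppid: int
    name: str

def process_types(procs: Iterable[Proc]) -> Set[str]:
    # Children that share the parent's command name collapse into one type.
    return {p.name for p in procs}

def has_single_process_type(procs: Iterable[Proc]) -> bool:
    return len(process_types(procs)) <= 1

# One nginx parent with 10 nginx children: a single process type.
nginx_only = [Proc(1, 0, "nginx")] + [Proc(pid, 1, "nginx") for pid in range(2, 12)]
# Adding httpd to the same container introduces a second type.
mixed = nginx_only + [Proc(12, 1, "httpd")]

print(has_single_process_type(nginx_only), has_single_process_type(mixed))
```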
Service discovery
- Added to CNF Certification in v1.0
- Expectation: CNFs should not expose their containers as a service
What's tested: This tests and checks if a container for the CNF has services exposed. Application access for microservices within a cluster should be exposed via a Service. Read more about K8s Service here.
Shared database
- Added to CNF Certification in v1.0
- Expectation: Multiple microservices should not share the same database.
What's tested: This tests if multiple CNFs are using the same database.
SIGTERM Handled
- Added to CNTi Certification in v2.0-beta
- Expectation: SIGTERM is handled by PID 1 process of containers.
- ID: sig_term_handled
What's tested: This tests if the PID 1 process of containers handles SIGTERM.
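For context, a process running as PID 1 in a container only receives signals for which it has installed a handler, so SIGTERM is effectively ignored unless it is handled explicitly. The sketch below shows one way a Python entrypoint might do that; the flag and handler names are illustrative, and the self-sent signal only simulates what the container runtime does on shutdown.

```python
import os
import signal

shutdown_requested = False

def handle_sigterm(signum, frame):
    """Record the request so the main loop can shut down cleanly."""
    global shutdown_requested
    shutdown_requested = True

# Install the handler; without it, a PID 1 process would not react to SIGTERM.
signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate the runtime stopping the container by signalling ourselves.
os.kill(os.getpid(), signal.SIGTERM)

print("shutdown requested:", shutdown_requested)
```

A real entrypoint would check the flag in its main loop, finish in-flight work, and exit before the termination grace period ends.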
Specialized Init System
- Added to CNTi Certification in v2.0-beta
- Expectation: Container images should use specialized init systems for containers.
- ID: specialized_init_system
What's tested: This tests if containers in pods have dumb-init, tini or s6-overlay as init processes.
Zombie Handled
- Added to CNTi Certification in v2.0-beta
- Expectation: Zombie processes are handled/reaped by PID 1 process of containers.
- ID: zombie_handled
What's tested: This tests if the PID 1 process of containers handles/reaps zombie processes.
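Reaping means that PID 1 collects the exit status of terminated children so they do not linger as zombies. A minimal sketch of a non-blocking reap loop is shown below; the function name is hypothetical, and init systems such as tini do this in their own way.

```python
import os
from typing import List

def reap_children() -> List[int]:
    """Collect every already-terminated child without blocking.

    Returns the PIDs that were reaped.
    """
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped

print(reap_children())
```

A PID 1 process would typically call a function like this from a SIGCHLD handler or periodically from its main loop.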
State Category
Node drain
- Added to CNF Certification in v1.0
- Expectation: A node will be drained and the CNF's workloads rescheduled onto other available node(s).
What's tested: A node is drained and its workloads are rescheduled to another node, passing with a liveness and readiness check. This will skip when the cluster only has a single node.
No local volume configuration
- Added to CNF Certification in v1.0
- Expectation: Local storage should not be used or configured.
What's tested: This tests if local volumes are being used for the CNF.
Elastic volumes
- Added to CNF Certification in v1.0
- Expectation: Elastic persistent volumes should be configured for statefulness.
What's tested: This checks for elastic persistent volumes in use by the CNF.
Reliability, Resilience and Availability Category
Pod network latency
- Added to CNF Certification in v1.0
- Expectation: The CNF should continue to function when network latency occurs
What's tested: This experiment causes network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod network OR microservice communication across services in different availability zones/regions etc.
The applications may stall or get corrupted while they wait endlessly for a packet. The experiment limits the impact (blast radius) to only the traffic you want to test by specifying IP addresses or application information. This experiment will help to improve the resilience of your services over time.
Disk fill
- Added to CNF Certification in v1.0
- Expectation: The CNF should continue to function when disk fill occurs
What's tested: Stressing the disk with continuous and heavy IO can, for example, degrade reads and writes by other microservices that use the same shared disk; modern storage solutions for Kubernetes, for example, use the concept of storage pools out of which virtual volumes/devices are carved. Another issue is the amount of scratch space consumed on a node, which leaves no space for newer containers to be scheduled (Kubernetes eventually gives up and applies an "eviction" taint such as "disk-pressure") and causes a wholesale movement of all pods to other nodes. Similarly, with CPU chaos, injecting a rogue process into a target container starves the main microservice process (typically PID 1) of the resources allocated to it (where limits are defined), slowing application traffic; where limits are not defined, unrestrained use can exhaust the node's resources and lead to the eviction of all pods. This category of chaos experiment helps build the application's immunity to such stress scenarios.
Pod delete
- Added to CNF Certification in v1.0
- Expectation: The CNF should continue to function when pod delete occurs
What's tested: This experiment simulates forced/graceful pod failure on specific or random replicas of an application resource, then checks the deployment sanity (replica availability and uninterrupted service) and the recovery workflow of the application.
Memory hog
- Added to CNF Certification in v1.0
- Expectation: The CNF should continue to function when pod memory hog occurs
What's tested: The pod-memory hog experiment launches a stress process within the target container - which can cause either the primary process in the container to be resource constrained in cases where the limits are enforced OR eat up available system memory on the node in cases where the limits are not specified.
IO Stress
- Added to CNF Certification in v1.0
- Expectation: The CNF should continue to function when pod io stress occurs
What's tested: This test stresses the disk with continuous and heavy IO to cause degradation in reads/writes by other microservices that use this shared disk.
Network corruption
- Added to CNF Certification in v1.0
- Expectation: The CNF should continue to function when pod network corruption occurs
What's tested: This test uses the LitmusChaos pod_network_corruption experiment.
Network duplication
- Added to CNF Certification in v1.0
- Expectation: The CNF should continue to function when pod network duplication occurs
What's tested: This test uses the LitmusChaos pod_network_duplication experiment.
Helm chart liveness
- Added to CNF Certification in v1.0
- Expectation: A liveness probe should be found in the CNF cluster
What's tested: This test checks for livenessProbe in the resource and container
Helm chart readiness
- Added to CNF Certification in v1.0
- Expectation: A readiness probe should be found in the CNF cluster
What's tested: This test checks for readinessProbe in the resource and container
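The liveness and readiness checks above both come down to looking for a probe field on each container. A rough sketch over a pod spec loaded as a Python dict follows; the helper name and the example spec are assumptions for illustration, not the test suite's code.

```python
from typing import List

def containers_missing_probe(pod_spec: dict, probe: str) -> List[str]:
    """Names of containers that do not define the given probe field."""
    return [
        c.get("name", "<unnamed>")
        for c in pod_spec.get("containers", [])
        if not c.get(probe)
    ]

# Illustrative pod spec: the sidecar has no readinessProbe.
pod_spec = {
    "containers": [
        {
            "name": "web",
            "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
            "readinessProbe": {"tcpSocket": {"port": 8080}},
        },
        {
            "name": "sidecar",
            "livenessProbe": {"exec": {"command": ["true"]}},
        },
    ]
}

missing_liveness = containers_missing_probe(pod_spec, "livenessProbe")
missing_readiness = containers_missing_probe(pod_spec, "readinessProbe")
print(missing_liveness, missing_readiness)
```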
Observability and Diagnostic Category
Use stdout/stderr for logs
- Added to CNF Certification in v1.0
- Expectation: Resource output logs should be sent to STDOUT/STDERR
What's tested: This checks and verifies that STDOUT/STDERR is configured for logging.
For example, running kubectl logs returns useful information for diagnosing or troubleshooting issues.
Prometheus installed
- Added to CNF Certification in v1.0
- Expectation: Prometheus is being used for the cluster and CNF for metrics.
What's tested: Tests for the presence of Prometheus or if the CNF emits Prometheus traffic.
Fluentd logs
- Added to CNF Certification in v1.0
- Expectation: Fluentd is capturing logs.
What's tested: Checks for fluentd presence and if logs are being captured for fluentd.
OpenMetrics compatible
- Added to CNF Certification in v1.0
- Expectation: CNF should emit OpenMetrics compatible traffic.
What's tested: Checks if OpenMetrics is being used and/or the CNF's metrics are OpenMetrics compatible.
Jaeger tracing
- Added to CNF Certification in v1.0
- Expectation: The CNF should use tracing
What's tested: Checks if Jaeger is configured and tracing is being used.
Security Category
Container socket mounts
- Added to CNF Certification in v1.0
- Expectation: Container engine daemon sockets should not be mounted as volumes
What's tested: This test uses the Kyverno policy called Disallow CRI socket mounts
Sysctls test
- Added to CNF Certification in v1.0
- Expectation: TBD
What's tested: TBD
External IPs
- Added to CNF Certification in v1.0
- Expectation: A CNF should not run services with external IPs
What's tested: Checks if the CNF has services with external IPs configured
Privilege escalation
- Added to CNF Certification in v1.0
- Expectation: Containers should not allow for privilege escalation
What's tested: Checks that the allowPrivilegeEscalation field in the securityContext of each container is set to false.
See more at ARMO-C0016
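As a rough illustration of this check, the sketch below treats a container as passing only when allowPrivilegeEscalation is explicitly false, as described above. The function name and the example containers are hypothetical; Kubescape's own rule is the authoritative implementation.

```python
def escalation_disabled(container: dict) -> bool:
    """True only when allowPrivilegeEscalation is explicitly set to false."""
    security_context = container.get("securityContext") or {}
    return security_context.get("allowPrivilegeEscalation") is False

# Illustrative containers: only the first one sets the field to false.
containers = [
    {"name": "hardened", "securityContext": {"allowPrivilegeEscalation": False}},
    {"name": "unset", "securityContext": {}},
    {"name": "explicit-true", "securityContext": {"allowPrivilegeEscalation": True}},
]

failing = [c["name"] for c in containers if not escalation_disabled(c)]
print(failing)
```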
Symlink file system
- Added to CNF Certification in v1.0
- Expectation: No containers allow a symlink attack
What's tested:
This control checks the vulnerable versions and the actual usage of the subPath feature in all Pods in the cluster.
See more at ARMO-C0058
Application credentials
- Added to CNF Certification in v1.0
- Expectation: Application credentials should not be found in configuration files
What's tested:
Check if the pod has sensitive information in environment variables, by using list of known sensitive key names. Check if there are configmaps with sensitive information.
See more at ARMO-C0012
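The key-name matching described above might look roughly like the sketch below. The list of hints is an assumption made up for illustration and is not Kubescape's actual list; the sketch also assumes that only plaintext `value` entries are of interest, so entries using `valueFrom` are skipped.

```python
from typing import List

# Hypothetical hints; the real control uses its own list of sensitive key names.
SENSITIVE_KEY_HINTS = ("password", "passwd", "secret", "token", "apikey", "api_key")

def sensitive_env_names(container: dict) -> List[str]:
    """Names of env vars with a plaintext value and a sensitive-looking name."""
    names = []
    for env in container.get("env", []):
        name = env.get("name", "")
        looks_sensitive = any(hint in name.lower() for hint in SENSITIVE_KEY_HINTS)
        # Assumption: valueFrom (e.g. secretKeyRef) carries no plaintext value.
        if "value" in env and looks_sensitive:
            names.append(name)
    return names

container = {
    "name": "app",
    "env": [
        {"name": "DB_PASSWORD", "value": "example-only"},
        {"name": "API_TOKEN", "valueFrom": {"secretKeyRef": {"name": "api", "key": "token"}}},
        {"name": "LOG_LEVEL", "value": "info"},
    ],
}

flagged = sensitive_env_names(container)
print(flagged)
```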
Host network
- Added to CNF Certification in v1.0
- Expectation: Pods should not have access to the host system's network.
What's tested: Checks if there is a host network attached to a pod. See more at ARMO-C0041
Service account mapping
- Added to CNF Certification in v1.0
- Expectation: The automatic mapping of service account tokens should be disabled.
What's tested: Check if service accounts are automatically mapped. See more at ARMO-C0034.
Ingress and Egress blocked
- Added to CNF Certification in v1.0
- Expectation: Ingress and Egress traffic should be blocked on Pods.
What's tested: Checks Ingress and Egress traffic policy
Privileged containers, Kubescape
- Added to CNF Certification in v1.0
- Expectation: Containers should not run in privileged mode
What's tested: Check in POD spec if securityContext.privileged == true. Read more at ARMO-C0057
Insecure capabilities
- Added to CNF Certification in v1.0
- Expectation: Containers should not have insecure capabilities enabled
What's tested: Checks for insecure capabilities. See more at ARMO-C0046
This test checks against a blacklist of insecure capabilities.
Non-root containers
- Added to CNF Certification in v1.0
- Expectation: Containers should run with non-root user with non-root group membership
What's tested: Checks if containers are running with non-root user with non-root group membership. Read more at ARMO-C0013
Host PID/IPC privileges
- Added to CNF Certification in v1.0
- Expectation: Containers should not have hostPID and hostIPC privileges
What's tested: Checks if containers are running with hostPID or hostIPC privileges. Read more at ARMO-C0038
SELinux options
- Added to CNF Certification in v1.0
- Expectation: SELinux options should not be used
What's tested: Checks if CNF resources use custom SELinux options that allow privilege escalation (selinux_options)
Linux hardening
- Added to CNF Certification in v1.0
- Expectation: Security services are being used to harden application
What's tested: Checks if security services are being used to harden the application. Read more at ARMO-C0055
CPU Limits
- Added to CNF Certification in v2.0-beta
- Expectation: Containers should have CPU limits defined
What's tested:
Check for each container if there is a ‘limits.cpu’ field defined. Check for each limitrange/resourcequota if there is a max/hard field defined, respectively. Read more at ARMO-C0270.
Memory Limits
- Added to CNF Certification in v2.0-beta
- Expectation: Containers should have memory limits defined
What's tested:
Check for each container if there is a ‘limits.memory’ field defined. Check for each limitrange/resourcequota if there is a max/hard field defined, respectively. Read more at ARMO-C0271.
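The container-level part of the CPU and memory limit checks can be sketched as below. The helper name and example spec are assumptions for illustration; the LimitRange/ResourceQuota part of the check is not covered.

```python
from typing import List

def containers_without_limit(pod_spec: dict, resource: str) -> List[str]:
    """Names of containers with no resources.limits.<resource> entry."""
    missing = []
    for container in pod_spec.get("containers", []):
        limits = (container.get("resources") or {}).get("limits") or {}
        if resource not in limits:
            missing.append(container.get("name", "<unnamed>"))
    return missing

# Illustrative pod spec: the worker defines a memory limit but no CPU limit.
pod_spec = {
    "containers": [
        {"name": "app", "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
        {"name": "worker", "resources": {"limits": {"memory": "128Mi"}}},
    ]
}

missing_cpu = containers_without_limit(pod_spec, "cpu")
missing_memory = containers_without_limit(pod_spec, "memory")
print(missing_cpu, missing_memory)
```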
Immutable File Systems
- Added to CNF Certification in v1.0
- Expectation: Containers should have immutable file system
What's tested:
Checks whether the readOnlyRootFilesystem field in the SecurityContext is set to true. Read more at ARMO-C0017
HostPath Mounts
- Added to CNF Certification in v1.0
- Expectation: Containers should not have hostPath mounts
What's tested: TBD
Read more at ARMO-C0045
Default namespaces
- Added to CNF Certification in v1.0
- Expectation: Resources of the CNF should not be in the default namespace
What's tested: TBD
Configuration Category
Latest tag
- Added to CNF Certification in v1.0
- Expectation: Checks if a CNF is using 'latest' tag instead of a version.
What's tested: TBD
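Since the test description is still TBD, the following is only a guess at what a latest-tag check could look like, not the test suite's logic. The function name is hypothetical; the sketch assumes that an image reference without a tag resolves to "latest" and that a digest pins the image.

```python
def uses_latest_tag(image: str) -> bool:
    """True when an image reference resolves to the 'latest' tag."""
    # A digest pins the exact image, regardless of any tag.
    if "@" in image:
        return False
    # Only inspect the last path segment so a registry port is not read as a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return True  # no tag given, so 'latest' is implied
    return last_segment.rsplit(":", 1)[1] == "latest"

examples = ["nginx", "nginx:latest", "nginx:1.25.3", "localhost:5000/app"]
results = [uses_latest_tag(image) for image in examples]
print(results)
```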
Require labels
- Added to CNF Certification in v1.0
- Expectation: Checks if pods are using the 'app.kubernetes.io/name' label
What's tested: TBD
nodePort not used
- Added to CNF Certification in v1.0
- Expectation: Checks for configured node ports in the service configuration.
What's tested: TBD
hostPort not used
- Added to CNF Certification in v1.0
- Expectation: Checks for configured host ports in the service configuration.
What's tested: TBD
Hardcoded IP addresses in K8s runtime configuration
- Added to CNF Certification in v1.0
- Expectation: Checks for hardcoded IP addresses or subnet masks in the K8s runtime configuration.
What's tested: TBD
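With the description still TBD, one possible way to spot hardcoded IPv4 addresses or subnet masks in rendered configuration text is sketched below. The helper name and sample text are assumptions for illustration; the approach can report false positives (for example four-part version strings) and does not cover IPv6.

```python
import ipaddress
import re
from typing import List

# Four dot-separated groups of 1 to 3 digits; validated afterwards.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def hardcoded_ipv4(text: str) -> List[str]:
    """IPv4-looking strings in the text that parse as valid addresses."""
    found = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. an octet above 255
        found.append(candidate)
    return found

sample = """
server: 10.0.0.15
image: app:1.2.3
mask: 255.255.255.0
bad: 999.1.1.1
"""

addresses = hardcoded_ipv4(sample)
print(addresses)
```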
Secrets used
- Added to CNF Certification in v1.0
- Expectation: Checks for K8s secrets.
What's tested: TBD
Immutable configmap
- Added to CNF Certification in v1.0
- Expectation: Checks for K8s version and if immutable configmaps are enabled.
What's tested: TBD