must-gather
is a tool built on top of OpenShift must-gather
that provides the scripts for an OpenStack control plane logs and data collection.
oc adm must-gather --image=quay.io/openstack-k8s-operators/openstack-must-gather
The command above will create a local directory where logs, configs and status of the OpenStack control plane services are dumped.
In particular the openstack-must-gather
will get a dump of:
- Service logs: Retrieved by the output of the pods (and operators) associated to the deployed services
- Services config: Retrieved for each component by the deployed
ConfigMaps
andSecrets
- Status of the services deployed in the OpenStack control plane
- Deployed CRs and CRDs
CSVs
,pkgmanifests
,subscriptions
,installplans
,operatorgroup
Pods
,Deployments
,Statefulsets
,ReplicaSets
,Service
,Routes
,ConfigMaps
, (part of / relevant)Secrets
- Network related info (
Metallb
info,IPAddressPool
,L2Advertisements
,NetConfig
,IPSet
) - SOS reports for OpenShift nodes that are running OpenStack service pods.
Some openstack-must-gather collectors can be configured via environmental
variables to behave differently. For example SOS gathering can be disabled
passing an empty SOS_SERVICES
environmental variable.
To provide environmental variables we'll need to invoke the gathering command manually like this:
oc adm must-gather --image=quay.io/openstack-k8s-operators/openstack-must-gather -- SOS_SERVICES= gather
This is the list of available environmental variables:
CONCURRENCY
: Must gather runs many operations, so to speed things up we run them in parallel with a concurrency of 5 by default. Users can change this environmental variable to adjust to its needs.SOS_SERVICES
: Comma separated list of services to gather SOS reports from. Empty string skips sos report gathering. Eg:cinder,glance
. Defaults to all of them.SOS_ONLY_PLUGINS
: List of SOS report plugins to use. Empty string to run them all. Defaults to:block,cifs,crio,devicemapper,devices,iscsi,lvm2, memory,multipath,nfs,nis,nvme,podman,process,processor,selinux,scsi,udev
.SOS_EDPM
: Comma separated list of edpm nodes to gather SOS reports from, empty string skips sos report gathering. Accepts keyword all to gather all nodes. eg:edpm-compute-0,edpm-compute-1
SOS_EDPM_PROFILES
: list of sos report profiles to use. Empty string to run them all. Defaults to:system,storage,virt
You can build the image locally using the Dockerfile included.
A Makefile
is also provided. To use it, you must pass:
- an image name using the variable
MUST_GATHER_IMAGE
. - an image registry using the variable
IMAGE_REGISTRY
(default is quay.io/openstack-k8s-operators) - an image tag using the variable
IMAGE_TAG
(default islatest
)
The targets for make
are as follows:
check-image
: Check if theMUST_GATHER_IMAGE
variable is setbuild
: build the image with the supplied name and pushes itcheck
: Run sanity check against the script collectionpytest
: Run sanity check and unit tests against the python script collectionpodman-build
: builds the must-gather imagepodman-push
: pushes an already-builtmust-gather
image
One possible workflow that can be used for development is to run the openstack must-gather in debug mode:
$ oc adm must-gather --image=quay.io/openstack-k8s-operators/openstack-must-gather -- gather_debug
[must-gather ] OUT Using must-gather plug-in image: quay.io/openstack-k8s-operators/openstack-must-gather:latest
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: 6ffe21e8-7dc9-4719-926f-f34ec33e6916
ClusterVersion: Stable at "4.12.35"
ClusterOperators:
All healthy and stable
[must-gather ] OUT namespace/openshift-must-gather-b9fpw created
[must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-mq6st created
[must-gather ] OUT pod for plug-in image quay.io/openstack-k8s-operators/openstack-must-gather:latest created
[must-gather-mxc7k] POD 2023-09-29T09:56:24.995284210Z Must gather entering debug mode, will sleep until file /tmp/rm-to-finish-gathering is deleted
[must-gather-mxc7k] POD 2023-09-29T09:56:24.995284210Z
Running in debug mode makes the must gather container just sit waiting for a
file to be removed, allowing us to go into the container and test our scripts.
In the above case, where the namespace is openshift-must-gather-b9fpw
and the
pod name is must-gather-mxc7k
we would enter the container with:
$ oc -n openshift-must-gather-b9fpw rsh must-gather-mxc7k
And then if we were debugging a bash script called gather_trigger_gmr
, we
would run it in debug mode:
sh-5.1# bash -x /usr/bin/gather_trigger_gmr
And once we had the script working as intended we would copy the file from a terminal:
$ oc cp openshift-must-gather-b9fpw/must-gather-mxc7k:usr/bin/gather_trigger_gmr collection-scripts/gather_trigger_gmr
And finally, from inside the container shell we let the oc adm must-gather
command complete, optionally running everything to ensure we haven't
inadvertently broken anything in the process:
sh-5.1# gather
sh-5.1# rm /tmp/rm-to-finish-gathering
Besides the generic OpenShift etcd objects that must-gather scripts are currently gathering there's also component specific code that gather other information, so when adding or improving a component we should look into:
-
collection-scripts/common.sh
: Services are in theOSP_SERVICES
variable to indicate that the script must gather its config maps, secrets, additional information, trigger Guru Mediation Reports, etc. -
collection-scripts/gather_services_status
: Runs OpenStack commands to gather additional information. For example for Cinder it gather state of the services, volume types, qos specs, volume transfer requests, summary of existing volumes, pool information, etc. -
collection-scripts/gather_trigger_gmr
: Triggers Guru Meditation Reports on the services so they are present in the logs when those are gathered afterwards.
(optional) Build and push the must-gather image to a registry:
git clone ssh://git@github.com/openstack-k8s-operators/openstack-must-gather.git
cd openstack-must-gather
IMAGE_TAG=<tag> IMAGE_REGISTRY=<registry> MUST_GATHER_IMAGE=openstack-must-gather make build
On a machine where you have oc adm
access, do the following:
oc adm must-gather --image=<registry>/openstack-must-gather:<tag>
When generation is finished, you will find the dump in the current directory
under must-gather.local.XXXXXXXXXX
.
It is possible to gather debugging information about specific features by adding
to the oc adm must-gather
command the --image-stream
argument.
The must-gather tool supports multiple images, so you can gather data about more
than one feature by running a single command.
Adding --image-stream
assumes that an image has been imported in an OpenShift
namespace and the associated imagestream
is available.
To import an image as an imagestream
, run the following commands:
oc project <namespace>
oc import-image <registry>/<image>:<tag> --confirm
and double check the imagestream
exists in the specified namespace.
At this point, run the oc adm must-gather
command passing one or more
imagestream parameters.
For instance, assuming we import the kubevirt must-gather image:
within the existing openstack
namespace:
oc project openstack
oc import-image kubevirt-must-gather --from=quay.io/kubevirt/must-gather:latest --confirm
we can combine, in a single command, three different containers executions that gather different aspects of the same environement.
oc adm must-gather --image-stream=openstack/kubevirt-must-gather \
--image-stream=openshift/must-gather \
--image=quay.io/openstack-k8s-operators/openstack-must-gather:latest
The command above will create three pods associated to the existing imagestream
objects that point to openstack
, openshift
and kubevirt
must-gather container
images.
[must-gather] OUT pod for plug-in image quay.io/openstack-k8s-operators/openstack-must-gather:latest created
[must-gather] OUT pod for plug-in image quay.io/kubevirt/must-gather@sha256:501e30ac7d5b9840a918bb9e5aa830686288ccfeee37d70aaf99cd2e302a2bb0 created
[must-gather] OUT pod for plug-in image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e9601b492cbb375f0a05310efa6025691f8bba6a97667976cd4baf4adf0f244c created
...
...
If the image is pushed to quay.io
registry, make sure it's set to public
,
otherwise it can't be consumed by the must-gather tool.