replicatedhq/troubleshoot

All `In Cluster` collectors should run from within the Kubernetes cluster

diamonwiggins opened this issue · 3 comments

Bug Description

Today, several In Cluster collectors run wherever the preflight or support bundle binary was executed, and don't necessarily run inside a Pod in the cluster. A few examples of collectors that don't run inside a Pod:

https://troubleshoot.sh/docs/collect/http/
https://troubleshoot.sh/docs/collect/postgresql/
https://troubleshoot.sh/docs/collect/mysql/

This can cause an inconsistent experience for users who expect all "in-cluster" collectors to run from within the cluster. See #371.

Expected Behavior

All "In Cluster" collectors should run in a Pod from within the cluster, and all "Host Collectors" should execute directly from a troubleshoot binary.

Steps To Reproduce

Additional Context


The issue is that if the Troubleshoot binary is already running within a cluster, we likely want to use that pod rather than create a new one. Some kind of recursion would be good to have available: if the binary is not already running in-cluster, we run a Troubleshoot pod that runs the collector. That way, things like the SQL client are available in the collector, rather than having to copy a client binary to the pod.
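To make the idea concrete, here is a rough sketch of the "reuse the pod if we're already in-cluster, otherwise launch one" decision. This is not Troubleshoot code: `placement`, `runningInCluster`, and `choosePlacement` are hypothetical names, and checking the `KUBERNETES_SERVICE_HOST` / `KUBERNETES_SERVICE_PORT` environment variables (which Kubernetes injects into containers) is only a heuristic assumed here for detecting a Pod.

```go
package main

import (
	"fmt"
	"os"
)

// placement says where a collector should execute (hypothetical type).
type placement string

const (
	runInProcess placement = "in-process" // already inside a Pod: reuse it
	runInNewPod  placement = "new-pod"    // outside the cluster: launch a Troubleshoot Pod
)

// runningInCluster guesses whether this process is running inside a Pod.
// Assumption: both Kubernetes service variables being set means "in a Pod".
// getenv is passed in so the check can be exercised without touching the real environment.
func runningInCluster(getenv func(string) string) bool {
	return getenv("KUBERNETES_SERVICE_HOST") != "" &&
		getenv("KUBERNETES_SERVICE_PORT") != ""
}

// choosePlacement picks where an in-cluster collector should run.
func choosePlacement(getenv func(string) string) placement {
	if runningInCluster(getenv) {
		return runInProcess
	}
	return runInNewPod
}

func main() {
	// The result depends on the environment the binary is started in.
	fmt.Println(choosePlacement(os.Getenv))
}
```

In the `new-pod` case, the Pod would run the same Troubleshoot binary, which would then take the `in-process` branch. That is the recursion mentioned above, and it stops after one level.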

Internal tracking to decide the best approach here: https://app.shortcut.com/replicated/story/110804/determine-a-path-for-in-cluster-collectors-to-run-in-cluster. There's no sense cutting any code to solve this until we have agreed on a clear path to solving the problem.