cockroachdb/replicator

Add debug.zip functionality for investigations

Currently, investigations require quite a bit of back and forth to get the information the team needs to properly triage and root-cause an issue. The main ask here is to provide a replicator command that gathers a debug.zip containing as much information as can reasonably be dumped:

  • Relevant schema definitions
  • Relevant errors and logs
  • Metrics
  • Etc.

We need to think more about whether this is the right approach, or whether we just need to craft a template for what customers should provide to us, since we know they'll also want to send over metrics charts from replicator dashboards.

Edit 11/15
There is also value here in treating this as a tool that gathers the information needed to identify and diagnose the most common issues. I think this requires that we:

  • Identify the most common issues
  • Define the diagnostic steps as a decision tree (if this, then that)
  • Identify the information needed for each of these problems
  • Codify these into a tool
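
To make the "tree" idea concrete, here is a rough sketch of how the steps above could be codified. Everything in it is an assumption for illustration: the `Node` type, the fact names, and the example diagnoses are placeholders, since the real list of common issues has not been identified yet. Each inner node tests one observed fact, and each leaf names a diagnosis along with the information to collect for it.

```go
package main

// Node is one step in a triage tree. Inner nodes test a fact; leaves
// (Fact == "") carry a diagnosis and the artifacts needed to confirm it.
// Hypothetical type, not part of replicator.
type Node struct {
	Fact      string   // key looked up in the observed facts; empty on a leaf
	Yes, No   *Node    // branches taken when the fact is true / false
	Diagnosis string   // set on leaves only
	Needs     []string // artifacts to gather for this diagnosis
}

// Walk follows the tree using the observed facts and returns the leaf
// it ends on. Facts missing from the map are treated as false.
func Walk(n *Node, facts map[string]bool) *Node {
	for n != nil && n.Fact != "" {
		if facts[n.Fact] {
			n = n.Yes
		} else {
			n = n.No
		}
	}
	return n
}

// exampleTree builds a tiny tree with placeholder issues. The fact names
// and diagnoses are invented for illustration only.
func exampleTree() *Node {
	unknown := &Node{
		Diagnosis: "no known pattern matched",
		Needs:     []string{"logs", "metrics", "schema"},
	}
	return &Node{
		Fact: "errors_in_log",
		Yes: &Node{
			Fact: "schema_mismatch",
			Yes: &Node{
				Diagnosis: "schema drift between source and target",
				Needs:     []string{"schema", "logs"},
			},
			No: unknown,
		},
		No: unknown,
	}
}
```

For example, `Walk(exampleTree(), map[string]bool{"errors_in_log": true, "schema_mismatch": true})` ends on the schema-drift leaf, whose `Needs` list tells the tool which sections to include in the bundle.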

Thoughts

  • Think about what the common issues are
  • Think about the information we need to debug common issues
  • Think about how we dump the data