erikdarlingdata/DarlingData

sp_QuickieStore: Add the ability to sort by total wait time and by plan count.

Closed this issue · 10 comments

Is your feature request related to a problem? Please describe.
At one point in my life, I had quite a good grasp of the official documentation for Query Store. Every since learning, sp_QuickieStore, I've had the luxury of forgetting it. However, among the documented features of Query Store, I can still name a few that aren't in sp_QuickieStore. I recently saw this example in the official docs and this other example and became pretty sure that sp_QuickieStore had no way to replicate it. Let's fix that.

Describe the solution you'd like
My suggestion is to add two more features in to the @sort_order parameter. One for "waits" and one for "plans". I worry that the "waits" sort order would just be a proxy for duration, but I can imagine it coming up if you're filtering for a particular wait and want to see which queries are the worst by that particular wait.

Describe alternatives you've considered
Just leaving it alone might make sense. We already have @sort_order = 'duration' and the @execution_count parameters.

Are you ready to build the code for the feature?
Sure. I imagine this is easy.

IMPORTANT: If you're going to contribute code, please read the contributing guide first.
https://github.com/erikdarlingdata/DarlingData/blob/main/CONTRIBUTING.md
Will do.

There is a wait filter parameter, but the waits column is not sorted by it because it's not really suited for that. I'm open to alternate solutions, but it's not something I find myself wanting to do much. The main use case would probably be around locking, but then you'd only find the queries that got blocked and not the ones doing the blocking.

Before you go and build it, really think about when you'd use it and how it would benefit you in a way that existing sorting doesn't cover.

You mentioned a "plans" sort order, but not what it would actually be ordering by in that case. Can you explain a little more?

There is a wait filter parameter, but the waits column is not sorted by it because it's not really suited for that.

What about just sorting by total wait time?

Before you go and build it, really think about when you'd use it and how it would benefit you in a way that existing sorting doesn't cover.

I must admit, out of everything that I've suggested, these two were the ones that I was most struggling to think of a use for.

You mentioned a "plans" sort order, but not what it would actually be ordering by in that case. Can you explain a little more?

Plan count. Sort by total number of plans per query id, then tidy it up by also sorting by query id. I'm not suggesting aggregating any rows. I'm just suggesting giving a way to see which queries have the most plans.

@ReeceGoding Oh I see, it's in the title but not in the body.

Plan count might be cool, but I don't know if I'd go with the query id/plan id combo. A lot of the problems I see with a lack of parameterization result in the "same" query getting a lot of different query AND plan ids, which makes it difficult to aggregate across those. There are other hashes stored in Query Store that might tell a better story, but I'm not sure if it's the one you're setting out to write for this.

@erikdarlingdata There is a lot in the Query Store documentation about trying to find if you have an ad-hoc workload, but I don't think that Query Store is the right tool for handling it. If you have an ad-hoc workload, then you can't afford to use query_capture_mode ALL, so you won't have the ad-hoc workload's data in Query Store.

However... It sounds like I'm wrong! What column in the Query Store DMVs do I need to read up on? I suspect your idea is better than mine.

@ReeceGoding sort of like the plan cache, query_store_plan has query_plan_hash and query_store_query has query_hash

I haven't tested it, and I don't have good production data to do it with, but you could try something like:

  • Look for query plan hashes with multiple plan ids (and/or query ids)
  • Look for query hashes with multiple query ids

I'm not sure where I'd go from there. It would likely depend on what I found.

@erikdarlingdata I've taken your advice. These are my observations from an instance with a lot of non-parameterised queries:

  • Grouping by query_id is useless for detecting lack of parameterisation. "Identical" queries that differ only in what parameter they had embedded in (rather than being correctly parameterised) have different query_ids. I think the feature is still useful, but not for detecting parameterisation.
  • query_plan_hash probably doesn't do what we want. We want to group by query, not by plan.
  • query_hash finds a handful of different query_plan_hashes but loads of different query_ids and plan_ids. I think it's what we want.

It's funny what you find once you group by query_hash. I have one with 7 plan_ids, 4 query_plan_hashes, and 2 query_ids. It's hard to tell which of those numbers is the most useful. Which answer would we rather give?

  1. This group of queries has 7 different plans (plan_ids)
  2. This group of queries has 4 meaningfully different plans (query_plan_hashes)
  3. This group of queries has 2 members (query_ids).

I can see value in all of these facts, but we're not here to aggregate or change the output of sp_QuickieStore. What we want is just a sort order. If we sort by total number of query_plan_hashes per query_hash (and maybe break ties by query_hash, just so we don't mix up different hashes that happen to have the same sorting position), then the top of our output will be the stats for the practically identical queries with a lot of meaningfully different plans. I think that's useful?

Got super close to getting it working in this branch. I think that I've forgotten to break ties by query hash when several queries have the same number of plan hashes per query hash. I'm out of time for now and it will probably be a few days before I can get back to it, but you may enjoy playing around with the almost-working feature in the meantime.

I must say, adding in a new @sort_order was far harder than I expected. The current design is built around the assumption that the column that you want to sort by is in query_store_runtime_stats.

@erikdarlingdata I'm having a look at the wait stats part of this. Do you still have plans for the documentation of sp_QuickieStore? I recall you saying at some point that it's not worth touching the docs until you're done with them. I'm asking, because I've just noticed that we seem to only show the top five waits, but I don't recall seeing that anywhere in the readme or @help = 1.