Estimated Costs of All Queries

One question constantly comes up; What should the Cost Threshold for Parallelism be? The default value of 5 is pretty universally denigrated (well, not by Microsoft, but by most everyone else). However, what value should you set yours to?

What Do Your Plans Cost?

I have a question right back at you. What do your plans currently cost? Let’s say, for argument’s sake, that all your plans have an estimated cost (and all plan costs are estimates, let’s please keep that in mind, even on Actual plans) value of 3 or less. Do you need to adjust the cost threshold in this case? Probably not. But the key is, how do you look at the costs for your plans? Unfortunately, there isn’t a property in a DMV that shows this value. Instead, we have to query the XML:

WITH XMLNAMESPACES (
                      DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
                   )
, TextPlans
AS (SELECT CAST(detqp.query_plan AS XML) AS QueryPlan,
           detqp.dbid
    FROM sys.dm_exec_query_stats AS deqs
    CROSS APPLY sys.dm_exec_text_query_plan(
                                              deqs.plan_handle,
                                              deqs.statement_start_offset,
                                              deqs.statement_end_offset
                                           ) AS detqp
   ),
  QueryPlans
AS (SELECT RelOp.pln.value(N'@EstimatedTotalSubtreeCost', N'float') AS EstimatedCost,
           RelOp.pln.value(N'@NodeId', N'integer') AS NodeId,
           tp.dbid,
           tp.QueryPlan
    FROM TextPlans AS tp
    CROSS APPLY tp.queryplan.nodes(N'//RelOp')RelOp(pln)
   )
SELECT qp.EstimatedCost
FROM QueryPlans AS qp
WHERE qp.NodeId = 0;

You can add other fields if you need to, but this query will return all the costs from the execution plans currently in cache.

Query Store

Does Query Store make this any easier? Not really. There isn’t a property that just shows the cost of the plans, so once again, we’re forced to query the xml:

WITH XMLNAMESPACES (
                      DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
                   )
, QueryStore
AS (SELECT CAST(qsp.query_plan AS XML) AS QueryPlan
    FROM sys.query_store_plan AS qsp
   ),
  QueryPlans
AS (SELECT RelOp.pln.value(N'@EstimatedTotalSubtreeCost', N'float') AS EstimatedCost,
           RelOp.pln.value(N'@NodeId', N'integer') AS NodeId,
           qs.QueryPlan
    FROM QueryStore AS qs
    CROSS APPLY qs.queryplan.nodes(N'//RelOp')RelOp(pln)
   )
SELECT qp.EstimatedCost
FROM QueryPlans AS qp
WHERE qp.NodeId = 0;

Conclusion

With these queries, you can get the values for the costs on your plans. This information can be used to determine how high you make your Cost Threshold for Parallelism setting based on actual knowledge. Just be cautious, this query can be quite expensive. The next step is to take these results and derive an average and a standard deviation. That’s where we’re going with the next blog post, using R to determine Cost Threshold for Parallelism.

12 thoughts on “Estimated Costs of All Queries

  • Hey Grant,

    You ever get this error running the script provided?

    Msg 6335, Level 16, State 102, Line 1
    XML datatype instance has too many levels of nested nodes. Maximum allowed depth is 128 levels.

  • Yep. That means you have a large plan that won’t fit within the XML data type. I’m actually not entirely sure how to eliminate that. It’s why they have the sys.dm_exec_text_query_plan DMF and why the data type for plans in Query Store is text, not XML.

    • Jeff

      Well it’s a kludge, but I got it to work by filtering the top query by the length of the plan text. I.e.

      WHERE LEN(detqp.query_plan) < 380000

      Seems that mine started to fail with this error around the 400,000 mark.

      • Yeah, that’s one way around it. Another mechanism would be to export everything to files and then use PowerShell to run xpath against the files. Not something I’ve tried, but I’m pretty sure it would work.

  • Interesting. But are the quantity of plans really useful for anything without knowing their execution counts? For example, lots of folks have plan cache pollution due to dynamic SQL strings, in which case it’d look like the vast majority of queries have something in common – but in reality, they’ve only been executed once. You might have just a single plan with a cost of, say, 10, and it was executed a bazillion times.

    • Jeff Rosenberg

      Sorry, pasted the wrong thing, the CROSS APPLY should be:
      CROSS APPLY qp.queryplan.nodes(N’//StmtSimple’)RelOp(pln)

      Of course, then calling it RelOp probably doesn’t make sense, but I’ve kept it here just for consistency.

      • Michael McCormick

        Old thread is old, but I had to tell you that your modification DRAMATICALLY reduced the duration and cost of my statistics query. I’m assuming the Cross Apply on RelOp was returning records on every node, and retrieving the Estimated Cost (and, for me, the statement itself) from the single StmtSimple node was much more efficient.

        Many thanks!

  • Jeff Rosenberg

    I’ve been playing around with this, and one thing I’ve noticed is that sometimes the top-level RelOp node has an Id of 1, not 0.

    I think potentially a better way of getting the same data is to use this for the CROSS APPLY:
    CROSS APPLY qs.queryplan.nodes(N’//RelOp’)RelOp(pln)

    … and then this for the SELECT:
    SELECT RelOp.pln.value(N’@StatementSubTreeCost’, N’float’) AS EstimatedCost

Please let me know what you think about this article or any questions:

This site uses Akismet to reduce spam. Learn how your comment data is processed.