Sep 20 2011

Statistics in Execution Plans

I was presenting on execution plans when another question came up that I didn’t know the answer to immediately. Yes, I know you’ve seen that phrase before on this blog. I love presenting because you get exactly the kinds of questions that make you think and make you learn. I’m presenting, in part, to learn, just as much as I am to teach. It was the same with kenpo. The more I taught, the better I learned the art. Wait, this isn’t supposed to be a blog post about learning. This one is about statistics.

The question was, does the execution plan have the statistics that were used by the optimizer to decide on the execution plan. And no, what was meant, was not does it show the estimated rows, which come from the statistics, but specifically does it show that it used a set of statistics named X? The answer to that is yes & no. Or, to be more DBAish about it, it depends.

Let’s take a really simple query run against a freshly installed copy of AdventureWorks2008R2:

[sourcecode language="sql"]SELECT p.BusinessEntityID,
p.FirstName
FROM Person.Person AS p
WHERE p.FirstName LIKE ‘Toni%';[/sourcecode]

This query generates this execution plan:

Stats

Yes, a very sophisticated and hard to understand execution plan. Now, here’s the deal, there were two sets of statistics used to make this plan, but only one of them can be seen in the plan. See any statistics there? Sure you do. IX_Person_LastName_FirstName_MiddleName. That is one of the two sets of statistics that were used to make this execution plan. Where’s the second set? Not in the execution plan.

[sourcecode language=”sql”]sp_helpstats N’Person.Person’, ‘ALL';[/sourcecode]

The results are here:

statslist

There is the second set of statistics used for this query, right at the top. Because I was searching the FirstName column, the optimizer found that it did not have the statistics it needed, so they were created, on the fly, and then, were not a part of the execution plan. Further, I probably looked at other statistics such as the PK_Person_BusinessEntityID because that’s the clustered index for the table. It could have scanned that to get the list of values just as easily as the other index. But, that other index is probably smaller, which means fewer pages scanned.

So, back to the question, can you see the statistics used by the optimizer inside the execution plan? Some of them, yes, but not all of them.

Please, if I’m presenting, ask questions. I’ll know a few of the answers, right off the top of my head. Others will make me go and learn so that I can answer the next person who asks the same question. I don’t mind losing at a game of Stump the Chump, so let’s play.

15 Comments

  • By Gail Shaw, September 20, 2011 @ 9:01 am

    It’s irritating, because they must be stored somewhere in the internal representation of the plan so that the pre-execution validation checks know what stats need to be checked for updates.

  • By Grant Fritchey, September 20, 2011 @ 9:07 am

    Yeah, no doubt they’re in there somewhere, at least I assume so. Maybe not. I mean if the FirstName stats get updated, it doesn’t cause a recompile… I think. Probably should test that.

  • By Jes Schultz Borland, September 20, 2011 @ 9:25 am

    Cool! I learned something new. I didn’t know about sp_helpstats. I looked it up in BOL, and it says, “This feature will be removed in the next version of Microsoft SQL Server…To obtain information about statistics, query the sys.stats and sys.stats_columns catalog views.” Do you know how you would find the info using those?

  • By Jim Murphy, September 20, 2011 @ 9:31 am

    Ya, an optimizer decision tree log to show which stats were used and decision results would be awesome. I smell a trace flag! I’d love to see a history of ‘found CI PK_My_Index, decided not to use it because of Reason Code x: Covering NCI idx_FirstName found.” “Analyzed NCI idx_FirstName, need FirstName, all columns found.” “Used full Scan due to use of LIKE.”. Or whatever. I know the steps and decisions are far more exhaustive for what the optimizer decides and why, but it would be cool to peer inside.

  • By Grant Fritchey, September 20, 2011 @ 9:56 am

    Yeah, I was just being lazy. Here:
    SELECT *
    FROM sys.stats AS s
    WHERE s.object_id = object_id(‘Person.Person’)

  • By Gail Shaw, September 20, 2011 @ 9:58 am

    Here’s one alternative to sp_helpstats

    SELECT OBJECT_NAME(object_id), name, auto_created, user_created, no_recompute, has_filter, filter_definition, LEFT(list,LEN(list)-1)
    FROM sys.stats AS s
    CROSS APPLY (
    SELECT
    name + ‘, ‘ AS [text()]
    FROM
    sys.stats_columns AS sc INNER JOIN sys.columns AS c ON sc.object_id = c.object_id AND sc.column_id = c.column_id
    WHERE
    s.object_id = sc.object_id AND s.stats_id = sc.stats_id
    ORDER BY
    stats_column_id
    FOR XML PATH(”)
    ) columnlist (list)

  • By James Lean, September 20, 2011 @ 10:04 am

    @Jes I knocked this up quickly, think it looks ok:

    SELECT s.name AS statistics_name,
    STUFF
    (
    (
    SELECT ‘, ‘ + c.name AS [text()]
    FROM sys.stats_columns sc
    JOIN sys.columns c ON sc.object_id = c.object_id
    AND sc.column_id = c.column_id
    WHERE sc.object_id = s.object_id
    AND sc.stats_id = s.stats_id
    ORDER BY sc.stats_column_id
    FOR XML PATH (”)
    ),
    1,
    2,

    ) AS statistics_keys
    FROM sys.stats s
    WHERE s.object_id = OBJECT_ID(‘Person.Person’)
    ORDER BY s.name

  • By Paul White, September 20, 2011 @ 10:12 am

    The trace flags you are after are 9204 (statistics loaded) and/or 9292 (statistics header loaded). You also need 3604 to redirect the output to the Messages window.

    Text output only appears when a compilation or recompilation occurs – so clear the plan from cache first, dump the plan cache completely, or use OPTION (RECOMPILE) – though that may change the plan of course.

    You can also use another undocumented option to enable the trace flags just for the statement, e.g. for stats headers:

    SELECT
    p.BusinessEntityID,
    p.FirstName
    FROM Person.Person AS p
    WHERE
    p.FirstName LIKE ‘Toni%’
    OPTION
    (
    QUERYTRACEON 3604,
    QUERYTRACEON 9292,
    RECOMPILE
    );

    Cheers,

    Paul

  • By Grant Fritchey, September 20, 2011 @ 10:21 am

    OK. So I was being VERY lazy. Nice posts.

  • By Gail Shaw, September 20, 2011 @ 11:56 am

    Only good when you’re running the query, not examining an exec plan from cache.

  • By Grant Fritchey, September 20, 2011 @ 12:06 pm

    yep. What Gail said. Remember the question, can I see the statistics used to create the execution plan in the plan? The answer is still, it depends.

  • By Jim Murphy, September 21, 2011 @ 10:36 am

    Thanks Paul!

    I figured that there must be some undocumented trace flags for this since the SQL Engine/optimizer QA department needs something to ensure the engine behaves properly and decides accurately with each release/Service Pack. You certainly can’t leave that to the promises of the developers, can you? P’scha. I know they are not using trial and error.

    Outstanding find here. Nice work and thanks for reporting. I look forward to playing with this in my lab.

  • By Fabiano Amorim, July 3, 2012 @ 5:18 pm

    Hi Grant, I’ve just blogged about that… there is a way… take a look here…

    http://blogfabiano.com/2012/07/03/statistics-used-in-a-cached-query-plan/

    Regards
    Fabiano Amorim

Other Links to this Post

  1. Statistics in Execution Plans | Database, Performance and Performance Tuning | Syngu — September 21, 2011 @ 1:41 am

RSS feed for comments on this post. TrackBack URI

Leave a comment