I consider myself to be the most responsible for making such a huge deal about the differences between what is labeled as an Estimated Plan and an Actual Plan. I walked it back in the second edition of the Execution Plans book. Hugo and I completely debunked the issue in the third edition of the Execution Plans book. That is the one you should all be referencing now. As I like to joke, the guy who wrote the first two editions of the book was an idiot (and lest anyone take offense, let’s be clear, I’m the idiot). Now, I’m trying my best to make this whole issue more clear.
Let’s talk about the “different” plans you can capture in SQL Server.
This is where you have a query, batch, stored procedure, and instead of executing you, you use the button in SQL Server Management Studio (SSMS) labeled “Display Estimated Execution Plan” to generate an execution plan.
This is the classic estimated plan. These are very useful. It’s a quick and safe way to get an execution plan. Except when dealing with recompile events, this plan will always be the same overall shape as an actual plan or any of the “other” execution plans we’re going to talk about here. There may be differences in some of the property values between this plan and other plans. This is because you’re getting row counts and other estimated values at the moment you get this plan (unless, the plan is in cache, in which case, you’re getting the cached plan). However, the literal count of differences between these plans and what we call actual plans is quite small.
Most plans, but not all plans, land in the plan cache in memory. You can access these plans using sys.dm_exec_query_plan or sys.dm_exec_query_plan_text. These are the plans that were used by the query engine to retrieve or modify your data. However, these plans do not have runtime information in them. Therefore, they are estimated plans. But, as was noted earlier, you may see differences, again, primarily in the properties, not in the overall plan shape, between these plans and estimated plans you generate. These differences are dependent on whether or not the plan is in cache when you request an estimated plan. If the plan is in cache, you’ll receive this cached plan. If it’s not in cache, you’ll get a plan generated by the optimizer. The differences in when these are run means differences in statistics values may result in differences in estimated row counts. In fact, you could even see radically different plan shapes. However, it’s because the changes in your data and statistics, not how you generated the plans that matters.
However, at the end of the day, no runtime metrics, estimated plans. Even though these are the plans used by the query engine, they are estimated plans because they do not have runtime metrics.
Query Store Plans
Plans are stored in query store as they are generated and stored in cache as part of the execution of the query. By default, prior to SQL Server 2019, all the plans get captured. Starting in 2019, the default setting is to capture plans that have longer execution times or or are called more frequently. You can also control this more directly in 2019 with a few knobs. For details, check out Tracy Boggiano’s book.
However, the plans that are stored do not have runtime metrics. Therefore, they are estimated plans. Yes, if you compare these to the plans you get by generating an estimated plan or retrieving a plan from cache, you may see differences in the properties. Once more, we’re talking about a timing issue. Unless you have a completely static system (like in a test environment, where nothing is changing, ever), changes in row estimates within statistics is normal and will result in some differences, possibly even major differences, between the plans in Query Store and other types of plans.
I just have to point out once more though, even though these are the plans that were used in the initial execution of the query, the lack of runtime metrics makes them estimated plans.
You have to execute the query to get an actual plan (with the exception of new functionality in SQL Server 2019). This is because actual plans are defined by the runtime metrics; actual row counts, etc.. With these runtime metrics, you get an actual plan. Without these runtime metrics, you’re looking at an estimated plan.
Depending on how and when you capture an actual plan, you may see differences between it and the estimated plans, depending on how and when they were captured. These are affected by time, and therefore data, as well as method. For example, actual plans captured through extended events or trace differ in structure from plans captured through SSMS (or in 2019). No, I don’t know why. Last time I spoke with a Microsoft engineer, they were surprised by this, so I’m not sure they know why yet either. You may also see differences, small or major, because of recompiles (automatic or forced, procedure or batch wide or just the statement). Again, this is because of changes to the data resulting in changes to the plans, not a major difference between what an estimated plan represents and what an actual plan represents.
Even actual plans are just estimated plans with runtime metrics added. This is why the cost estimates are not accurate to the real row counts, giving you a quick indication of the accuracy and efficacy of a given plan, comparing estimates to actuals.
Yes, there are a bunch of tiny wrinkles that can also affect a precise estimated to actual comparison. These are affected by things like plan forcing, plan guides, and more. However, none of them change the core fundamentals. There are plans with runtime metrics and plans without runtime metrics. The plans with runtime metrics are currently called actual plans. The plans without runtime metrics are currently called estimated plans. Maybe, in the future, we can get more precise language for these, but it’s where we’re at currently.