Oct 24 2016

A Sub-Query Does Not Hurt Performance

The things you read on the internet, for example, “don’t use a sub-query because that hurts performance.”


Where do people get these things?

Let’s Test It

I’ve written before about the concept of cargo cult data professionals. They see one issue, one time, and consequently extrapolate that to all issues, all the time. It’s the best explanation I have for why someone would suggest that a sub-query is flat out wrong and will hurt performance.

Let me put a caveat up front (which I will reiterate in the conclusion, just so we’re clear), there’s nothing magically good about sub-queries just like there is nothing magically evil about sub-queries. You can absolutely write a sub-query that performs horribly, does horrible things, runs badly, and therefore absolutely screws up your system. Just as you can with any kind of query. I am addressing the bad advice that a sub-query is to be avoided because they will inherently lead to poor performance.

Let’s start with a simple test, just to validate the concept of how a sub-query performs within SQL Server:

SELECT sd.OrderQty,
   pr.Name
FROM
   (SELECT *
    FROM Sales.SalesOrderDetail AS sod
   ) AS sd
   JOIN
   (SELECT *
    FROM Production.Product AS p
   ) AS pr
   ON pr.ProductID = sd.ProductID
WHERE sd.SalesOrderID = 52777;

SELECT sod.OrderQty,
   p.Name
FROM Sales.SalesOrderDetail AS sod
JOIN Production.Product AS p
   ON p.ProductID = sod.ProductID
WHERE sod.SalesOrderID = 52777;

If there is something inherently wrong with a sub-query, then there is something twice as wrong with two sub-queries. Here are the resulting execution plans:

sub-query plan matches query plan

Huh, look sort of, I don’t know, almost identical. Let’s compare the plans using the new SSMS plan comparison utility:

only slight differences in sub-query plan

Well, darn. Displayed in pink are the common sets of operations between the two plans. In other words, for these plans, everything except the properties of the SELECT operator are exactly the same. Let’s take a look at those properties:

sub-query SELECT properties

OK. Now we have some interesting differences, and especially, some interesting similarities. Let’s start with the similarities. First of all, we have exactly the same QueryPlanHash value in both plans. In addition, we also have identical estimated rows and costs. In short, the optimizer created two identical execution plans. Now, this is where things get a little bit interesting. See, the optimizer actually worked a little harder to create the first plan than the second. It took an extra tic on the CPU and just a little more CompileMemory and CompileTime. Interesting.

What about execution times? Averaged over a few runs, the execution times were identical at about 149 microseconds with 11 reads. However, running a query once or twice isn’t testing. Let’s get a few thousand runs of both queries. The average results from the Extended Events sql_batch_completed event were 75.9 microseconds for both queries.
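For those following along at home, a minimal sketch of the kind of Extended Events session used for this sort of timing test (the session name, database filter, and file target are my own choices, not from the original test rig):

CREATE EVENT SESSION QueryTimings
ON SERVER
    ADD EVENT sqlserver.sql_batch_completed
    (WHERE (sqlserver.database_name = N'AdventureWorks2014'))
    ADD TARGET package0.event_file
    (SET filename = N'QueryTimings')
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);
GO

ALTER EVENT SESSION QueryTimings ON SERVER STATE = START;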

However, what about that extra little bit of compile time in the query that used sub-queries? Let’s add in a statement to free the procedure cache on each run and retry the queries. There is a measurable difference now:

Query duration
Sub-query AVG: 5790.20864172835
Query AVG: 4539.49289857972

More work is done by the optimizer on the sub-query to compile the same execution plan. We’re adding work to the optimizer, requiring it to unpack the, admittedly, silly query written above.  When we refer only to the compile time and not the execution time, there is a performance hit. Once the query is compiled, the performance is identical. Whether or not you get a performance hit from a sub-query then, in part, depends on the degree to which you’re experiencing compiles or recompiles. Without the recompile, there is no performance hit. At least in this example.
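For reference, the recompile test amounts to something like this (a sketch; clearing the entire plan cache is fine on a test box, but don’t run DBCC FREEPROCCACHE against a production server):

--force a compile on every execution (test systems only)
DBCC FREEPROCCACHE;
SELECT sod.OrderQty,
   p.Name
FROM Sales.SalesOrderDetail AS sod
JOIN Production.Product AS p
   ON p.ProductID = sod.ProductID
WHERE sod.SalesOrderID = 52777;
GO 5000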

Let’s Test It Again, Harder

I firmly believe in the old adage; if you ain’t cheatin’, you ain’t fightin’. It’s time to put the boot in.

Let’s go with much more interesting queries that are more likely to be written than the silly example above. Let’s assume some versioned data like in this article on Simple-Talk. We could express a query to bring back a single version of one of the documents in one of three ways from the article. We’re just going to mess with two of them. One that uses a sub-query, and one that does not:

--no sub-query
SELECT TOP 1 d.DocumentName
FROM dbo.Document d
JOIN dbo.Version v
   ON d.DocumentID = v.DocumentID
WHERE d.DocumentID = 9729
ORDER BY v.VersionId DESC;

--sub-query
SELECT  d.[DocumentName]
FROM    dbo.[Document] d
CROSS APPLY (SELECT TOP 1 v2.VersionId
                     FROM   dbo.[Version] v2
                     WHERE  v2.DocumentId = d.DocumentId
                     ORDER BY v2.DocumentId,
                            v2.VersionId DESC
                    ) v
WHERE   d.[DocumentId] = 9729;

As per usual, we can run these once and compare results, but that’s not really meaningful. We’ll run them thousands of times. Also, to be sure we’re comparing apples to apples, we’ll force a recompile on every run, just like in the first set of tests. The results this time:

Query duration
Sub-query AVG: 1852.14114114114
Query AVG: 2022.62162162162

You’ll note that, even with the compile on each execution, the query using a sub-query actually out-performed the query that was not using a sub-query. The results are even more dramatic when we take away the compile time:

Query duration
Sub-query AVG: 50.8368368368368
Query AVG: 63.3103103103103

We can also look to the execution plans to get an understanding of how these queries are being resolved:


The plan on top is the sub-query plan, and the plan on the bottom is the plan for just the plain query. You can see that the regular query is doing a lot more work to arrive at an identical set of data. The differences are visible in the average execution time, about a 20% improvement.

You could argue that we’re comparing two completely different queries, but that’s not true. Both queries return exactly the same result set. It just so happens that the query using the sub-query performs better overall in this instance. In short, there’s no reason to be scared of using a sub-query.

Sub-Query Conclusion

Is it possible for you to write horrid code inside of a sub-query that seriously negatively impacts performance? Yes. Absolutely. I’m not arguing that you can’t screw up your system with poor coding practices. You absolutely can. The query optimization process within SQL Server deals well with common coding practices. Therefore, the queries you write can be fairly sophisticated before, by nature of that sophistication, you begin to get serious performance degradation.

You need to have a method of validation for some of what you read on the internet. People should provide both the queries they are testing with and the numbers that their tests showed. If you’re just seeing completely unsupported, wildly egregious statements, they’re probably not true.

In conclusion, it’s safe to use sub-queries. Just be careful with them.

If you’re finding any of this useful and you’d like to dig down a little more, you can, because I’ll be putting on an all day seminar on execution plans and query tuning. The event takes place before SQLSaturday Providence in Rhode Island, December 2016, therefore, if you’re interested, sign up here.

Oct 17 2016

SELECT * Does Not Hurt Performance

I read all the time how SELECT * hurts performance. I even see where people have said that you just have to supply a column list instead of SELECT * to get a performance improvement. Let’s test it, because I think this is bunkum.

The Test

I have here two queries:

SELECT *
FROM Warehouse.StockItemTransactions AS sit;


SELECT sit.StockItemTransactionID,
       sit.TransactionOccurredWhen,
       sit.Quantity --plus the rest of the table's columns, each listed explicitly
FROM Warehouse.StockItemTransactions AS sit;

I’m basically going to run this a few hundred times each from PowerShell. I’ll capture the executions using Extended Events and we’ll aggregate the results.

The Results

I ran the test multiple times because, funny enough, I kept seeing some disparity in the results. One test would show a clear bias for one method, another test would show the opposite. However, averaging the averages we see that things broke down as follows:

SELECT * AVG: 167.247ms
Column List AVG: 165.500ms

That’s after about 2000 separate executions of each query. There’s a 2ms bias towards the Column List query as opposed to the *. That’s an improvement, if you want to call it that, of 1%. It’s hardly worth the bother, assuming that with more testing this continued to hold true. In multiple tests, the SELECT * ran faster. I just feel honor bound to put up the full results. They show an improvement, but not one I’d get excited about. Oh, and the reads, the execution plan, everything else… identical.

SELECT * Conclusion

Don’t get me wrong, there are lots of reasons to not use SELECT *. Yes, performance is one of the reasons to not use SELECT *. However, when most people suggest that maybe using SELECT * is a bad idea for performance reasons, what they’re saying is you ought to only move the columns you need and the data you are actually using, not everything. I’m not aware of anyone with experience and knowledge suggesting that using the complete column list instead of SELECT * is faster. As we can see in the tests above, it isn’t (or is by so small a margin, who cares).

I love talking performance tuning. In fact, I’ll be doing an all day seminar on execution plans and query tuning before SQLSaturday Providence in Rhode Island, December 2016, therefore, if you’re interested, sign up here.

Oct 03 2016

Correlated Datetime Columns

SQL Server is a deep and complex product. There’s always more to learn. For example, I had never heard of Correlated Datetime Columns. They were evidently introduced as a database option in SQL Server 2005 to help support data warehousing style queries (frequently using dates and times as join criteria or filter criteria). You can read up on the concept here from this older article from 2008 on MSDN. However, doing a search online I didn’t find much else explaining how this stuff worked (one article here, that didn’t break this down in a way I could easily understand). Time for me to get my learn on.

The concept is simple: turning this on for your database lets the optimizer take advantage of datetime columns that have a relationship to one another. The example from MSDN uses OrderDate and DueDate from the Purchasing.PurchaseOrderHeader and Purchasing.PurchaseOrderDetail tables respectively. Clearly, the DueDate would be near the OrderDate in general terms. Cool. Makes sense. Let’s see how it works, if it works.

Correlated Datetime Columns At Work

I’m going to use the example code from the Microsoft article, partly because there are aspects of it I don’t understand and exploring it will help me learn. Let’s start with the (slightly modified) base query that we hope will benefit from the Correlated Datetime setting:

SELECT *
FROM Purchasing.PurchaseOrderHeader AS poh
JOIN Purchasing.PurchaseOrderDetail AS pod
   ON poh.PurchaseOrderID = pod.PurchaseOrderID
WHERE poh.OrderDate
BETWEEN '20130801' AND '20130901';

The changes I made were the aliases and the values in the WHERE clause. I left everything else the same. Performance on this was 115ms on average with 59 reads. This is the execution plan:


Click to embiggen. The missing index suggestion is on OrderDate using an INCLUDE of all the columns (the joy of SELECT * and Missing Index Hints, I should post a rant on Missing Index Hints at some point). The plan itself uses two Clustered Index Scan operations to retrieve the data. Since the scans are Ordered (check the properties in each operator to validate this, but the lack of a Sort operator is also a good hint), a Merge Join was used to put the data from the two tables together.

The example code drops and recreates the clustered index on the PurchaseOrderDetail table because one of the tables must have a clustered index with a datetime column as the first (or only) column in the key for Correlated Datetime Columns. However, I like to see how things behave. Let’s first create a non-clustered index on PurchaseOrderDetail.

CREATE INDEX IX_PurchaseOrderDetail_DueDate
ON Purchasing.PurchaseOrderDetail(DueDate);

I don’t expect this to do anything for the above query, but I want to test each step. The non-clustered index doesn’t change anything. The execution plan and performance are the same as above. Fine. Let’s enable Correlated Datetime Columns:

ALTER DATABASE AdventureWorks2014
SET DATE_CORRELATION_OPTIMIZATION ON;
Running the query again, I can’t detect any differences. The execution plans are identical and the performance is the same. In case the plan wasn’t removed from cache because of the database setting change (I sure would expect it to be, but I could be wrong), I’m going to remove it from the cache:


DECLARE @PlanHandle VARBINARY(64);

SELECT  @PlanHandle = deps.plan_handle
FROM    sys.dm_exec_procedure_stats AS deps
CROSS APPLY sys.dm_exec_sql_text(deps.sql_handle) AS dest
WHERE   dest.text LIKE 'SELECT *
FROM Purchasing.PurchaseOrderHeader AS poh%';

IF @PlanHandle IS NOT NULL
        DBCC FREEPROCCACHE(@PlanHandle);

Nope. It’s not because of a cached plan. The Correlated Datetime Column setting just isn’t digging how I’ve got the clustered index set up. However, let’s also test this. I’ll create an index on the PurchaseOrderHeader.OrderDate as well:

CREATE INDEX IX_PurchaseOrderHeader_OrderDate
ON Purchasing.PurchaseOrderHeader (OrderDate);

Nope. No joy. All right. Clean up the test indexes and follow Microsoft’s lead:

DROP INDEX IX_PurchaseOrderDetail_DueDate
ON Purchasing.PurchaseOrderDetail;
DROP INDEX IX_PurchaseOrderHeader_OrderDate
ON Purchasing.PurchaseOrderHeader;

CREATE UNIQUE NONCLUSTERED INDEX IX_PurchaseOrderDetail_PurchaseOrderID_PurchaseOrderDetailID
ON Purchasing.PurchaseOrderDetail(PurchaseOrderID,PurchaseOrderDetailID);

ALTER TABLE Purchasing.PurchaseOrderDetail
DROP CONSTRAINT PK_PurchaseOrderDetail_PurchaseOrderID_PurchaseOrderDetailID;

CREATE CLUSTERED INDEX IX_PurchaseOrderDetail_DueDate
ON Purchasing.PurchaseOrderDetail(DueDate);

Rerunning the original query I immediately see a difference. The reads dropped from 59 to 57. The execution time is a little better at 105ms. The execution plan is very different:


We’re making use of that clustered index and it’s changed to a Hash Join. Let’s see the predicate for the Seek operation:

Seek Keys[1]: Start: [AdventureWorks2014].[Purchasing].[PurchaseOrderDetail].DueDate >= Scalar Operator('2013-07-07 00:00:00.000'), End: [AdventureWorks2014].[Purchasing].[PurchaseOrderDetail].DueDate < Scalar Operator('2013-10-05 00:00:00.000')

So, the optimizer has picked a range of values that are near to the values that have been passed in, assuming a correlation between the data. According to the documentation the optimizer also ensures that using this correlation will still result in the same data being returned.

The original scan of PurchaseOrderHeader is still the same. In fact, we can see that when comparing the plans:


You’ll note that I have the similarities highlighted, so you can tell that the Scan is the same. Yes, it’s cost percentage within the plan has changed, but it’s still the same basic operation.

The magic comes from the optimizer creating statistics in the form of a materialized view that are created and maintained around the data in tables that qualify for the behavior of the Correlated Datetime Columns. You can even see these in your views. They have a common naming standard as outlined in the documentation: _MPStats_Sys_<constraint_object_id>_<GUID>_<FK_constraint_name> (no word on if MP is anything as crazy as the WA of system generated statistics).
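If you want to spot them, something along these lines will list them (a sketch based on that documented naming convention):

--list the system-generated correlation statistics views
SELECT v.name,
       v.create_date
FROM sys.views AS v
WHERE v.name LIKE '[_]MPStats[_]Sys[_]%';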

In short, Correlated Datetime Columns worked… Or did it? We’re comparing apples to hammers at the moment. I’ve got a new clustered index on one of the tables. That changes all the choices, whether or not I’ve changed some database setting. Let’s remove Correlated Datetime Columns, pull the plan from cache and then rerun the query.
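Turning the setting back off is just the reverse of enabling it:

ALTER DATABASE AdventureWorks2014
SET DATE_CORRELATION_OPTIMIZATION OFF;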

Oops. Performance is now at 145ms on average with 118 reads. The execution plan has changed yet again:


In short, it works.

I could go farther and look to replace the clustered index on the PurchaseOrderHeader table, but you get the idea.


Correlated Datetime Columns works. Clearly it’s not something you’re going to enable on all your databases. Probably most of your databases don’t have clustered indexes on datetime columns, let alone enough tables with correlation between the data stored in them. However, when you do have that type of data correlation, enabling Correlated Datetime Columns and ensuring you have a clustered index on the datetime column is a viable tuning mechanism. Further, this is a mechanism that has been around since 2005. Just so you know, I did all my testing in SQL Server 2016, so this is something that anyone in the right situation can take advantage of. Just remember that TANSTAAFL always applies. Maintaining the statistics needed for the Correlated Datetime Columns is done through materialized views that are automatically created through the optimization process. You can see the views in SSMS and in any queries against the system views. You’ll need to take this into account during your statistics maintenance. However, if Correlated Datetime Columns is something you need, this is really going to help with this, fairly narrow, aspect of query tuning.

Sep 06 2016

Query Store, Force Plan and “Better” Plans

I am endlessly fascinated by how the Query Store works. I love teaching it at every opportunity too. Plus, almost every time I teach it, I get a new question about the behavior that makes me delve into the Query Store just a little bit more, enabling me to better understand how it works. I received just such a question at SQLSaturday Norway:

If you are forcing a plan, and the physical structure changes such that a “better” plan is possible, what happens with plan forcing?

Let’s answer a different question first. What happens when the plan gets invalidated, when the index being used gets dropped or some other structural change occurs so that the plan is no longer valid? I answered that question in this blog post. The plan being forced, after the object is dropped, becomes invalid, so that plan can no longer be used. The Query Store still attempts to apply the plan during any recompile or compile event of the query in question, but it fails and a proper plan is used. All this means, I think, the Query Store is going to ignore the new index, since a new index doesn’t invalidate an existing plan. A new index just makes new plans possible. However, when I was asked this question, this wasn’t something I had tested, so I gave a speculative, best guess, answer with plenty of caveats and the promise to provide a tested answer ASAP. Here we go.

I’ll start with the same sample query:

SELECT  sit.Quantity
FROM    Warehouse.StockItemTransactions AS sit
JOIN    Sales.Invoices AS i
        ON i.InvoiceID = sit.InvoiceID
JOIN    Warehouse.StockItems AS si
        ON si.StockItemID = sit.StockItemID
WHERE   sit.TransactionOccurredWhen BETWEEN '3/1/2015'
                                    AND     '3/5/2015';

The results are returned in about 53ms with 4850 reads and this execution plan:


As you can see, there’s the suggestion of a possible missing index. We’ll apply that in a moment. First though, I’m going to get the plan and query identifiers from the Query Store:

SELECT  qsq.query_id,
        qsp.plan_id,
        CAST(qsp.query_plan AS XML)
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_query_text AS qsqt
        ON qsqt.query_text_id = qsq.query_text_id
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsqt.query_sql_text LIKE 'SELECT  sit.Quantity%';

With this information, I’ll use Force Plan to ensure this is the plan used going forward.

EXEC sys.sp_query_store_force_plan 42995,487;

With that done, I’ll create the index:

CREATE INDEX TransactionOccurredWhen
ON Warehouse.StockItemTransactions
(TransactionOccurredWhen)
INCLUDE (StockItemID,InvoiceID,Quantity);

When I do this same set of operations (run the query, identify a missing index, create a new index, rerun the query) with no plan forcing occurring, the execution plan above is replaced with one that uses the index and results in about 32ms execution time with 2942 reads, a significant improvement. You get a recompile event because the schema involved with the query has changed. With the change, a new index is available, so the recompile event uses that new index. What happens when you force the plan?

The recompile event after running CREATE INDEX still occurs. However, because we have elected to force a plan, that plan is what is used. In this instance, a recompile to a new plan would result in a faster query using fewer resources. However, as long as we’re forcing a plan and that plan stays valid, the plan will be forced.

In short, the behavior is exactly as I expected. Choosing to force a plan in the Query Store results in that plan being forced. While I think that the Query Store and plan forcing are wonderful new tools in our tool box, I am concerned that plan forcing will become far too easy a thing to implement. I worry that people will implement it without thinking through the implications and potential impacts.

It gets worse. If I change the query, let’s say I make it into a stored procedure and parameterize the query, and, instead of a very limited date range, I send in a whole month, the execution plan is quite different (with or without the index). Forcing the plan that is expecting less than 1,000 rows onto a query that is retrieving 10,000 rows results in pretty horrific performance. We really are going to have to be careful about using plan forcing appropriately because, as in so much of the rest of SQL Server, and in all of programming for that matter, the code is going to do exactly what we tell it to do, whether that’s what we really want it to do or not.
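To be clear about what I mean, a parameterized version might look something like this (the procedure and parameter names are mine, purely for illustration):

CREATE PROCEDURE dbo.TransactionsByDateRange (@StartDate DATETIME2, @EndDate DATETIME2)
AS
SELECT  sit.Quantity
FROM    Warehouse.StockItemTransactions AS sit
JOIN    Sales.Invoices AS i
        ON i.InvoiceID = sit.InvoiceID
JOIN    Warehouse.StockItems AS si
        ON si.StockItemID = sit.StockItemID
WHERE   sit.TransactionOccurredWhen BETWEEN @StartDate AND @EndDate;

Force the plan that was generated for a narrow range, call the procedure with a whole month, and you get exactly the behavior described above.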

Aug 15 2016

Query Store, Force Plan and Dropped Objects

I love the Query Store. Seriously. It’s a huge leap forward in the capabilities of Azure SQL Database and SQL Server in support of performance monitoring and query optimization. One of my favorite aspects of the Query Store is the ability to force plans. Frankly though, it’s also the scariest part of the Query Store. I do believe that plan forcing will be one of the most ill-used functions in SQL Server since the multi-statement table-valued user-defined function (don’t get me started). However, unlike the UDF, this ill-use will be because of poor understanding on the part of the user, not a fundamental design issue. No, plan forcing and the Query Store are very well constructed. Let me give you an example of just how well constructed they are.

Let’s imagine that have a situation such as bad parameter sniffing where you’ve determined that from the more than one possible execution plans against a table, there is a preferred plan. Enabling plan forcing to ensure that plan gets used is a no-brainer. Let’s further imagine that you have a junior DBA who is… let’s just say overly aggressive in their duties such that they do silly things occasionally. What happens when your pretty plan, which uses a particular index meets your junior DBA who just dropped that index?

Here’s the setup. We’re using the WideWorldImporters database and we have this query:

SELECT  *
FROM    Warehouse.StockItemTransactions AS sit
WHERE   sit.TransactionOccurredWhen BETWEEN '9/9/2015'
                                    AND     '9/11/2015';

This query, with the default configuration, will scan the existing table, so I’ll add an index:

CREATE INDEX TransactionOccurredWhenNCI
ON Warehouse.StockItemTransactions
(TransactionOccurredWhen);

For a limited range such as the one I’m passing above, I’ll get a plan with a key lookup operation which runs faster than the scan, so I’m happy. For a broader range, I’m likely to see a scan again, but since most of my queries have a very narrow range, I’d sure like to be able to force the plan to always compile to the seek and key lookup. To do this I need to find the query_id and plan_id from the Query Store (assuming I’m not using the GUI):

SELECT  qsp.plan_id,
        qsq.query_id
FROM    sys.query_store_plan AS qsp
JOIN    sys.query_store_query AS qsq
        ON qsq.query_id = qsp.query_id
JOIN    sys.query_store_query_text AS qsqt
        ON qsqt.query_text_id = qsq.query_text_id
WHERE   qsqt.query_sql_text LIKE 'SELECT  *
FROM    Warehouse.StockItemTransactions AS sit%';

With those values, I can force the execution plan so that it will always use the plan I want:

EXEC sys.sp_query_store_force_plan 42460,463;

That’s it. I’m happy because I’m going to see the execution plan used over and over, despite any values passed during a recompile.


Along comes our aggressive junior DBA who decides that there are “too many” indexes on the server. No, I don’t know what that means either, but they evidently read it on the internet or something so they drop the index we created before:

DROP INDEX TransactionOccurredWhenNCI ON Warehouse.StockItemTransactions;

What now happens to our lovely execution plan and the plan forcing? We’ll take a look at two events in Extended Events, sql_statement_recompile and query_store_plan_forcing_failed. Nothing happens immediately on dropping the index. The plans associated with that object, if any, are marked as invalid in the cache. The next time we call the query it’s going to recompile and we can see the event:


The most important part of the event is the recompile_cause, which is “Schema changed”. However, I would also note the attach_activity_id.guid. I’ve chosen to enable causality tracking in this Extended Event session. This will cause all events associated with a common activity to get a GUID and then a sequence. This is interesting because, after the recompile event, we get the query_store_plan_forcing_failed event:


The guid value is the same as the event above and the *.seq number is now 2, showing that, for these events, the recompile event occurred and then this event occurred. That makes perfect sense. The plan is marked for recompile, so, it’s going to be recompiled. I have enabled plan forcing though, so I have a particular plan that I want the optimizer to use. However, thanks to my “helpful” junior DBA, the plan is now invalid. You even get the description of what happened in the message field for the event:

Index 'WideWorldImporters.Warehouse.StockItemTransactions.TransactionOccurredWhenNCI', specified in the USE PLAN hint, does not exist. Specify an existing index, or create an index with the specified name.

The first question now is, what happens with this query and the execution plan? Does the new plan generated now that the index is missing get stored in cache? Yes, it does. We can validate that by querying the cache, or, when capturing the actual execution plan, checking the “Retrieved from cache” property.
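Validating it is a quick query against the cache (a sketch; adjust the LIKE pattern to match your query text):

SELECT  decp.usecounts,
        decp.cacheobjtype,
        dest.text
FROM    sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE   dest.text LIKE 'SELECT  *%StockItemTransactions%';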

Because plan forcing is enabled, do we see a recompile every time this query is called? The answer to that question is slightly complex. Under normal circumstances, no. As long as that plan remains in cache, it’s simply reused. No other recompiles occur. A normal recompile event will cause another attempt at applying the invalid execution plan and we would see yet another query_store_plan_forcing_failed event for each recompile on the query. However, during testing, Joey D’Antoni (who was helping me play with this when we discussed what would happen when a plan was made invalid) had severe memory pressure on his server. He saw intermittent recompiles with a cause message that said plan forcing had failed. So if your server is under extreme stress and you cause this issue, you might see different messages. Just remember, the cause of the recompiles was not the plan forcing, but the memory pressure.

The fun thing is, as long as I don’t remove the plan forcing or take the query and plan out of the Query Store manually, if I recreate the index on my table with the same name and definition as that expected by the plan, the Query Store will simply reapply the plan and then successfully force it during any subsequent recompile situation. This is because Query Store is persisted with the database and barring outside activity, the information there will remain, just like the rest of the data in the database.
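In other words, putting things right is as simple as rerunning the CREATE INDEX from earlier:

CREATE INDEX TransactionOccurredWhenNCI
ON Warehouse.StockItemTransactions
(TransactionOccurredWhen);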

All of this means that Query Store works exactly the way we would expect, not forcing additional recompiles when you, or your junior DBA, inadvertently invalidate a plan. It also works as expected in that forcing a plan is stored with your database so that, assuming you don’t remove that plan from the Query Store, it will simply be reapplied after you fix the problem. It’s fun to see the thought that went behind the design of the behavior of Query Store. However, please, use plan forcing judiciously.

Jul 25 2016

Monitor Query Performance

Blog post #7 in support of Tim Ford’s (b|t) #iwanttohelp, #entrylevel. Read about it here.

Sooner or later when you’re working with SQL Server, someone is going to complain that the server is slow. I already pointed out the first place you should look when this comes up. But what if they’re more precise? What if, you know, or at least suspect, you have a problem with a query? How do you get information about how queries are behaving in SQL Server?

Choices For Query Metrics

It’s not enough to know that you have a slow query or queries. You need to know exactly how slow they are. You must measure. You need to know how long they take to run and you need to know how many resources are used while they run. You need to know these numbers in order to be able to determine if, after you do something to try to help the query, you’ll know whether or not you’ve improved performance. To measure the performance of queries, you have a number of choices. Each choice has positives and negatives associated with them. I’m going to run through my preferred mechanisms for measuring query performance and outline why. I’ll also list some of the other mechanisms you have available and tell you why I don’t like them. Let’s get started.

Dynamic Management Views

Since SQL Server 2005, Dynamic Management Views (DMV) and Functions (DMF) have been available for accessing all sorts of information about the server. Specifically, there are a few DMVs that are focused on queries and query performance. If you go back through my blog, you can find tons of examples where I illustrate their use. You can also see them at work in commercial tools and free tools. Adam Machanic’s sp_WhoIsActive, a free tool, makes extensive use of DMVs. To learn more about DMVs, you can download a free book, Performance Tuning with SQL Server Dynamic Management Views. DMVs are available in Azure SQL Database, Azure SQL Data Warehouse, and all editions of SQL Server.

The information captured by DMVs is an aggregation of all the times the query has been run. This means you can’t find how long the query ran at 3PM yesterday. You can though see the minimum and maximum time the query took as well as the average. The ability to see this information is what makes DMVs useful. However, another important point about DMVs is that they only collect information while a query is in memory. As soon as it leaves the cache (the area of memory it is stored in), so does all the aggregated information about the query in the DMVs.

You use the DMVs for a general understanding of how a query is behaving. They’re not meant for detailed, long-term collection of information about queries. For that we use other tools.
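As an illustration of the kind of aggregated information available, a simple query against sys.dm_exec_query_stats might look like this (a sketch, not a complete monitoring script):

SELECT TOP 10
        dest.text,
        deqs.execution_count,
        deqs.total_elapsed_time,
        deqs.total_elapsed_time / deqs.execution_count AS avg_elapsed_time,
        deqs.min_elapsed_time,
        deqs.max_elapsed_time
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
ORDER BY deqs.total_elapsed_time DESC;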

Extended Events

Introduced in SQL Server 2008, Extended Events (ExEvents) are a mechanism for capturing detailed information about SQL Server and the processes within. One of those processes is how queries behave. I have multiple examples on this blog on using ExEvents. You can’t go wrong reading about them on Jonathan Kehayias’ blog. Extended events are available in Azure SQL Database and all editions of SQL Server.

When you need to know every query against a database, or each time a particular query is called, and all the details associated with the query (reads, writes, duration), ExEvents are the way to go. ExEvents are very lightweight on the server (but not free) and can be filtered so that you capture just the information you need. The information is detailed and not aggregated. Instead it’s raw. The real issue with capturing this data is the amount of data you’ll be capturing. Testing and careful filtering to ensure you’re not dealing with too much information is important. Prior to SQL Server 2012, there was no graphical user interface for reading ExEvent data, so you would have been forced to run queries against the XML that the information is captured within. With the tools available in SQL Server Management Studio, this is no longer the case.

You use ExEvents when you need specific and detailed information about a query. ExEvents are not so good for generalized monitoring.

Query Store

Introduced in Azure SQL Database, and first released in SQL Server with 2016, Query Store is another mechanism for capturing aggregated information about queries. As before, I have examples on how to work with Query Store on my blog. You can also find quite a bit on it over at Simple-Talk. Query Store is pretty specialized still and only available in Azure and SQL Server 2016, but it is in all editions of SQL Server 2016.

Query Store captures information similar to what is available in the DMVs. However, unlike the DMVs, the information that Query Store captures is kept around, even after a query ages out or is removed from cache. This persistence makes Query Store very exciting. You do have to choose to turn it on for each database you wish to capture queries for. It’s not automatic like DMVs. The capture processes are asynchronous, so they should be relatively lightweight for most databases.

You use the Query Store when you need to capture query metrics over the long term, but you don’t need detailed information and aggregations works well for you.
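For example, once it’s enabled for a database (ALTER DATABASE YourDatabase SET QUERY_STORE = ON), the aggregated metrics are exposed through catalog views; a rough sketch:

SELECT  qsqt.query_sql_text,
        qsrs.count_executions,
        qsrs.avg_duration,
        qsrs.avg_logical_io_reads
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_query_text AS qsqt
        ON qsqt.query_text_id = qsq.query_text_id
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
JOIN    sys.query_store_runtime_stats AS qsrs
        ON qsrs.plan_id = qsp.plan_id
ORDER BY qsrs.avg_duration DESC;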


There are other ways to measure query performance. You can use the Profiler GUI, but that actually seriously negatively impacts the server. You can bring a server down by using it, so it should be avoided. Profiler can generate trace scripts, which can be used to run server-side traces to monitor your server. However, they have a much higher impact than ExEvents and they’re on the deprecation list. Microsoft is not adding new trace events for new functionality, so they’re becoming less and less useful with each release. You also can’t use trace against Azure. If you’re writing a query and you just want to see how long it takes to run, you can use SET STATISTICS TIME ON to capture the execution time. This is a handy way to quickly measure performance. There is also the ability to capture reads and writes using SET STATISTICS IO ON, but, while this does capture the metrics we need, it adds considerable overhead to the query, skewing performance measurement. This is why I stick to ExEvents when I need an accurate measure.
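For completeness, the quick-and-dirty approach looks like this (a sketch; the query is just an example, and remember that the IO capture adds overhead):

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT *
FROM Warehouse.StockItemTransactions AS sit;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;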


Honest people can disagree about the best way to capture query performance. I have my preferences as you can see. However, I’m fairly certain that everyone would agree that it’s important to know how to capture performance metrics in order to be able to assert that performance has increased or decreased in a measured fashion. You don’t want to guess at query performance, you want to know.

Jul 18 2016

Common Table Expression, Just a Name

The Common Table Expression (CTE) is a great tool in T-SQL. The CTE provides a mechanism to define a query that can be easily reused over and over within another query. The CTE also provides a mechanism for recursion which, though a little dangerous and overused, is extremely handy for certain types of queries. However, the CTE has a very unfortunate name. Over and over I’ve had to walk people back from the “Table” in Common Table Expression. The CTE is just a query. It’s not a table. It’s not providing a temporary storage space like a table variable or a temporary table. It’s just a query. Think of it more like a temporary view, which is also just a query.

Every time I explain this, there are people who don’t believe me. They point to the “Table” in the name, “See. Says so right there. It’s a table.”

It’s not and I can prove it. Let’s create a relatively simple CTE and use it in a query:

WITH MyCTE
          AS (SELECT    c.CustomerName,
                        cc.CustomerCategoryName
              FROM      Sales.Customers AS c
              JOIN      Sales.CustomerCategories AS cc
              ON        cc.CustomerCategoryID = c.CustomerCategoryID
              WHERE     c.CustomerCategoryID = 4)
    SELECT  *
    FROM    MyCTE;
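Here, for reference, is the inner query on its own (using the same column list assumed above), since that’s the second statement in the batch I’m about to run:

SELECT    c.CustomerName,
          cc.CustomerCategoryName
FROM      Sales.Customers AS c
JOIN      Sales.CustomerCategories AS cc
ON        cc.CustomerCategoryID = c.CustomerCategoryID
WHERE     c.CustomerCategoryID = 4;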

Now, I’m going to run the query within the CTE and the CTE together as two statements in a batch and capture the execution plans:


On the top, the CTE, on the bottom, the query. You’ll note that the execution plans are identical. They each have the exact same Query Plan Hash value in the properties, 0x88EFD2B7C165E667, even though they have different Query Hash values, 0x192FFC125A08CC35 and 0xFEB7F2BCAC853CD5, respectively. Further, if I capture the query metrics using extended events, I get identical reads and, on average, identical execution times:


This is because there is no table being created. The data is not treated differently. A CTE is just a query, not some type of temporary storage.

Heck, let’s do one more thing. Let’s use the latest SSMS plan comparison tool and highlight one of the operators to see what differences there are internally in the plan:



I don’t see a lot of differences. In fact, I don’t see any. That’s because the optimizer recognizes these two queries as identical. If it was loading data into temporary storage, you would see differences in something. We don’t. This is because, despite the somewhat unfortunate emphasis that gets placed on the Table portion of the name, the emphasis of the name, Common Table Expression, should be on the word Expression.

I will point out an interesting difference, especially useful for those who plug in CTEs everywhere, whether it’s needed or not. Let’s look at the properties of the two plans:


You can see the similarities and differences that I pointed out earlier in the Statement, Query Hash and Query Plan Hash, as well as the Estimated Subtree Cost and others. What’s truly interesting is that the CompileCPU, CompileMemory and CompileTime for the CTE is higher than the regular query. While the CTE is just a query, it’s a query that adds a non-zero overhead when used, and therefore, should only be used where appropriate (good gosh, I’ve seen people put it EVERYWHERE, on every single query, don’t do that).

Hopefully, this is enough to establish, truly, completely, and thoroughly, that the Common Table Expression is an expression, not a table.

Yeah, I did this before, but it keeps coming up, so I tried a different approach. Let’s see if the word gets out. Your Common Table Expression is not a table.

I love talking about execution plans and query tuning. I’ll be doing this at an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

Don’t  miss your chance to an all day training course on execution plans before SQL Saturday Oslo in September.

Jul 11 2016

Same Query, Different Servers, Different Performance. Now What?

Based on the number of times I see this question on forums, it must be occurring all the time. You have two different servers that, as far as you know, are identical in terms of their options and setup (although not necessarily in terms of power, think a test or pre-production system versus production). On these servers you have a database on each that, as far as you know, is the same as the other in terms of options, objects, maybe even data (although, this does mean that you have unmasked production information in your QA environment, which potentially means you’re going to jail, might want to address this, especially now that I’ve told you about it, mens rea, you’re welcome). On each database you run, as far as you know, the exact same query (whether a straight up ad hoc query, stored procedure, parameterized query, whatever). On your production server, let’s call it PROD, the query runs for 3 hours before it returns, but on the test server, let’s call it TEST, it returns in about 3 seconds.

What. The. Hell.

This is especially troubling because PROD has 8 processors that are much more powerful than TEST, which has only 2 (either can go parallel, so this is the same). TEST only has 16GB of RAM and PROD has 64GB. Further, you know that the databases, data, and statistics are the same because you restored the TEST database from PROD (although, see that jail thing above). However, you’re on top of things. You capture the query metrics so you know exactly the amount of time and the reads or writes from each query and you even do this using extended events so you’re minimizing observer effects. You also have monitoring in place, so you know that there is no blocking in PROD. The query is just a pig there, but not on TEST. You’ve even gone so far as to get an estimated execution plan from each server and they are absolutely different.

Now what?

Well obviously something somewhere is different. Start by comparing everything on both servers and both databases down to… hang on, here, we’ll write a PowerShell script and then….

Wait, wait, wait!

You have the execution plans? Before we start digging through all the properties everywhere and comparing everything to everything, updating statistics 14 times, and all the rest, what if we look at the execution plans. They’re different, so we should start looking at scans & indexes & statistics &….


Let’s start simple and clear. Let’s take a look at the properties of the first operator:


This contains a ton of information about the settings of the systems where each plan was generated. Let’s assume that, probably, the databases are the same, as is the data and the statistics, but the problem is a system or database setting. These differences can result in different execution plans and therefore different execution times. If we are dealing with two different servers and we are fairly sure the data, structure and the statistics are the same, the properties of the first operator are a great starting point for understanding what went wrong.

Oh, and the first operator is this one with the red square around it:

first operator

Let’s use the new SSMS Compare Showplan to see the differences between our two execution plans that we captured:

properties compared

(clicking on that will make it bigger)

There are a bunch of differences highlighted, but one ought to jump out pretty quick. That’s right, these two databases have different compatibility levels which resulted in one of them using the new optimizer and one using the old optimizer as evidenced by the CardinalityEstimationModelVersion. Fun point, both have identical query hash values. Not surprising, but additional, quick, validation that we really are talking about the same query on each server (and one of them isn’t using a trace flag to change the cardinality estimator). You now have enough information to go and make some changes to your system without lots of further fumbling and work.
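If you want to confirm the setting outside of the plan, it’s right there in sys.databases (a sketch; substitute your own database name):

SELECT  d.name,
        d.compatibility_level
FROM    sys.databases AS d
WHERE   d.name = N'YourDatabase';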

When dealing with the same query from two servers that ought to result in similar behavior, but doesn’t, get the execution plans (estimated plans are fine here) and compare the properties of the first operator. That’s the quickest way to identify the issues that could be leading to the differences between the servers.

Want to talk more about execution plans and query tuning? Let’s do it.

In August, I’ll be doing an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

I’m also going to be in Oslo Norway for a pre-con before SQL Saturday Oslo in September.

Jun 28 2016

CASE Statement in GROUP BY

Set based operations means you should put everything into a single statement, right?

Well, not really. People seem to think that having two queries is really bad, so when faced with logical gaps, they just cram them into the query they have. This is partly because SQL Server and T-SQL supports letting you do this, and it’s partly because it looks like a logical extension of code reuse to arrive at a query structure that supports multiple logic chains. However, let’s explore what happens when you do this in a particular situation: a CASE statement in a GROUP BY clause.

You see this a lot because a given set of data may be needed in slightly different context by different groups within the company. Like many of my example queries, this could be better written. Like many of my example queries, it mirrors what I see in the wild (and for those following along at home, I’m using the WideWorldImporters database for tests now):

CREATE PROCEDURE dbo.InvoiceGrouping (@x INT)
AS
SELECT  SUM(il.UnitPrice),
        CASE WHEN @x = 7 THEN i.ContactPersonID
             WHEN @x = 15 THEN i.AccountsPersonID
             ELSE i.SalespersonPersonID
        END
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY CASE WHEN @x = 7 THEN i.ContactPersonID
              WHEN @x = 15 THEN i.AccountsPersonID
              ELSE i.SalespersonPersonID
         END;
Running this for any given value above, 7, 15 or other, you’ll get the same execution plan, regardless of the column used in the GROUP BY. However, Parameter Sniffing is still something of a factor. When you group this data by SalesPersonID, you only get 10 rows back. This will be shown as the estimated number of rows returned if some value other than 7 or 15 is used as a parameter. However, this is always the plan:


You can click on that to expand it into something readable. We can eliminate the Parameter Sniffing from the equation if we want to by modifying the query thus:

CREATE PROCEDURE dbo.InvoiceGrouping_NoSniff (@x INT)
AS
DECLARE @x2 INT;
SET @x2 = @x;

SELECT  SUM(il.UnitPrice),
        CASE WHEN @x2 = 7 THEN i.ContactPersonID
             WHEN @x2 = 15 THEN i.AccountsPersonID
             ELSE i.SalespersonPersonID
        END
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY CASE WHEN @x2 = 7 THEN i.ContactPersonID
              WHEN @x2 = 15 THEN i.AccountsPersonID
              ELSE i.SalespersonPersonID
         END;
However, except for some deviation on the estimated rows (since it’s averaging the rows returned), the execution plan is the same.

What’s the big deal right? Well, let’s break down the code into three different procedures:

CREATE PROCEDURE dbo.InvoiceGrouping_Contact
AS
SELECT  SUM(il.UnitPrice),
        i.ContactPersonID
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY i.ContactPersonID;
GO

CREATE PROCEDURE dbo.InvoiceGrouping_Sales
AS
SELECT  SUM(il.UnitPrice),
        i.SalespersonPersonID
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY i.SalespersonPersonID;
GO

CREATE PROCEDURE dbo.InvoiceGrouping_Account
AS
SELECT  SUM(il.UnitPrice),
        i.AccountsPersonID
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY i.AccountsPersonID;

Interestingly enough, these three queries produce a nearly identical execution plan. The one big difference is that the Compute Scalar operator, which was used to generate a value for the Hash Match Aggregate, is no longer in the plan:


The same basic set of structures, scans against both tables, to arrive at the data. Cost estimates between the two plans are very different though, with the targeted queries having a much lower estimated cost.

Performance-wise, interestingly enough, the average execution time of the first query, only returning the 10 rows, is 157ms, while the query grouping directly on the SalesPersonID averages about 190ms. Now, the reads tell a slightly different story with 17428 on the generic query and 5721 on the specific query. So, maybe a server under load will see a significant performance increase. However, let’s deal with what we have in front of us and say that, at least for these tests, the catch-all GROUP BY query performs well.

Now let’s change the paradigm slightly. Let’s add an index:

CREATE INDEX TestingGroupBy ON Sales.Invoices (SalespersonPersonID);

Frankly, this isn’t a very useful index. However, after adding it, the execution plan for the InvoiceGrouping_Sales query changes. Instead of scanning the table, it’s now scanning the index. Despite recompiles and attempts to force it using hints, the original InvoiceGrouping query will not use this index. Duration of the InvoiceGrouping_Sales query drops to 140ms on average and the reads drop a little further to 5021. Getting an 11% increase on performance is a win.

This is a pretty simplified example, however, making the CASE statement more complex won’t improve performance or further assist the optimizer to make good choices. Instead of trying to cram multiple different logical groupings into a single query, a better approach would be to create the three new procedures that I did above, and make the original InvoiceGrouping procedure into a wrapping procedure that chooses which of the individual procedures to call. This way, if you do add indexes in support of each of the different possible groupings, you would realize a positive outcome in your performance.
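A sketch of that wrapper (the procedure name is mine; the values 7 and 15 mirror the logic in the original procedure):

CREATE PROCEDURE dbo.InvoiceGrouping_Wrapper (@x INT)
AS
IF @x = 7
    EXEC dbo.InvoiceGrouping_Contact;
ELSE IF @x = 15
    EXEC dbo.InvoiceGrouping_Account;
ELSE
    EXEC dbo.InvoiceGrouping_Sales;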

Want to talk more about execution plans and query tuning? In August, I’ll be doing an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

I’m also going to be in Oslo Norway for a pre-con before SQL Saturday Oslo in September.

May 31 2016

Precedence Goes to Query Store or Plan Guide?

While presenting at SQLDay in Wroclaw, Poland, on the Query Store, I was asked a pretty simple question, which takes precedence, the Query Store or a Plan Guide?

One of my favorite answers to questions is “I don’t know” because it gives me the opportunity to learn. Let’s figure this one out together.

I’ll post the code to recreate this experiment within AdventureWorks at the end of the article. I’m doing this because the code for forcing execution plans using Plan Guides can be pretty doggone long (you may need to generate your own XML from a plan on your own system, fair warning).

I have a stored procedure that I use frequently to demonstrate parameter sniffing and bad parameter sniffing, AddressByCity (listed below). This query when passed the value ‘London’ will generate a plan with a Merge Join. When passed the value ‘Mentor’ it will generate a plan with a Loops Join. These different plans are caused by the statistics for the different values. 434 rows for ‘London’ and 1 row for ‘Mentor’. To start the test, I’ll generate a plan for each value, dropping the plan from cache between executions, so that I have both plans in the Query Store. Now, we’re ready for phase 1.

Phase 1

I’m going to remove the current plan from cache. I will then use sp_query_store_force_plan to select the plan with the Loops join. I’ll run the query using ‘London’ to validate that the plan generated is NOT the plan for ‘London’ but rather the plan for ‘Mentor’. Done, and I can see the Use Plan results in the Properties of the Select operator:


I’ve highlighted a couple of interesting points. First, even though we are using a plan that is compiled for the value of ‘Mentor’, because this plan was compiled using the value ‘London’, that’s what gets stored, despite the plan used. You can then see down at the bottom, that the ‘Use plan’ property is set to true. Now, let’s force the plan using a plan guide (in my script below). To be very sure what happens, I’ll again clear the plan from cache.

If the Query Store wins, I should see a plan with a Loops Join. If the Plan Guide wins, I should see a plan with a Merge Join. The moment of truth:


Query Store wins! Let’s see how it shows in the Properties:


You’ll see that the Parameter Compile & Runtime values are the same and that the ‘Use plan’ property is the same. Note however, the PlanGuideName property is in play. In fact, the indication is that the optimizer thinks that a Plan Guide has been successfully applied. In fact, if I watch the Extended Events for the plan_guide_successful event, the best way to check how Plan Guides are being used, apart from looking at the execution plans, it shows success for the Plan Guide we created.

Phase 2

Precedence could matter. First, I remove the Plan Guide using sp_control_plan_guide. Then, I remove the Query Store force using sp_query_store_unforce_plan. I’ll also clear the plan out of cache again, just to be sure. Next, I create the Plan Guide. I’ll run the procedure using the value ‘Mentor’ to ensure that I see the Merge/’London’ plan, the PlanGuideName property in that plan, and a successful application of a Plan Guide from the plan_guide_successful ExEvent. All good.

Now, I once again force the plan from Query Store. Rerun the plan…

The winner is Query Store.

Phase 3

OK, so, if we try to force plans using Plan Guide, Query Store wins. What about if I’m just trying to apply a hint? Let’s clear everything up again and then create an OPTIMIZE FOR query hint in a Plan Guide. Simple right?
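For reference, the shape of that Plan Guide is roughly this (a sketch; with @type = N'OBJECT' the @stmt text has to match the statement in the procedure exactly, so copy it from your module rather than retyping it):

EXEC sys.sp_create_plan_guide
    @name = N'PlanGuidePrecedence',
    @stmt = N'SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;',
    @type = N'OBJECT',
    @module_or_batch = N'dbo.AddressByCity',
    @params = NULL,
    @hints = N'OPTION (OPTIMIZE FOR (@City = N''London''))';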

Query Store wins again… Except things get a little odd. Take a look at the Properties now:


Normally, with a Plan Guide that includes an OPTIMIZE FOR hint, I’d see the Compiled Value as the defined value within the hint (in this case ‘London’). However, this time I’m seeing just the value of ‘Mentor’.


I don’t know what’s occurring in the internals. I just know the evidence I have in front of me. First, Query Store vs. Plan Guides? Query Store wins. However, internally, something is still going on. The Plan Guide is still successfully applied, at least according to what I can see. The processes that determine if a Plan Guide should be applied or not are not interfered with by the Query Store plan forcing. The fact that I’m seeing a Query Store plan and evidence of a successful Plan Guide though, that’s a problem that could cause issues when troubleshooting. Now you know, and knowing, as they say, is half the battle.

Let’s get together and talk about execution plans. In August, I’ll be doing an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

I’m extremely exited to be in Oslo Norway for a pre-con before SQL Saturday Oslo in September.

/*NOTE: These scripts are not in the exact order in which
they must be run to re-create all my tests. These are
however, all the scripts you'll need.*/
--the procedure
CREATE PROC dbo.AddressByCity @City NVARCHAR(30)
AS
SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;

--generating the first plan
EXEC dbo.AddressByCity
    @City = N'London';

--remove this from cache

DECLARE @PlanHandle VARBINARY(64);

SELECT  @PlanHandle = deps.plan_handle
FROM    sys.dm_exec_procedure_stats AS deps
WHERE   deps.object_id = OBJECT_ID('dbo.AddressByCity');

IF @PlanHandle IS NOT NULL
        DBCC FREEPROCCACHE(@PlanHandle);

--generate the second plan
EXEC dbo.AddressByCity
    @City = N'Mentor';

--get the query & plan id
SELECT  qsq.query_id,
        qsp.plan_id
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsq.object_id = OBJECT_ID('dbo.AddressByCity');

--force the plan
EXEC sys.sp_query_store_force_plan 414,460;

--unforce the plan
EXEC sys.sp_query_store_unforce_plan 414,460;

--Get the plan_handle and offset 
SELECT  deqs.plan_handle,
        deqs.statement_start_offset
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   dest.objectid = OBJECT_ID('dbo.AddressByCity');

--create a plan guide
EXEC sys.sp_create_plan_guide_from_handle
    @name = N'PlanGuidePrecedence',
    @plan_handle = 0x05000500C26A7610A03A4F53A600000001000000000000000000000000000000000000000000000000000000,
    @statement_start_offset = 108;

--remove plan guide
EXEC sys.sp_control_plan_guide
    @operation = N'DROP',
    @name = N'PlanGuidePrecedence';

--pull a plan out of cache. You'll need to make sure the right plan is there first

DECLARE @xmlplan NVARCHAR(MAX);

SELECT  @xmlplan = deqp.query_plan
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_text_query_plan(deqs.plan_handle,DEFAULT,DEFAULT) AS deqp
WHERE   deqp.objectid = OBJECT_ID('dbo.AddressByCity');

SELECT @xmlplan

--copied and pasted the value from @xmlplan for easy reuse
'<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.5" Build="13.0.1300.275"><BatchSequence><Batch><Statements><StmtSimple StatementText="CREATE PROC dbo.AddressByCity @City NVARCHAR(30)
SELECT  a.AddressID,
        sp.Name AS StateProvinceName,
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City" StatementId="1" StatementCompId="3" StatementType="SELECT" StatementSqlHandle="0x0900D56A5A43E8B56B541DFF8DC13F149C6A0000000000000000000000000000000000000000000000000000" DatabaseContextSettingsId="1" ParentObjectId="276196034" StatementParameterizationType="0" RetrievedFromCache="true" StatementSubTreeCost="0.316799" StatementEstRows="434" SecurityPolicyApplied="false" StatementOptmLevel="FULL" QueryHash="0xDD75E124763781F2" QueryPlanHash="0x6C408B64C5E8A83F" StatementOptmEarlyAbortReason="GoodEnoughPlanFound" CardinalityEstimationModelVersion="120"><StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false"></StatementSetOptions><QueryPlan CachedPlanSize="32" CompileTime="3" CompileCPU="3" CompileMemory="360"><MissingIndexes><MissingIndexGroup Impact="89.7983"><MissingIndex Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]"><ColumnGroup Usage="EQUALITY"><Column Name="[City]" ColumnId="4"></Column></ColumnGroup></MissingIndex></MissingIndexGroup></MissingIndexes><MemoryGrantInfo SerialRequiredMemory="512" SerialDesiredMemory="720"></MemoryGrantInfo><OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="260900" EstimatedPagesCached="65225" EstimatedAvailableDegreeOfParallelism="2"></OptimizerHardwareDependentProperties><RelOp NodeId="0" PhysicalOp="Merge Join" LogicalOp="Inner Join" EstimateRows="434" EstimateIO="0" EstimateCPU="0.0069349" AvgRowSize="225" EstimatedTotalSubtreeCost="0.316799" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row"><OutputList><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine1"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine2"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="PostalCode"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="Name"></ColumnReference></OutputList><Merge ManyToMany="0"><InnerSideJoinColumns><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference></InnerSideJoinColumns><OuterSideJoinColumns><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="StateProvinceID"></ColumnReference></OuterSideJoinColumns><Residual><ScalarOperator ScalarString="[AdventureWorks2014].[Person].[StateProvince].[StateProvinceID] as [sp].[StateProvinceID]=[AdventureWorks2014].[Person].[Address].[StateProvinceID] as [a].[StateProvinceID]"><Compare CompareOp="EQ"><ScalarOperator><Identifier><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="StateProvinceID"></ColumnReference></Identifier></ScalarOperator><ScalarOperator><Identifier><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" 
Column="StateProvinceID"></ColumnReference></Identifier></ScalarOperator></Compare></ScalarOperator></Residual><RelOp NodeId="1" PhysicalOp="Clustered Index Scan" LogicalOp="Clustered Index Scan" EstimateRows="181" EstimateIO="0.00386574" EstimateCPU="0.0003561" AvgRowSize="65" EstimatedTotalSubtreeCost="0.00422184" TableCardinality="181" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row"><OutputList><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="StateProvinceID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="Name"></ColumnReference></OutputList><IndexScan Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" ForceScan="0" NoExpandHint="0" Storage="RowStore"><DefinedValues><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="StateProvinceID"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="Name"></ColumnReference></DefinedValue></DefinedValues><Object Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Index="[PK_StateProvince_StateProvinceID]" Alias="[sp]" IndexKind="Clustered" Storage="RowStore"></Object></IndexScan></RelOp><RelOp NodeId="2" PhysicalOp="Sort" LogicalOp="Sort" EstimateRows="434" EstimateIO="0.0112613" EstimateCPU="0.0060321" AvgRowSize="177" EstimatedTotalSubtreeCost="0.30564" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row"><OutputList><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine1"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine2"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="PostalCode"></ColumnReference></OutputList><MemoryFractions Input="1" Output="1"></MemoryFractions><Sort Distinct="0"><OrderBy><OrderByColumn Ascending="1"><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference></OrderByColumn></OrderBy><RelOp NodeId="3" PhysicalOp="Clustered Index Scan" LogicalOp="Clustered Index Scan" EstimateRows="434" EstimateIO="0.257199" EstimateCPU="0.0217324" AvgRowSize="177" EstimatedTotalSubtreeCost="0.278931" TableCardinality="19614" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row"><OutputList><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine1"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" 
Column="AddressLine2"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="PostalCode"></ColumnReference></OutputList><IndexScan Ordered="0" ForcedIndex="0" ForceScan="0" NoExpandHint="0" Storage="RowStore"><DefinedValues><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressID"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine1"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine2"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="PostalCode"></ColumnReference></DefinedValue></DefinedValues><Object Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Index="[PK_Address_AddressID]" Alias="[a]" IndexKind="Clustered" Storage="RowStore"></Object><Predicate><ScalarOperator ScalarString="[AdventureWorks2014].[Person].[Address].[City] as [a].[City]=[@City]"><Compare CompareOp="EQ"><ScalarOperator><Identifier><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference></Identifier></ScalarOperator><ScalarOperator><Identifier><ColumnReference Column="@City"></ColumnReference></Identifier></ScalarOperator></Compare></ScalarOperator></Predicate></IndexScan></RelOp></Sort></RelOp></Merge></RelOp><ParameterList><ColumnReference Column="@City" ParameterCompiledValue="N'London'"></ColumnReference></ParameterList></QueryPlan></StmtSimple></Statements></Batch></BatchSequence></ShowPlanXML>'

EXEC sys.sp_create_plan_guide
    @name = N'PlanGuidePrecedence',
    @stmt = N'SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;',
    @type = N'OBJECT',
    @module_or_batch = N'dbo.AddressByCity',
    @params = NULL,
    @hints = @xmlplan;
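
Both plan guides here reuse the name PlanGuidePrecedence, and plan guide names have to be unique within the database, so running the two creation scripts back to back will fail on the second call unless the first guide is dropped in between. A minimal sketch of that cleanup, assuming the guide from the step above exists, using sp_control_plan_guide:

--drop the existing guide so the name can be reused for the hint-only version
EXEC sys.sp_control_plan_guide
    @operation = N'DROP',
    @name = N'PlanGuidePrecedence';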

--create a plan guide with just a hint in it
EXEC sys.sp_create_plan_guide @name = N'PlanGuidePrecedence', -- sysname
    @stmt = N'SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;', -- nvarchar(max)
    @type = N'OBJECT', -- nvarchar(60)
    @module_or_batch = N'dbo.AddressByCity', -- nvarchar(max)
    @params = NULL, -- nvarchar(max)
    @hints = N'OPTION(OPTIMIZE FOR(@City = ''London''))'; -- nvarchar(max)
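
Either way, it's worth confirming what actually landed in the database. A quick look at the sys.plan_guides catalog view shows the guide's scope, whether it's enabled, and the hints (or full plan) it will apply:

--inspect the plan guide that was just created
SELECT pg.name,
       pg.scope_type_desc,
       pg.is_disabled,
       pg.hints
FROM sys.plan_guides AS pg
WHERE pg.name = N'PlanGuidePrecedence';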