Jul 25 2016

Monitor Query Performance

Blog post #7 in support of Tim Ford’s (b|t) #iwanttohelp, #entrylevel. Read about it here.

Sooner or later when you’re working with SQL Server, someone is going to complain that the server is slow. I already pointed out the first place you should look when this comes up. But what if they’re more precise? What if you know, or at least suspect, that you have a problem with a query? How do you get information about how queries are behaving in SQL Server?

Choices For Query Metrics

It’s not enough to know that you have a slow query or queries. You need to know exactly how slow they are. You must measure. You need to know how long they take to run and how many resources they use while they run. Without those numbers, you can’t tell whether a change you made to help the query actually improved performance. To measure the performance of queries, you have a number of choices. Each choice has positives and negatives associated with it. I’m going to run through my preferred mechanisms for measuring query performance and outline why. I’ll also list some of the other mechanisms you have available and tell you why I don’t like them. Let’s get started.

Dynamic Management Views

Since SQL Server 2005, Dynamic Management Views (DMV) and Functions (DMF) have been available to access all sorts of information about the server. Specifically, there are a few DMVs that are focused on queries and query performance. If you go back through my blog, you can find tons of examples where I illustrate their use. You can also see them at work in commercial tools and free tools. Adam Machanic’s sp_WhoIsActive, a free tool, makes extensive use of DMVs. To learn more about DMVs, you can download a free book, Performance Tuning with SQL Server Dynamic Management Views. DMVs are available in Azure SQL Database, Azure SQL Data Warehouse, and all editions of SQL Server.

The information captured by DMVs is an aggregation of all the times the query has been run. This means you can’t find how long the query ran at 3PM yesterday. You can, though, see the minimum and maximum time the query took as well as the average. The ability to see this information is what makes DMVs useful. However, another important point about DMVs is that they only collect information while a query is in memory. As soon as it leaves the cache (the area of memory it is stored in), so does all the aggregated information about the query in the DMVs.

You use the DMVs for a general understanding of how a query is behaving. They’re not meant for detailed or long-term collection of information about queries. For that we use other tools.
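As a rough illustration of the kind of aggregate information described above, here is a minimal sketch against sys.dm_exec_query_stats. The choice of columns and the TOP (20) cut-off are my assumptions, not a recommendation; the elapsed-time columns are reported in microseconds.

```sql
-- Aggregated metrics for queries currently in cache.
-- Once a plan leaves the cache, its row disappears from this view.
SELECT TOP (20)
        dest.text AS query_text,
        deqs.execution_count,
        deqs.min_elapsed_time,
        deqs.max_elapsed_time,
        deqs.total_elapsed_time / deqs.execution_count AS avg_elapsed_time,
        deqs.total_logical_reads / deqs.execution_count AS avg_logical_reads,
        deqs.creation_time
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
ORDER BY deqs.total_elapsed_time DESC;
```

The creation_time column shows when the plan was compiled, which gives a sense of the period the aggregates cover.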

Extended Events

Introduced in SQL Server 2008, Extended Events (ExEvents) are a mechanism for capturing detailed information about SQL Server and the processes within. One of those processes is how queries behave. I have multiple examples on this blog on using ExEvents. You can’t go wrong reading about them on Jonathan Kehayias’ blog. Extended events are available in Azure SQL Database and all editions of SQL Server.

When you need to know every query against a database, or each time a particular query is called, and all the details associated with the query (reads, writes, duration), ExEvents are the way to go. ExEvents are very lightweight on the server (but not free) and can be filtered so that you capture just the information you need. The information is detailed and not aggregated. Instead it’s raw. The real issue with capturing this data is the amount of data you’ll be capturing. Testing and careful filtering to ensure you’re not dealing with too much information is important. Prior to SQL Server 2012, there was no graphical user interface for reading ExEvent data, so you would have been forced to run queries against the XML that the information is captured within. With the tools available in SQL Server Management Studio, this is no longer the case.

You use ExEvents when you need specific and detailed information about a query. ExEvents are not so good for generalized monitoring.
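A minimal sketch of a filtered session might look like the following. The session name, the database name in the filter, and the file name are all placeholders I made up for illustration; on Azure SQL Database the session would be created ON DATABASE rather than ON SERVER.

```sql
-- Capture completed batches and procedure calls for one database only.
-- QueryMetrics, WideWorldImporters and the file name are hypothetical.
CREATE EVENT SESSION QueryMetrics
ON SERVER
    ADD EVENT sqlserver.rpc_completed
        (WHERE (sqlserver.database_name = N'WideWorldImporters')),
    ADD EVENT sqlserver.sql_batch_completed
        (WHERE (sqlserver.database_name = N'WideWorldImporters'))
    ADD TARGET package0.event_file
        (SET filename = N'QueryMetrics.xel')
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);
GO

ALTER EVENT SESSION QueryMetrics ON SERVER STATE = START;
GO
```

Both events report duration, CPU time, reads and writes for each individual execution, which is the raw, unaggregated detail the DMVs cannot give you. Remember to stop and drop the session when you are done.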

Query Store

Introduced in Azure SQL Database, and first released in SQL Server with 2016, Query Store is another mechanism for capturing aggregated information about queries. As before, I have examples on how to work with Query Store on my blog. You can also find quite a bit on it over at Simple-Talk. Query Store is pretty specialized still and only available in Azure and SQL Server 2016, but it is in all editions of SQL Server 2016.

Query Store captures information similar to what is available in the DMVs. However, unlike the DMVs, the information that Query Store captures is kept around, even after a query ages out or is removed from cache. This persistence makes Query Store very exciting. You do have to choose to turn it on for each database you wish to capture queries for. It’s not automatic like DMVs. The capture processes are asynchronous, so they should be relatively lightweight for most databases.

You use the Query Store when you need to capture query metrics over the long term, but you don’t need detailed information and aggregations work well for you.
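Here is a sketch of turning Query Store on and reading back the aggregated runtime statistics. The database name is an assumption; the catalog views are the ones that ship with Query Store.

```sql
-- Query Store is off by default; enable it per database.
-- WideWorldImporters is a placeholder database name.
ALTER DATABASE WideWorldImporters SET QUERY_STORE = ON;
GO

-- Aggregated metrics per plan, per collection interval.
SELECT  qsqt.query_sql_text,
        qsp.plan_id,
        qsrsi.start_time,
        qsrs.count_executions,
        qsrs.avg_duration,
        qsrs.min_duration,
        qsrs.max_duration,
        qsrs.avg_logical_io_reads
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_query_text AS qsqt
        ON qsqt.query_text_id = qsq.query_text_id
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
JOIN    sys.query_store_runtime_stats AS qsrs
        ON qsrs.plan_id = qsp.plan_id
JOIN    sys.query_store_runtime_stats_interval AS qsrsi
        ON qsrsi.runtime_stats_interval_id = qsrs.runtime_stats_interval_id
ORDER BY qsrsi.start_time DESC;
```

Because the statistics are stored per interval, you can see how a query behaved over time, even after its plan has left the cache.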

Others

There are other ways to measure query performance. You can use the Profiler GUI, but that can seriously and negatively impact the server. You can bring a server down by using it, so it should be avoided. Profiler generates scripts, called traces, which can be used to monitor your server. However, they have a much higher impact than ExEvents and they’re on the deprecation list. Microsoft is not adding new trace events for new functionality, so they’re becoming less and less useful with each release. You also can’t use trace against Azure. If you’re writing a query and you just want to see how long it takes to run, you can use SET STATISTICS TIME ON to capture the execution time. This is a handy way to quickly measure performance. There is also the ability to capture reads and writes using SET STATISTICS IO ON, but, while this does capture the metrics we need, it adds considerable overhead to the query, skewing performance measurement. This is why I stick to ExEvents when I need an accurate measure.
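For a quick check while writing a query, the pattern is as simple as the sketch below. The query itself is just a stand-in that assumes the WideWorldImporters sample database.

```sql
-- Timing information appears on the Messages tab in SSMS.
SET STATISTICS TIME ON;

SELECT  c.CustomerName
FROM    Sales.Customers AS c
WHERE   c.CustomerCategoryID = 4;

SET STATISTICS TIME OFF;
```

The setting applies to the current session only, so turn it off again when you are finished measuring.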

Conclusion

Honest people can disagree about the best way to capture query performance. I have my preferences as you can see. However, I’m fairly certain that everyone would agree that it’s important to know how to capture performance metrics in order to be able to assert that performance has increased or decreased in a measured fashion. You don’t want to guess at query performance, you want to know.

Jul 18 2016

Common Table Expression, Just a Name

The Common Table Expression (CTE) is a great tool in T-SQL. The CTE provides a mechanism to define a query that can be easily reused over and over within another query. The CTE also provides a mechanism for recursion which, though a little dangerous and overused, is extremely handy for certain types of queries. However, the CTE has a very unfortunate name. Over and over I’ve had to walk people back from the “Table” in Common Table Expression. The CTE is just a query. It’s not a table. It’s not providing a temporary storage space like a table variable or a temporary table. It’s just a query. Think of it more like a temporary view, which is also just a query.

Every time I explain this, there are people who don’t believe me. They point to the “Table” in the name, “See. Says so right there. It’s a table.”

It’s not and I can prove it. Let’s create a relatively simple CTE and use it in a query:

WITH    MyCTE
          AS (SELECT    c.CustomerName,
                        cc.CustomerCategoryName
              FROM      Sales.Customers AS c
              JOIN      Sales.CustomerCategories AS cc
              ON        cc.CustomerCategoryID = c.CustomerCategoryID
              WHERE     c.CustomerCategoryID = 4)
    SELECT  *
    FROM    MyCTE;

Now, I’m going to run the query within the CTE and the CTE together as two statements in a batch and capture the execution plans:

ExecPlans

On the top, the CTE, on the bottom, the query. You’ll note that the execution plans are identical. They each have the exact same Query Plan Hash value in the properties, 0x88EFD2B7C165E667, even though they have different Query Hash values, 0x192FFC125A08CC35 and 0xFEB7F2BCAC853CD5, respectively. Further, if I capture the query metrics using extended events, I get identical reads and, on average, identical execution times:

duration

This is because there is no table being created. The data is not treated differently. A CTE is just a query, not some type of temporary storage.
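If you want to repeat the comparison yourself, a sketch of the two-statement batch and of one way to read the hash values from the DMVs follows. The LIKE filter is my own assumption for picking the statements back out of cache; the hash values on your system will not match the ones quoted above.

```sql
-- Statement 1: the CTE. Statement 2: the query inside it.
WITH    MyCTE
          AS (SELECT    c.CustomerName,
                        cc.CustomerCategoryName
              FROM      Sales.Customers AS c
              JOIN      Sales.CustomerCategories AS cc
              ON        cc.CustomerCategoryID = c.CustomerCategoryID
              WHERE     c.CustomerCategoryID = 4)
    SELECT  *
    FROM    MyCTE;

SELECT  c.CustomerName,
        cc.CustomerCategoryName
FROM    Sales.Customers AS c
JOIN    Sales.CustomerCategories AS cc
        ON cc.CustomerCategoryID = c.CustomerCategoryID
WHERE   c.CustomerCategoryID = 4;
GO

-- Compare query_hash and query_plan_hash for each statement in the batch.
SELECT  SUBSTRING(dest.text,
                  (deqs.statement_start_offset / 2) + 1,
                  ((CASE deqs.statement_end_offset
                      WHEN -1 THEN DATALENGTH(dest.text)
                      ELSE deqs.statement_end_offset
                    END - deqs.statement_start_offset) / 2) + 1) AS statement_text,
        deqs.query_hash,
        deqs.query_plan_hash
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   dest.text LIKE N'%Sales.CustomerCategories%'
        AND dest.text NOT LIKE N'%dm_exec_query_stats%';
```

Two rows with the same query_plan_hash but different query_hash values are what you would expect to see if the CTE is treated as just a query.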

Heck, let’s do one more thing. Let’s use the latest SSMS plan comparison tool and highlight one of the operators to see what differences there are internally in the plan:

 

plancompare

I don’t see a lot of differences. In fact, I don’t see any. That’s because the optimizer recognizes these two queries as identical. If it was loading data into temporary storage, you would see differences in something. We don’t. This is because, despite the somewhat unfortunate emphasis that gets placed on the Table portion of the name, the emphasis of the name, Common Table Expression, should be on the word Expression.

I will point out an interesting difference, especially useful for those who plug in CTEs everywhere, whether they’re needed or not. Let’s look at the properties of the two plans:

peroperties

You can see the similarities and differences that I pointed out earlier in the Statement, Query Hash and Query Plan Hash, as well as the Estimated Subtree Cost and others. What’s truly interesting is that the CompileCPU, CompileMemory and CompileTime for the CTE are higher than for the regular query. While the CTE is just a query, it’s a query that adds a non-zero overhead when used, and therefore, should only be used where appropriate (good gosh, I’ve seen people put it EVERYWHERE, on every single query, don’t do that).

Hopefully, this is enough to establish, truly, completely, and thoroughly, that the Common Table Expression is an expression, not a table.

Yeah, I did this before, but it keeps coming up, so I tried a different approach. Let’s see if the word gets out. Your Common Table Expression is not a table.


I love talking about execution plans and query tuning. I’ll be doing this at an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

Don’t miss your chance to attend an all day training course on execution plans before SQL Saturday Oslo in September.

Jul 11 2016

Same Query, Different Servers, Different Performance. Now What?

Based on the number of times I see this question on forums, it must be occurring all the time. You have two different servers that, as far as you know, are identical in terms of their options and setup (although not necessarily in terms of power, think a test or pre-production system versus production). On these servers you have a database on each that, as far as you know, is the same as the other in terms of options, objects, maybe even data (although, this does mean that you have unmasked production information in your QA environment, which potentially means you’re going to jail, might want to address this, especially now that I’ve told you about it, mens rea, you’re welcome). On each database you run, as far as you know, the exact same query (whether a straight up ad hoc query, stored procedure, parameterized query, whatever). On your production server, let’s call it PROD, the query runs for 3 hours before it returns, but on the test server, let’s call it TEST, it returns in about 3 seconds.

What. The. Hell.

This is especially troubling because PROD has 8 processors that are much more powerful than TEST, which has only 2 (either can go parallel, so this is the same). TEST only has 16GB of RAM and PROD has 64GB. Further, you know that the databases, data, and statistics are the same because you restored the TEST database from PROD (although, see that jail thing above). However, you’re on top of things. You capture the query metrics so you know exactly the amount of time and the reads or writes from each query and you even do this using extended events so you’re minimizing observer effects. You also have monitoring in place, so you know that there is no blocking in PROD. The query is just a pig there, but not on TEST. You’ve even gone so far as to get an estimated execution plan from each server and they are absolutely different.

Now what?

Well obviously something somewhere is different. Start by comparing everything on both servers and both databases down to… hang on, here, we’ll write a PowerShell script and then….

Wait, wait, wait!

You have the execution plans? Before we start digging through all the properties everywhere and comparing everything to everything, updating statistics 14 times, and all the rest, what if we look at the execution plans? They’re different, so we should start looking at scans & indexes & statistics &….

Wait!

Let’s start simple and clear. Let’s take a look at the properties of the first operator:

AllProperties

This contains a ton of information about the settings of the systems where each plan was generated. Let’s assume that, probably, the databases are the same, as is the data and the statistics, but the problem is a system or database setting. These differences can result in different execution plans and therefore different execution times. If we are dealing with two different servers and we are fairly sure the data, structure and the statistics are the same, the properties of the first operator are a great starting point for understanding what went wrong.

Oh, and the first operator is this one with the red square around it:

first operator

Let’s use the new SSMS Compare Showplan to see the differences between our two execution plans that we captured:

properties compared

There are a bunch of differences highlighted, but one ought to jump out pretty quick. That’s right, these two databases have different compatibility levels which resulted in one of them using the new optimizer and one using the old optimizer as evidenced by the CardinalityEstimatorModelVersion. Fun point, both have identical query hash values. Not surprising, but additional, quick, validation that we really are talking about the same query on each server (and one of them isn’t using a trace flag to change the cardinality estimator). You now have enough information to go and make some changes to your system without lots of further fumbling and work.

When dealing with the same query from two servers that ought to result in similar behavior, but doesn’t, get the execution plans (estimated plans are fine here) and compare the properties of the first operator. That’s the quickest way to identify the issues that could be leading to the differences between the servers.
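Once the plan properties point at the compatibility level, confirming it takes one query on each server. This is a sketch only: the database name is a placeholder, and the ALTER statement is left commented out because changing the level affects every query in the database and deserves testing first.

```sql
-- Run on both servers and compare the results.
-- YourDatabase is a hypothetical name.
SELECT  d.name,
        d.compatibility_level
FROM    sys.databases AS d
WHERE   d.name = N'YourDatabase';

-- To align the two (130 is the SQL Server 2016 level):
-- ALTER DATABASE YourDatabase SET COMPATIBILITY_LEVEL = 130;
```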


Want to talk more about execution plans and query tuning? Let’s do it.

In August, I’ll be doing an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

I’m also going to be in Oslo Norway for a pre-con before SQL Saturday Oslo in September.

Jun 28 2016

CASE Statement in GROUP BY

Set based operations means you should put everything into a single statement, right?

Well, not really. People seem to think that having two queries is really bad, so when faced with logical gaps, they just cram them into the query they have. This is partly because SQL Server and T-SQL support letting you do this, and it’s partly because it looks like a logical extension of code reuse to arrive at a query structure that supports multiple logic chains. However, let’s explore what happens when you do this in one particular situation, a CASE statement in a GROUP BY clause.

You see this a lot because a given set of data may be needed in slightly different contexts by different groups within the company. Like many of my example queries, this could be better written. Like many of my example queries, it mirrors what I see in the wild (and for those following along at home, I’m using the WideWorldImporters database for tests now):

CREATE PROCEDURE dbo.InvoiceGrouping (@x INT)
AS
SELECT  SUM(il.UnitPrice),
        COUNT(i.ContactPersonID),
        COUNT(i.AccountsPersonID),
        COUNT(i.SalespersonPersonID)
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY CASE WHEN @x = 7 THEN i.ContactPersonID
              WHEN @x = 15 THEN i.AccountsPersonID
              ELSE i.SalespersonPersonID
         END;
GO

Running this for any given value above, 7, 15 or other, you’ll get the same execution plan, regardless of the column used in the GROUP BY. However, Parameter Sniffing is still something of a factor. When you group this data by SalespersonPersonID, you only get 10 rows back. This will be shown as the estimated number of rows returned if some value other than 7 or 15 is used as a parameter. However, this is always the plan:

coreplan

We can eliminate the Parameter Sniffing from the equation if we want to by modifying the query thus:

CREATE PROCEDURE dbo.InvoiceGrouping_NoSniff (@x INT)
AS
DECLARE @x2 INT;
SET @x2 = @x;

SELECT  SUM(il.UnitPrice),
        COUNT(i.ContactPersonID),
        COUNT(i.AccountsPersonID),
        COUNT(i.SalespersonPersonID)
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY CASE WHEN @x2 = 7 THEN i.ContactPersonID
              WHEN @x2 = 15 THEN i.AccountsPersonID
              ELSE i.SalespersonPersonID
         END;
GO

However, except for some deviation on the estimated rows (since it’s averaging the rows returned), the execution plan is the same.

What’s the big deal right? Well, let’s break down the code into three different procedures:

CREATE PROCEDURE dbo.InvoiceGrouping_Contact
AS
SELECT  SUM(il.UnitPrice),
        COUNT(i.ContactPersonID),
        COUNT(i.AccountsPersonID),
        COUNT(i.SalespersonPersonID)
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY i.ContactPersonID;
GO

CREATE PROCEDURE dbo.InvoiceGrouping_Sales
AS
SELECT  SUM(il.UnitPrice),
        COUNT(i.ContactPersonID),
        COUNT(i.AccountsPersonID),
        COUNT(i.SalespersonPersonID)
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY i.SalespersonPersonID;
GO

CREATE PROCEDURE dbo.InvoiceGrouping_Account
AS
SELECT  SUM(il.UnitPrice),
        COUNT(i.ContactPersonID),
        COUNT(i.AccountsPersonID),
        COUNT(i.SalespersonPersonID)
FROM    Sales.Invoices AS i
JOIN    Sales.InvoiceLines AS il
        ON il.InvoiceID = i.InvoiceID
GROUP BY i.AccountsPersonID;
GO

Interestingly enough, these three queries produce a nearly identical execution plan. The one big difference is that the Compute Scalar operator, which was used to generate a value for the Hash Match Aggregate, is no longer in the plan:

specificplan

The plan uses the same basic set of structures, scans against both tables, to arrive at the data. Cost estimates between the two plans are very different though, with the targeted queries having a much lower estimated cost.

Performance-wise, interestingly enough, the first query, only returning the 10 rows, averages 157ms, while the query grouping directly on SalespersonPersonID averages about 190ms. Now, the reads tell a slightly different story with 17428 on the generic query and 5721 on the specific query. So, maybe a server under load will see a significant performance increase. However, let’s deal with what we have in front of us and say that, at least for these tests, the catch-all GROUP BY query performs well.

Now let’s change the paradigm slightly. Let’s add an index:

CREATE INDEX TestingGroupBy ON Sales.Invoices (SalespersonPersonID);

Frankly, this isn’t a very useful index. However, after adding it, the execution plan for the InvoiceGrouping_Sales query changes. Instead of scanning the table, it’s now scanning the index. Despite recompiles and attempts to force it using hints, the original InvoiceGrouping query will not use this index. Duration of the InvoiceGrouping_Sales query drops to 140ms on average and the reads drop a little further to 5021. Getting an 11% increase on performance is a win.

This is a pretty simplified example. However, making the CASE statement more complex won’t improve performance or further assist the optimizer to make good choices. Instead of trying to cram multiple different logical groupings into a single query, a better approach would be to create the three new procedures that I did above, and make the original InvoiceGrouping procedure into a wrapping procedure that chooses which of the individual procedures to call. This way, if you do add indexes in support of each of the different possible groupings, you would realize a positive outcome in your performance.
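A minimal sketch of that wrapping procedure, assuming the three specific procedures above already exist and keeping the same parameter values as the original CASE:

```sql
-- Rewrite the catch-all procedure as a thin wrapper.
-- Each branch calls a procedure with its own plan,
-- so each grouping can benefit from its own indexes.
ALTER PROCEDURE dbo.InvoiceGrouping (@x INT)
AS
IF @x = 7
    EXEC dbo.InvoiceGrouping_Contact;
ELSE IF @x = 15
    EXEC dbo.InvoiceGrouping_Account;
ELSE
    EXEC dbo.InvoiceGrouping_Sales;
GO
```

Callers keep using dbo.InvoiceGrouping with the same parameter, so nothing outside the database has to change.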


Want to talk more about execution plans and query tuning? In August, I’ll be doing an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

I’m also going to be in Oslo Norway for a pre-con before SQL Saturday Oslo in September.

May 31 2016

Precedence Goes to Query Store or Plan Guide?

While presenting at SQLDay in Wroclaw, Poland, on the Query Store, I was asked a pretty simple question, which takes precedence, the Query Store or a Plan Guide?

One of my favorite answers to questions is “I don’t know” because it gives me the opportunity to learn. Let’s figure this one out together.

I’ll post the code to recreate this experiment within AdventureWorks at the end of the article. I’m doing this because the code for forcing execution plans using Plan Guides can be pretty doggone long (you may need to generate your own XML from a plan on your own system, fair warning).

I have a stored procedure that I use frequently to demonstrate parameter sniffing and bad parameter sniffing, AddressByCity (listed below). This query when passed the value ‘London’ will generate a plan with a Merge Join. When passed the value ‘Mentor’ it will generate a plan with a Loops Join. These different plans are caused by the statistics for the different values. 434 rows for ‘London’ and 1 row for ‘Mentor’. To start the test, I’ll generate a plan for each value, dropping the plan from cache between executions, so that I have both plans in the Query Store. Now, we’re ready for phase 1.

Phase 1

I’m going to remove the current plan from cache. I will then use sp_query_store_force_plan to select the plan with the Loops join. I’ll run the query using ‘London’ to validate that the plan generated is NOT the plan for ‘London’ but rather the plan for ‘Mentor’. Done, and I can see the Use Plan results in the Properties of the Select operator:

QSvPG_UsePlan

I’ve highlighted a couple of interesting points. First, even though we are using the plan that was originally compiled for the value ‘Mentor’, this execution was compiled using the value ‘London’, so ‘London’ is what gets stored as the compiled value, regardless of the plan used. You can then see down at the bottom that the ‘Use plan’ property is set to true. Now, let’s force the plan using a plan guide (in my script below). To be very sure what happens, I’ll again clear the plan from cache.

If the Query Store wins, I should see a plan with a Loops Join. If the Plan Guide wins, I should see a plan with a Merge Join. The moment of truth:

QSvPG_Plan

Query Store wins! Let’s see how it shows in the Properties:

QSvPG_PlanGuideName

You’ll see that the Parameter Compile & Runtime values are the same and that the ‘Use plan’ property is the same. Note however, the PlanGuideName property is in play. In fact, the indication is that the optimizer thinks that a Plan Guide has been successfully applied. In fact, if I watch the Extended Events for the plan_guide_successful event, the best way to check how Plan Guides are being used, apart from looking at the execution plans, it shows success for the Plan Guide we created.
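A sketch of a session for watching those events is below. The session name is made up, and I have used the ring_buffer target simply because it needs no file path; the plan_guide_unsuccessful event is included so that failures show up as well.

```sql
-- Watch plan guide application while the test queries run.
-- PlanGuideCheck is a hypothetical session name.
CREATE EVENT SESSION PlanGuideCheck
ON SERVER
    ADD EVENT sqlserver.plan_guide_successful,
    ADD EVENT sqlserver.plan_guide_unsuccessful
    ADD TARGET package0.ring_buffer;
GO

ALTER EVENT SESSION PlanGuideCheck ON SERVER STATE = START;
GO
```

With the session running, you can use Watch Live Data in SSMS to see the events as each execution compiles.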

Phase 2

Precedence could matter. First, I remove the Plan Guide using sp_control_plan_guide. Then, I remove the Query Store force using sp_query_store_unforce_plan. I’ll also clear the plan out of cache again, just to be sure. Next, I create the Plan Guide. I’ll run the procedure using the value ‘Mentor’ to ensure that I see the Merge/’London’ plan, the PlanGuideName property in that plan, and a successful application of a Plan Guide from the plan_guide_successful ExEvent. All good.

Now, I once again force the plan from Query Store. Rerun the plan…

The winner is Query Store.

Phase 3

OK, so, if we try to force plans using Plan Guide, Query Store wins. What about if I’m just trying to apply a hint? Let’s clear everything up again and then create an OPTIMIZE FOR query hint in a Plan Guide. Simple right?

Query Store wins again… Except things get a little odd. Take a look at the Properties now:

QSvPG_Parameter

Normally, with a Plan Guide that includes an OPTIMIZE FOR hint, I’d see the Compiled Value as the defined value within the hint (in this case ‘London’). However, this time I’m seeing just the value of ‘Mentor’.
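For reference, a plan guide of the kind described in this phase could be created along these lines. This is a sketch, not the exact script I ran: the guide name is hypothetical, and the @stmt text has to match the statement inside the procedure, so adjust it if your copy of the procedure is formatted differently.

```sql
-- Attach an OPTIMIZE FOR hint to the statement inside dbo.AddressByCity.
-- PlanGuideOptimizeFor is a hypothetical name.
EXEC sys.sp_create_plan_guide
    @name = N'PlanGuideOptimizeFor',
    @stmt = N'SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;',
    @type = N'OBJECT',
    @module_or_batch = N'dbo.AddressByCity',
    @params = NULL,
    @hints = N'OPTION (OPTIMIZE FOR (@City = N''London''))';
```

The guide can be removed afterwards with sp_control_plan_guide, as shown for the other guide in the scripts at the end.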

Conclusion

I don’t know what’s occurring in the internals. I just know the evidence I have in front of me. First, Query Store vs. Plan Guides? Query Store wins. However, internally, something is still going on. The Plan Guide is still successfully applied, at least according to what I can see. The processes that determine if a Plan Guide should be applied or not are not interfered with by the Query Store plan forcing. The fact that I’m seeing a Query Store plan and evidence of a successful Plan Guide though, that’s a problem that could cause issues when troubleshooting. Now you know, and knowing, as they say, is half the battle.
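Since both mechanisms can be in play at once, it helps when troubleshooting to check for each of them explicitly. A minimal sketch, using only the catalog views:

```sql
-- Plans currently forced through Query Store in this database.
SELECT  qsp.query_id,
        qsp.plan_id,
        qsp.is_forced_plan,
        qsp.force_failure_count,
        qsp.last_force_failure_reason_desc
FROM    sys.query_store_plan AS qsp
WHERE   qsp.is_forced_plan = 1;

-- Plan guides defined in this database.
SELECT  pg.name,
        pg.is_disabled,
        pg.scope_type_desc,
        pg.hints
FROM    sys.plan_guides AS pg;
```

If a query shows up in both result sets, the tests above suggest the Query Store plan is the one you will actually get.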


Let’s get together and talk about execution plans. In August, I’ll be doing an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

I’m extremely excited to be in Oslo Norway for a pre-con before SQL Saturday Oslo in September.


/*NOTE: These scripts are not in the exact order in which
they must be run to re-create all my tests. These are
however, all the scripts you'll need.
*/
--the procedure
CREATE PROC dbo.AddressByCity @City NVARCHAR(30)
AS
SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;
GO

--generating the first plan
EXEC dbo.AddressByCity
    @City = N'London';



--remove this from cache
DECLARE @PlanHandle VARBINARY(64);

SELECT  @PlanHandle = deps.plan_handle
FROM    sys.dm_exec_procedure_stats AS deps
WHERE   deps.object_id = OBJECT_ID('dbo.AddressByCity');

IF @PlanHandle IS NOT NULL
    BEGIN
        DBCC FREEPROCCACHE(@PlanHandle);
    END
GO



--generate the second plan
EXEC dbo.AddressByCity
    @City = N'Mentor';




--get the query & plan id
SELECT  qsq.query_id,
        qsp.plan_id
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsq.object_id = OBJECT_ID('dbo.AddressByCity');


--force the plan
--(414 and 460 are the query_id and plan_id returned on my system; substitute your own)
EXEC sys.sp_query_store_force_plan 414,460;

--unforce the plan
EXEC sys.sp_query_store_unforce_plan 414,460;




--Get the plan_handle and offset 
SELECT  deqs.plan_handle,
        deqs.statement_start_offset
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   dest.objectid = OBJECT_ID('dbo.AddressByCity');

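--create a plan guide
--(the plan_handle and offset below are from my system; use the values returned by the query above)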
EXEC sys.sp_create_plan_guide_from_handle
    @name = N'PlanGuidePrecedence',
    @plan_handle = 0x05000500C26A7610A03A4F53A600000001000000000000000000000000000000000000000000000000000000,
    @statement_start_offset = 108;

--remove plan guide
EXEC sys.sp_control_plan_guide
    @operation = N'DROP',
    @name = N'PlanGuidePrecedence';


--pull a plan out of cache. You'll need to make sure the right plan is there first
DECLARE @xmlplan NVARCHAR(MAX);

SELECT  @xmlplan = deqp.query_plan
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_text_query_plan(deqs.plan_handle,DEFAULT,DEFAULT) AS deqp
WHERE   deqp.objectid = OBJECT_ID('dbo.AddressByCity');

SELECT @xmlplan;
GO


--copied and pasted the value from @xmlplan for easy reuse
DECLARE @xmlplan NVARCHAR(MAX) =
'<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.5" Build="13.0.1300.275"><BatchSequence><Batch><Statements><StmtSimple StatementText="CREATE PROC dbo.AddressByCity @City NVARCHAR(30)
AS
SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City" StatementId="1" StatementCompId="3" StatementType="SELECT" StatementSqlHandle="0x0900D56A5A43E8B56B541DFF8DC13F149C6A0000000000000000000000000000000000000000000000000000" DatabaseContextSettingsId="1" ParentObjectId="276196034" StatementParameterizationType="0" RetrievedFromCache="true" StatementSubTreeCost="0.316799" StatementEstRows="434" SecurityPolicyApplied="false" StatementOptmLevel="FULL" QueryHash="0xDD75E124763781F2" QueryPlanHash="0x6C408B64C5E8A83F" StatementOptmEarlyAbortReason="GoodEnoughPlanFound" CardinalityEstimationModelVersion="120"><StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false"></StatementSetOptions><QueryPlan CachedPlanSize="32" CompileTime="3" CompileCPU="3" CompileMemory="360"><MissingIndexes><MissingIndexGroup Impact="89.7983"><MissingIndex Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]"><ColumnGroup Usage="EQUALITY"><Column Name="[City]" ColumnId="4"></Column></ColumnGroup></MissingIndex></MissingIndexGroup></MissingIndexes><MemoryGrantInfo SerialRequiredMemory="512" SerialDesiredMemory="720"></MemoryGrantInfo><OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="260900" EstimatedPagesCached="65225" EstimatedAvailableDegreeOfParallelism="2"></OptimizerHardwareDependentProperties><RelOp NodeId="0" PhysicalOp="Merge Join" LogicalOp="Inner Join" EstimateRows="434" EstimateIO="0" EstimateCPU="0.0069349" AvgRowSize="225" EstimatedTotalSubtreeCost="0.316799" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row"><OutputList><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine1"></ColumnReference><ColumnReference 
Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine2"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="PostalCode"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="Name"></ColumnReference></OutputList><Merge ManyToMany="0"><InnerSideJoinColumns><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference></InnerSideJoinColumns><OuterSideJoinColumns><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="StateProvinceID"></ColumnReference></OuterSideJoinColumns><Residual><ScalarOperator ScalarString="[AdventureWorks2014].[Person].[StateProvince].[StateProvinceID] as [sp].[StateProvinceID]=[AdventureWorks2014].[Person].[Address].[StateProvinceID] as [a].[StateProvinceID]"><Compare CompareOp="EQ"><ScalarOperator><Identifier><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="StateProvinceID"></ColumnReference></Identifier></ScalarOperator><ScalarOperator><Identifier><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference></Identifier></ScalarOperator></Compare></ScalarOperator></Residual><RelOp NodeId="1" PhysicalOp="Clustered Index Scan" LogicalOp="Clustered Index Scan" EstimateRows="181" EstimateIO="0.00386574" EstimateCPU="0.0003561" AvgRowSize="65" EstimatedTotalSubtreeCost="0.00422184" TableCardinality="181" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row"><OutputList><ColumnReference Database="[AdventureWorks2014]" 
Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="StateProvinceID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="Name"></ColumnReference></OutputList><IndexScan Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" ForceScan="0" NoExpandHint="0" Storage="RowStore"><DefinedValues><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="StateProvinceID"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Alias="[sp]" Column="Name"></ColumnReference></DefinedValue></DefinedValues><Object Database="[AdventureWorks2014]" Schema="[Person]" Table="[StateProvince]" Index="[PK_StateProvince_StateProvinceID]" Alias="[sp]" IndexKind="Clustered" Storage="RowStore"></Object></IndexScan></RelOp><RelOp NodeId="2" PhysicalOp="Sort" LogicalOp="Sort" EstimateRows="434" EstimateIO="0.0112613" EstimateCPU="0.0060321" AvgRowSize="177" EstimatedTotalSubtreeCost="0.30564" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row"><OutputList><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine1"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine2"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" 
Column="PostalCode"></ColumnReference></OutputList><MemoryFractions Input="1" Output="1"></MemoryFractions><Sort Distinct="0"><OrderBy><OrderByColumn Ascending="1"><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference></OrderByColumn></OrderBy><RelOp NodeId="3" PhysicalOp="Clustered Index Scan" LogicalOp="Clustered Index Scan" EstimateRows="434" EstimateIO="0.257199" EstimateCPU="0.0217324" AvgRowSize="177" EstimatedTotalSubtreeCost="0.278931" TableCardinality="19614" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row"><OutputList><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine1"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine2"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="PostalCode"></ColumnReference></OutputList><IndexScan Ordered="0" ForcedIndex="0" ForceScan="0" NoExpandHint="0" Storage="RowStore"><DefinedValues><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressID"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="AddressLine1"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" 
Table="[Address]" Alias="[a]" Column="AddressLine2"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="StateProvinceID"></ColumnReference></DefinedValue><DefinedValue><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="PostalCode"></ColumnReference></DefinedValue></DefinedValues><Object Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Index="[PK_Address_AddressID]" Alias="[a]" IndexKind="Clustered" Storage="RowStore"></Object><Predicate><ScalarOperator ScalarString="[AdventureWorks2014].[Person].[Address].[City] as [a].[City]=[@City]"><Compare CompareOp="EQ"><ScalarOperator><Identifier><ColumnReference Database="[AdventureWorks2014]" Schema="[Person]" Table="[Address]" Alias="[a]" Column="City"></ColumnReference></Identifier></ScalarOperator><ScalarOperator><Identifier><ColumnReference Column="@City"></ColumnReference></Identifier></ScalarOperator></Compare></ScalarOperator></Predicate></IndexScan></RelOp></Sort></RelOp></Merge></RelOp><ParameterList><ColumnReference Column="@City" ParameterCompiledValue="N'London'"></ColumnReference></ParameterList></QueryPlan></StmtSimple></Statements></Batch></BatchSequence></ShowPlanXML>'


EXEC sys.sp_create_plan_guide
    @name = N'PlanGuidePrecedence',
    @stmt = N'SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;',
    @type = N'OBJECT',
    @module_or_batch = N'dbo.AddressByCity',
    @params = NULL,
    @hints = @xmlplan;


--create a plan guide with just a hint in it
EXEC sys.sp_create_plan_guide @name = 'PlanGuidePrecedence', -- sysname
    @stmt = N'SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;', -- nvarchar(max)
    @type = N'Object', -- nvarchar(60)
    @module_or_batch = N'dbo.AddressByCity', -- nvarchar(max)
    @params = NULL, -- nvarchar(max)
    @hints = N'OPTION(OPTIMIZE FOR(@City = ''London''))'; -- nvarchar(max)
May 16 2016

Query Store, Forced Plans, and New Plans

I love questions. I recently received one about new plans in the Query Store (available in Azure SQL Database now and in SQL Server 2016 after June 1).

Let’s say you have selected a plan that you want to force. You set it up. Now, let’s say the plan ages out of cache or even goes through a recompile. During the recompile, due to out of date statistics or skew in the statistics, you would, under normal circumstances, get a new plan. However, with Query Store and plan forcing, the plan that’s going to be used is the plan that is being forced. But, does that other plan, the one not used, get stored in Query Store?

I have no idea. Let’s find out.

The Setup

To start with, a small stored procedure that I use all the time for bad parameter sniffing demos that reliably gets different plans with different values due to statistics skew:

CREATE PROC dbo.spAddressByCity @City NVARCHAR(30)
AS
SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;

If this procedure is called with a value of ‘London’ it gets a plan with a Merge Join. For most other values it gets a plan with a Loops Join. Here’s an example of the ‘London’ plan:

2016-02-22_10-38-43

I’ve run both queries on my test system. I can check their existence in Query Store this way:

SELECT  CAST(qsp.query_plan AS XML),
		qsq.query_id,
		qsp.plan_id,
		qsp.is_forced_plan
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsq.object_id = OBJECT_ID('dbo.spAddressByCity');

That’s going to return two (2) rows:

newplanresults

Now, I’m going to remove the second plan, the ‘Mentor’ plan, and I’m going to force the ‘London’ plan with two statements:

EXEC sys.sp_query_store_remove_plan @plan_id = 219;

EXEC sys.sp_query_store_force_plan @query_id = 2, @plan_id = 2;

Now, when I execute the query, no matter what values I pass, I’m going to arrive at the ‘London’ plan. However, let’s pull the plan from cache in order to force a recompile:

DECLARE @PlanHandle varbinary(64);

SELECT  @PlanHandle = deps.plan_handle
FROM    sys.dm_exec_procedure_stats AS deps
WHERE   deps.object_id = OBJECT_ID('dbo.spAddressByCity');

DBCC FREEPROCCACHE(@PlanHandle);

The Reveal

When I execute the query using the value of ‘Mentor’, do I get that plan stored in the Query Store? I know which plan will be in the cache: the ‘London’ plan, because I have chosen to force that plan. Here’s what I see when I query the Query Store again to see what’s there for my stored procedure:

newplannegativeresults

The short answer is, nope. Any possible new plans that could have been stored in Query Store are not there. Yes, we have gone into the compile process, but, due to plan forcing, that process gets a small bypass and the plan is simply applied from the Query Store. This is true even though you’re going to see a compile or recompile event if you capture them using Extended Events. For more fun on recompiles and the Query Store, see this earlier post of mine.
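One way to double-check what the Query Store is holding after a test like this is to look at the forcing columns on sys.query_store_plan. What follows is a minimal sketch, assuming the same dbo.spAddressByCity procedure. The query_id and plan_id values of 2 in the last statement are the ones from the example above and will almost certainly be different on your system.

```sql
-- List every plan Query Store holds for the procedure, with forcing status.
SELECT  qsp.query_id,
        qsp.plan_id,
        qsp.is_forced_plan,
        qsp.force_failure_count,
        qsp.last_force_failure_reason_desc,
        qsp.count_compiles,
        qsp.last_compile_start_time
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsq.object_id = OBJECT_ID('dbo.spAddressByCity');

-- When the test is done, stop forcing the plan.
-- Assumption: these IDs are from the example above; substitute your own.
EXEC sys.sp_query_store_unforce_plan @query_id = 2, @plan_id = 2;
```

If a new plan had been captured during the recompile, it would show up as an extra row in the first result set.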

Keep those questions coming.


I love talking about query tuning. If you want to spend the day with me talking query tuning, you sure can. In August, I’ll be doing an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

I’ll be doing a pre-conference seminar in Oslo Norway before SQL Saturday Oslo in September.

May 03 2016

Implicit Conversion and Performance

Letting SQL Server change data types automatically can seriously hurt performance. Because the conversion has to be run against the column value in every row, you can’t get an index seek. Instead, you’re forced to use a scan. I can demonstrate this pretty simply. Here’s a script that sets up a test table with three columns and three indexes and tosses a million rows in:

CREATE TABLE dbo.ConvertTest (
     BigIntColumn BIGINT NOT NULL,
     IntColumn INT NOT NULL,
     DateColumn VARCHAR(30)
    );

CREATE INDEX BigIntIndex 
ON dbo.ConvertTest 
(BigIntColumn);
CREATE INDEX IntIndex 
ON dbo.ConvertTest 
(IntColumn);
CREATE INDEX DateIndex 
ON dbo.ConvertTest
(DateColumn);

WITH    Nums
          AS (SELECT TOP (1000000)
                        ROW_NUMBER() OVER (ORDER BY (SELECT 1
                                                    )) AS n
              FROM      master.sys.all_columns ac1
              CROSS JOIN master.sys.all_columns ac2
             )
    INSERT  INTO dbo.ConvertTest
            (BigIntColumn,
             IntColumn,
             DateColumn
            )
    SELECT  Nums.n,
            Nums.n,
            DATEADD(HOUR, Nums.n, '1/1/1900')
    FROM    Nums;

If I run a very simple query like this:

DECLARE @param DATETIME;

SET @param = '3/15/1963 8:00:00';

SELECT  ct.DateColumn
FROM    dbo.ConvertTest AS ct
WHERE   ct.DateColumn = @param;

I get an execution plan that looks like this:

convert_scan

The issue is, while I put dates into the DateColumn, the data type there is VARCHAR, not DATETIME. Passing a DATETIME value results in an implicit conversion that we can see in the Index Scan Predicate property:

ConverImplicitHurtsPerformance

This is a function against a column and that is what results in the scan. If I modify the code as follows:

DECLARE @param VARCHAR(30);

SET @param = '3/15/1963 8:00:00';

SELECT  ct.DateColumn
FROM    dbo.ConvertTest AS ct
WHERE   ct.DateColumn = @param;

I get an execution plan that looks like this:

convert_seek

No change in data type is needed, so the index can be used. It’s a very simple demonstration of how implicit conversions hurt performance. Now, what about the integer columns? What happens when I do this with the code:

DECLARE @param INT;

SET @param = 650323;

SELECT  ct.BigIntColumn
FROM    dbo.ConvertTest AS ct
WHERE   ct.BigIntColumn = @param;

Here’s the execution plan:

convert_intseek

And here is the predicate:

convert_predicate

In this case, we’re still getting an implicit conversion of the data type, but SQL Server is smart enough to realize that, hey, it’s still just an integer, so it can convert the @param instead of the column, resulting in a seek. If I change the test so that it runs a query against the INT column passing a BIGINT value, it doesn’t even do a conversion of any kind.

In short, not all implicit conversions are a problem. However, it’s still a best practice to use the right data type for columns. You should also use the same data type for your parameters and local variables as the column data type. Taking these steps avoids issues with implicit conversion.
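When you can’t change the column’s data type, one option is to do the conversion yourself, once, on the variable side. This is only a sketch. It assumes the strings in DateColumn were produced by SQL Server’s default DATETIME to VARCHAR conversion, which is what the load script above relies on, and the @search variable is a name made up for the illustration.

```sql
DECLARE @param DATETIME;
DECLARE @search VARCHAR(30);

SET @param = '3/15/1963 8:00:00';

-- Convert the variable once, up front, so the column is compared
-- in its own data type and no function has to run against it.
-- Assumption: the default conversion style matches how the rows were loaded.
SET @search = CONVERT(VARCHAR(30), @param);

SELECT  ct.DateColumn
FROM    dbo.ConvertTest AS ct
WHERE   ct.DateColumn = @search;
```

Check the execution plan on your own system to confirm that the predicate no longer wraps the column in a conversion.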


For lots more information on query tuning, in August, I’ll be doing an all day pre-con at SQLServer Geeks Annual Summit in Bangalore India.

I’m also going to be doing an all day training course before SQL Saturday Oslo in September.

Feb 24 2016

Query Store and Recompile

One of the many advantages of SQL Cruise is the ability to have enough time during a presentation to be able to answer questions from the people there in great detail. One question came up while I was showing the new functionality of Query Store (available soon in SQL Server 2016, available right now in Azure SQL Database).

What happens to plan forcing when you have OPTION RECOMPILE on a query?

Great question. I have a favorite procedure I use to illustrate the functionality of parameter sniffing:

ALTER PROC dbo.spAddressByCity @City NVARCHAR(30)
AS
SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;

If this procedure is called with the value of ‘Mentor’ you get an execution plan that looks like this:

2016-02-22_10-48-00

If you remove that plan from cache and then call the procedure with the value of ‘London’ then the plan looks like this:

2016-02-22_10-38-43

It’s parameter sniffing in action. One of the cool pieces of functionality that comes with the Query Store is the ability to choose a plan and make that plan get used, regardless of the parameter values. You do this through the following command:

sys.sp_query_store_force_plan

You pass it a query_id and a plan_id from the information in the Query Store and now that plan will get used. But what happens when you have attempted to fix your query using a query hint, such as RECOMPILE? Let’s modify our procedure:

ALTER PROC dbo.spAddressByCity @City NVARCHAR(30)
AS
SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City
OPTION  (RECOMPILE);

If I run the procedure, regardless of the parameter values passed in, I get a new plan for each value. Nothing is stored in cache, but the plans are stored in the Query Store.

Let’s pull the query_id and plan_id out of Query Store for this procedure:

SELECT  qsq.query_id,
        qsp.plan_id,
        CAST(qsp.query_plan AS XML) AS sqlplan
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsq.object_id = OBJECT_ID('dbo.spAddressByCity');

Currently, on my system, this brings back four rows, with two distinct query_id values and four different plan_id values. This is because, when I modified the text of the query within the procedure, I kept the object_id the same, but the query text is different (the RECOMPILE hint). This means a different record in the Query Store. We’ll use the newer query_id and pick one of the plans to force:

EXEC sys.sp_query_store_force_plan
    @query_id = 42,
    @plan_id = 44;

Now, no matter what value I pass it, the RECOMPILE hint is effectively ignored. I have the same plan every time (in my case, the London plan). Ah, but is the RECOMPILE hint ignored? In fact, no, it isn’t. If I capture the extended event sql_statement_recompile, I can see every single time I execute this procedure that it’s going through a full recompile… which is then tossed and the plan I chose is forced upon the optimizer. Every time.

This isn’t a revelation. It makes sense. However, the Query Store represents one more thing that we have to think through.
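If you want to watch those recompiles happen, an Extended Events session along these lines should do it. Treat it as a sketch: the session name RecompileWatch is made up, the filter assumes the AdventureWorks2014 database used in these examples, and on Azure SQL Database you’d create the session ON DATABASE instead of ON SERVER.

```sql
-- Capture statement-level recompiles for one database.
CREATE EVENT SESSION RecompileWatch
ON SERVER
    ADD EVENT sqlserver.sql_statement_recompile
    (
        ACTION (sqlserver.sql_text)
        WHERE (sqlserver.database_name = N'AdventureWorks2014')
    )
    ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE = OFF);
GO

ALTER EVENT SESSION RecompileWatch ON SERVER STATE = START;
GO

-- Run the procedure a few times and watch the live data in
-- Management Studio before running the cleanup below.
ALTER EVENT SESSION RecompileWatch ON SERVER STATE = STOP;
DROP EVENT SESSION RecompileWatch ON SERVER;
```

The ring buffer keeps everything in memory, which is fine for a quick test. For anything longer running, a file target is probably the better choice.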


If you want to talk query tuning, execution plans, the Query Store, and more, I’ll be presenting an all day pre-conference seminar at SQLDay in Wroclaw Poland on May 16th, 2016. Let’s talk.

Feb 22 2016

Why Is The Server Slow?

This is blog post #2 in support of Tim Ford’s (b|t) #iwanttohelp, #entrylevel.

If you haven’t been working in SQL Server for very long, you may not have got this phone call yet, but you will:

Hi, yeah, the server is slow. Thanks. Bye.

Let’s pretend for a moment that you know which server they’re referring to (because just finding out that piece of information can be a challenge). Now what?

The list of tools and mechanisms within SQL Server for gathering metrics is extremely long:

Performance Monitor
Dynamic Management Views & Functions
System Views
Extended Events
Trace Events
Activity Monitor
Data Collector
Execution Plans
3rd Party Tools

I’m leaving out lots of stuff in that list. So where do you start when you get this phone call? Where is the server slow?

The best place to start is by looking at the wait statistics.

A simple, and simplified, explanation for how processes work within SQL Server is that each process gets access to the various resources for a little while, then has to pass off access to another process, each of them working together to get the work done. But, some processes take longer than others. When a process has to wait for anything in SQL Server, this information gets logged. Depending on the system you’re working with you can access these wait statistics from one of three locations:

sys.dm_os_wait_stats: for SQL Server
sys.dm_db_wait_stats: for Azure SQL Database
sys.dm_pdw_nodes_os_wait_stats: for Azure SQL Data Warehouse

Getting information from these system views is extremely simple. Here’s an example query:

SELECT  *
FROM    sys.dm_os_wait_stats AS dows
ORDER BY dows.wait_time_ms DESC;

The results of this query look like this:

waits

The first column lists the wait types. The second column provides a count of the tasks that have had that wait type. The third column, wait_time_ms, is the cumulative amount of time, in milliseconds, that tasks have been waiting within the system. Next is the maximum time that any one task has waited, max_wait_time_ms. Finally, in signal_wait_time_ms, you see the time that tasks have had to wait for access to the CPU (also known as time spent on the Runnable queue). The importance of a wait is not simply the total time it has accumulated, but also the number of tasks and the max time. Using all these values gives meaning to the individual wait.

Here is where things actually get difficult. The wait types are arcane, difficult to understand, and difficult to interpret. Further, a lot of the wait types actually don’t mean anything at all. The waits are not indicative of an issue. So, while the query above is simple, the results it provides are very weak. Instead, I strongly recommend you use the query provided by Paul Randal, located here. It will filter out the wait statistics that you shouldn’t care about.

Now, you have a meaningful list of wait statistics that will tell you exactly why, if not where, your server is running slow. Unfortunately, these waits still need to be interpreted. If you read further on Paul’s blog, you’ll see he has a number of waits and their causes documented. That’s your best bet to start understanding what’s happening on your system (although, I hear, Paul might be creating a more complete database of wait stats. I’ll update this blog post should that become available).

One other thing to consider. These waits are cumulative since the last time the server was started (or failed over, or the database was failed over in Azure, or the values were reset). This means that simply looking at the list doesn’t necessarily give you enough information. Instead, capture the results more than once during the day and compare the data sets to see what’s been slow over that period. You can also use sys.dm_exec_session_wait_stats to see what any given session, and so a given user or process, is experiencing. Don’t just look at the list and think you’re done. The best thing to do is get to a point of proactive monitoring (for this, monitoring tools make it easier).
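Comparing two samples can be as simple as the following sketch. The temporary table name and the five minute delay are arbitrary choices for the illustration, and it uses sys.dm_os_wait_stats, so it’s for SQL Server rather than Azure SQL Database.

```sql
-- First sample: save the current cumulative values.
SELECT  dows.wait_type,
        dows.waiting_tasks_count,
        dows.wait_time_ms,
        dows.signal_wait_time_ms
INTO    #WaitsBefore
FROM    sys.dm_os_wait_stats AS dows;

-- Let some work happen. Five minutes is an arbitrary interval.
WAITFOR DELAY '00:05:00';

-- Second sample: subtract the first to see only what accumulated in between.
SELECT  dows.wait_type,
        dows.waiting_tasks_count - wb.waiting_tasks_count AS waiting_tasks_delta,
        dows.wait_time_ms - wb.wait_time_ms AS wait_time_ms_delta,
        dows.signal_wait_time_ms - wb.signal_wait_time_ms AS signal_wait_time_ms_delta
FROM    sys.dm_os_wait_stats AS dows
JOIN    #WaitsBefore AS wb
        ON wb.wait_type = dows.wait_type
WHERE   dows.wait_time_ms - wb.wait_time_ms > 0
ORDER BY wait_time_ms_delta DESC;

DROP TABLE #WaitsBefore;
```

This doesn’t filter out the benign wait types, so the same caution about interpreting the results applies here.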

So, when you get the phone call that says the server is slow, you know how to get started understanding exactly why that may be the truth.

Jan 25 2016

Finding Your Query in Query Store

Query Store is pretty amazing. I’m loving working with it. I think it’s likely to change how query tuning will be done in the future. Lots of people are probably going to just use the reports and tools in SQL Server Management Studio. However, a pretty healthy chunk of us will start using the system views in order to programmatically access the information stored in Query Store. One of the first things you’re going to want to do is track down your query.

The primary views you’ll want are sys.query_store_query and sys.query_store_query_text. They join together based on the query_text_id. Let’s take four scenarios and see if we can retrieve the correct query. First up, an ad hoc query:

SELECT  e.NationalIDNumber,
        p.LastName,
        p.FirstName,
        a.City,
        bea.AddressTypeID
FROM    HumanResources.Employee AS e
JOIN    Person.BusinessEntityAddress AS bea
        ON bea.BusinessEntityID = e.BusinessEntityID
JOIN    Person.Address AS a
        ON a.AddressID = bea.AddressID
JOIN    Person.Person AS p
        ON p.BusinessEntityID = e.BusinessEntityID
WHERE   p.LastName = 'Hamilton';

If we wanted to retrieve this from the Query Store in AdventureWorks2014, we’d run a query like this:

SELECT * FROM sys.query_store_query_text AS qsqt
WHERE qsqt.query_sql_text = 'SELECT  e.NationalIDNumber,
        p.LastName,
        p.FirstName,
        a.City,
        bea.AddressTypeID
FROM    HumanResources.Employee AS e
JOIN    Person.BusinessEntityAddress AS bea
        ON bea.BusinessEntityID = e.BusinessEntityID
JOIN    Person.Address AS a
        ON a.AddressID = bea.AddressID
JOIN    Person.Person AS p
        ON p.BusinessEntityID = e.BusinessEntityID
WHERE   p.LastName = ''Hamilton''';

Of note, I had to drop the statement terminator from the text of the query, the semi-colon, in order to retrieve the correct query. That’ll be good to know in a moment. This retrieves the query information I requested, just fine.
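Once you have the row from sys.query_store_query_text, the query_text_id is your way into the rest of the Query Store. Here’s a rough sketch of that join. The @QueryTextId variable and its value are placeholders for whatever the query above returns on your system.

```sql
-- Assumption: replace 1 with the query_text_id returned for your query.
DECLARE @QueryTextId BIGINT = 1;

SELECT  qsq.query_id,
        qsq.query_hash,
        qsp.plan_id,
        qsqt.query_sql_text
FROM    sys.query_store_query_text AS qsqt
JOIN    sys.query_store_query AS qsq
        ON qsq.query_text_id = qsqt.query_text_id
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsqt.query_text_id = @QueryTextId;
```

You get one row per plan, so a query that has compiled to more than one plan will show up more than once.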

For our next example, let’s simplify things a whole bunch:

SELECT  *
FROM    Production.BillOfMaterials AS bom
WHERE   bom.BillOfMaterialsID = 2363;

If I then attempt to retrieve the information from Query Store like this:

SELECT  *
FROM    sys.query_store_query_text AS qsqt
WHERE   qsqt.query_sql_text = 'SELECT  *
FROM    Production.BillOfMaterials AS bom
WHERE   bom.BillOfMaterialsID = 2363';

It actually doesn’t work. Note, I took off the statement terminator, just like before. In fact, the problem here can be identified if we look at the T-SQL from the execution plan from the second example:

SELECT * FROM [Production].[BillOfMaterials] [bom] WHERE [bom].[BillOfMaterialsID]=@1

This query has gone through simple parameterization. So, in order to retrieve the information from Query Store, we have a function, sys.fn_stmt_sql_handle_from_sql_stmt, that we have to incorporate like this:

SELECT  qsqt.*
FROM    sys.query_store_query_text AS qsqt
JOIN    sys.query_store_query AS qsq
        ON qsq.query_text_id = qsqt.query_text_id
CROSS APPLY sys.fn_stmt_sql_handle_from_sql_stmt('SELECT  *
FROM    Production.BillOfMaterials AS bom
WHERE   bom.BillOfMaterialsID = 2363;', qsq.query_parameterization_type) AS fsshfss
-- keep only the text whose handle matches the one the function calculates
WHERE   qsqt.statement_sql_handle = fsshfss.statement_sql_handle;

And note, I have the statement terminator back in place, but this function takes that into account. I joined to the query_store_query table in order to get the parameterization type value. With the function figuring out the statement handle based on the text of the query as I originally ran it, everything is hunky dory.

Let’s look at one more example:

EXEC dbo.spAddressByCity
    @City = N'London';

This is a stored procedure, so I can just do this:

SELECT  qsqt.*
FROM    sys.query_store_query_text AS qsqt
JOIN    sys.query_store_query AS qsq
        ON qsq.query_text_id = qsqt.query_text_id
WHERE   qsq.object_id = OBJECT_ID('dbo.spAddressByCity');

However, you may have multi-statement stored procs, so you might want to query based on the text within the procedure like this:

SELECT  *
FROM    sys.query_store_query_text AS qsqt
WHERE   qsqt.query_sql_text = 'SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City'

Bad news. That doesn’t work. If you look at the qsqt.query_sql_text value returned by the previous Query Store query, the one using the OBJECT_ID function, you’ll see that the query looks like this within the Query Store:

(@City nvarchar(30))SELECT  a.AddressID,          a.AddressLine1,          a.AddressLine2,          a.City,          sp.Name AS StateProvinceName,          a.PostalCode  FROM    Person.Address AS a  JOIN    Person.StateProvince AS sp          ON a.StateProvinceID = sp.StateProvinceID  WHERE   a.City = @City

But, if you look at the documentation for sys.fn_stmt_sql_handle_from_sql_stmt that I linked above, it only works with simple or forced parameterization (this, despite having options for none and user, 0 & 1 respectively, in the documents). That means you may have to resort to using LIKE to retrieve particular statements:

SELECT  qsqt.*
FROM    sys.query_store_query_text AS qsqt
WHERE qsqt.query_sql_text LIKE '%SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City%';

At least, that’s where we’re at with the current, public, CTP.

Tracking down your query can be a little bit of work and hopefully these tips will make it a little easier.