Sep 18 2013

Finding Ad Hoc Queries with Query Hash

I was presenting a session on how to read execution plans when I received a question: do you have a specific example of how you can use the query hash to identify similar query plans? I do, but I couldn’t show it right then, so the person asking requested this blog post.

If you’re dealing with lots of application-generated, dynamic, or ad hoc T-SQL queries, then attempting to determine tuning opportunities, missing indexes, incorrect structures, and the like becomes much more difficult because you don’t have a single place to go to see what’s happening. Each ad hoc query looks different… or does it? Introduced in SQL Server 2008 and exposed through the standard Dynamic Management Objects (DMOs), the query hash gives us a mechanism for identifying ad hoc queries that are similar in structure.

Query hash values are available in the following DMOs: sys.dm_exec_requests and sys.dm_exec_query_stats. Those two cover pretty much everything you need: what’s executing right now and what has executed recently (or rather, what was recently executed and is still in cache; if a query isn’t in cache, you won’t see it). The query hash value itself is nothing other than the output of a hash function, a formula that produces a value based on its input; for the same, or structurally similar, input, you get the same value. There’s also a query_plan_hash value, which is a hash of the execution plan.
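As a quick sketch of the first of those, something like this should show the hash for anything executing right now (skipping the session running the query itself):

SELECT  der.session_id ,
        der.query_hash ,
        der.query_plan_hash ,
        dest.text
FROM    sys.dm_exec_requests AS der
        CROSS APPLY sys.dm_exec_sql_text(der.sql_handle) AS dest
WHERE   der.session_id <> @@SPID; --ignore this session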

Let’s see this in action. Here is a query:

SELECT  soh.AccountNumber ,
        soh.DueDate ,
        sod.OrderQty ,
        p.Name
FROM    Sales.SalesOrderHeader AS soh
        JOIN Sales.SalesOrderDetail AS sod ON sod.SalesOrderID = soh.SalesOrderID
        JOIN Production.Product AS p ON p.ProductID = sod.ProductID
WHERE   p.Name LIKE 'Flat%';

And if I modify it just a little, like you might with dynamically generated code:

SELECT  soh.AccountNumber ,
        soh.DueDate ,
        sod.OrderQty ,
        p.Name
FROM    Sales.SalesOrderHeader AS soh
        JOIN Sales.SalesOrderDetail AS sod ON sod.SalesOrderID = soh.SalesOrderID
        JOIN Production.Product AS p ON p.ProductID = sod.ProductID
WHERE   p.Name LIKE 'HL%';

What I’ve got is essentially the same query. Yes, if I were to parameterize that WHERE clause by creating a stored procedure or a parameterized query, the two would be identical, but as written they are not; in the strictest terms, those are two different strings. But they both hash to the same value: 0x5A5A6D8B2DA72E25. And here’s where it gets fun: the first query returns no rows at all, but the second returns 8,534. They have identical query hash values but utterly different execution plans:

[Image: the two execution plans]

Now, how to use this? First, let’s say you’re looking at the second execution plan. You note the scan of the SalesOrderHeader clustered index and decide you might want to add an index here, but you’re unsure how many other queries behave like this one. So you look at the properties of the SELECT operator, which contain the query hash value. You take that value and plug it into a query something like this:

SELECT  deqs.query_hash ,
        deqs.query_plan_hash ,
        deqp.query_plan ,
        dest.text
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   deqs.query_hash = 0x5A5A6D8B2DA72E25;

This resulted in eight rows. Yeah, eight. Because I had run this same query at different times, in different ways, and in combinations, the query hash, which is generated per statement, was common to all of them, even though the plans and their issues were all different. But one common thread ran through them all: a scan on the clustered index as the most expensive operator. So, now that I can identify lots of different common access paths, all of which share a common problem, I can propose a solution that will help out in multiple locations.
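To put a number on that, a sketch like this should total up the cached metrics across everything sharing the hash, so you can quantify how much that common access path is costing (the columns here are just the ones I’d start with):

SELECT  deqs.query_hash ,
        SUM(deqs.execution_count) AS TotalExecutions ,
        SUM(deqs.total_logical_reads) AS TotalLogicalReads ,
        SUM(deqs.total_elapsed_time) AS TotalElapsedTime
FROM    sys.dm_exec_query_stats AS deqs
WHERE   deqs.query_hash = 0x5A5A6D8B2DA72E25
GROUP BY deqs.query_hash;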

The problem with this is, what if I modify the query like this:

SELECT  soh.AccountNumber ,
        soh.DueDate ,
        sod.OrderQty ,
        p.Name ,
        p.Color
FROM    Sales.SalesOrderHeader AS soh
        JOIN Sales.SalesOrderDetail AS sod ON sod.SalesOrderID = soh.SalesOrderID
        JOIN Production.Product AS p ON p.ProductID = sod.ProductID
WHERE   p.Name LIKE 'HL%';

Despite the fact that this query produces an execution plan identical to the one above, the hash value, because of the added p.Color column, is now 0xABB5AFDC5A4A6988. Worth noting, though: the query_plan_hash values for both of these queries are the same, so you can run the same kind of search for common plans produced by different queries to identify potential tuning opportunities.
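A sketch of that search, without having to type a hash value in by hand: find every plan hash in cache that is shared by more than one distinct query hash.

WITH SharedPlans
AS (SELECT   deqs.query_plan_hash
    FROM     sys.dm_exec_query_stats AS deqs
    GROUP BY deqs.query_plan_hash
    HAVING   COUNT(DISTINCT deqs.query_hash) > 1 --same plan, different query text
   )
SELECT  deqs.query_plan_hash ,
        deqs.query_hash ,
        dest.text
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   deqs.query_plan_hash IN (SELECT sp.query_plan_hash
                                 FROM   SharedPlans AS sp)
ORDER BY deqs.query_plan_hash;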

So, the ability to pull this information is not a magic bullet that will solve all your ad hoc and dynamic T-SQL issues. It’s just another tool in the toolbox.

For things like this and a whole lot more, let’s get together in November, 2013 at the all day pre-conference seminar at SQL Saturday Dallas.

UPDATE: In the text I had incorrectly said this was introduced in 2005. It was 2008. I’ve updated it and added this little footnote. Sorry for any confusion.

Jan 23 2013

Saving Execution Plans on Azure SQL Database

In my previous post showing how to get to execution plans in the Database Management Portal, I showed that it’s pretty easy to put a query in a query window and get the execution plan. This allows you to understand query behavior in order to tune your T-SQL or your data structures, all through the Azure interface. But what happens if you want to share an execution plan with a friend, post it to an online forum, save it for later comparison as part of troubleshooting bad parameter sniffing, track behaviors over time as statistics change, or any number of other purposes that I can’t think of at the moment?

To answer this question, let me first tell you how you would do these things in SQL Server Management Studio (SSMS). First, and most obviously, when you look at the plan in SSMS, you can just right click on that plan and select “Save Execution Plan As…” from the context menu. This will allow you to create a .sqlplan file on your OS. From there, do what you want. The file saved is XML, but SSMS reads the .sqlplan format as a graphical plan. You can look at the underlying XML any time you want by right clicking and selecting the appropriate item from the context menu.

And Azure SQL Database?

Well, things are a little different. Let’s start with the query. After you execute the query to get an actual plan or call up the estimated plan you’ll see the plan in the Portal.

[Image: the query and its execution plan in the Management Portal]

You can see the query, the plan, and options that go with the plan (to be explored in a later post). If you right click on the plan as you would within SSMS, the context menu is singularly unhelpful, since it has only one option, ‘Silverlight’ (yes, we’re still running Silverlight in this part of Azure, don’t shoot the messenger). So I guess there’s no way to save that plan, right? Wrong. Look up at the top of the image above. You’ll see a button that looks like a 3.5″ floppy disk (kids, get granddad to explain what that is) and the phrase “Save As” underneath it. Click on that and you can save your query. Ah, but there’s a drop down under file types, and there you’ll see “SQL Plan File.”

[Image: the Save As dialog with the SQL Plan File type in the drop down]

Provide a file name and you can save the execution plan down to your machine. Easy.

Ah, but there is a catch. Let’s say we query the Dynamic Management Objects to pull plans out of the cache. This query will run on both SQL Server and Azure SQL Database:

SELECT	deqp.query_plan
FROM	sys.dm_exec_query_stats AS deqs
		CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
		CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE	dest.text LIKE 'SELECT	m.MovieName,
		msd.MovieStageName,
		st.StageTypeDesc%';

The results, when run from SSMS, will look something like this:

[Image: the result in SSMS, with the plan shown as a clickable link]

That’s an embedded URL that you can click which will then open the execution plan from the cache. But, when the same query is run from SQL Database Management Portal, the results look like this:

[Image: the same result in the SQL Database Management Portal, with no link]


No options for a clickable interface here to open the plans. You can try copying the output, pasting it into Notepad, and saving that as a .sqlplan file. Of course, that’s only going to work for small plans, since the result set here is not unlimited. You could build an app to query the cache and output to a file. But, in general, querying the plan cache directly from the SQL Database Management Portal gives you very few options for displaying your query plan.
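One sketch that can at least shrink what you have to copy: pull the plan for just the single statement you care about through sys.dm_exec_text_query_plan, which, per my earlier testing, is available on the platform (adjust the LIKE filter to match your query):

SELECT  detqp.query_plan
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_text_query_plan(deqs.plan_handle,
                                                deqs.statement_start_offset,
                                                deqs.statement_end_offset) AS detqp
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   dest.text LIKE '%m.MovieName%'; --narrow this to the query in question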

Since this is such an important part of query tuning, here’s hoping that we get further enhancements to the Management Portal in the future.

Jul 02 2012

Querying Information from the Plan Cache, Simplified

One of the great things about the Dynamic Management Objects (DMOs) that expose the information in the plan cache is that, by their very nature, they can be queried. The plans exposed are in XML format, so you can run XQuery against them to pull out interesting information.

For example, what if you wanted to see all the plans in cache that had a Timeout as the reason for early termination from the optimizer? It’d be a great way to see which of your plans were less than reliable. You could do so like this:

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),  QueryPlans 
AS  ( 
SELECT  RelOp.pln.value(N'@StatementOptmEarlyAbortReason', N'varchar(50)') AS TerminationReason, 
        RelOp.pln.value(N'@StatementOptmLevel', N'varchar(50)') AS OptimizationLevel, 
        --dest.text, 
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, 
                  (deqs.statement_end_offset - deqs.statement_start_offset) 
                  / 2 + 1) AS StatementText, 
        deqp.query_plan, 
        deqp.dbid, 
        deqs.execution_count, 
        deqs.total_elapsed_time, 
        deqs.total_logical_reads, 
        deqs.total_logical_writes 
FROM    sys.dm_exec_query_stats AS deqs 
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest 
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp 
        CROSS APPLY deqp.query_plan.nodes(N'//StmtSimple') RelOp (pln) 
WHERE   deqs.statement_end_offset > -1        
)   
SELECT  DB_NAME(qp.dbid), 
        * 
FROM    QueryPlans AS qp 
WHERE   qp.TerminationReason = 'Timeout' 
ORDER BY qp.execution_count DESC ;


I posted a similar version of this query once before (although I think that one is a little broken). It works fine… But…

This query takes 25 seconds, and a big chunk of that is parsing the XML. What if, for a simple search like this, where I’m not doing a lot of conversion & processing of the XML, we ignored the XML parsing and went instead to something like this:

SELECT  DB_NAME(deqp.dbid), 
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, 
                  (CASE deqs.statement_end_offset 
                     WHEN -1 THEN DATALENGTH(dest.text) 
                     ELSE deqs.statement_end_offset 
                   END - deqs.statement_start_offset) / 2 + 1) AS StatementText, 
        deqs.statement_end_offset, 
        deqs.statement_start_offset, 
        deqp.query_plan, 
        deqs.execution_count, 
        deqs.total_elapsed_time, 
        deqs.total_logical_reads, 
        deqs.total_logical_writes 
FROM    sys.dm_exec_query_stats AS deqs 
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp 
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest 
WHERE   CAST(deqp.query_plan AS NVARCHAR(MAX)) LIKE '%StatementOptmEarlyAbortReason="TimeOut"%';


Now we’re no longer hooked into getting the XML parsed. But, surprisingly, performance is not much better, and in my tests it was sometimes worse. It probably has something to do with performing a function on a column: the CAST of the query_plan column from XML to NVARCHAR(MAX). What can you do?

Well, there is one other place where execution plans are kept: sys.dm_exec_text_query_plan. Things are a little different in there. Instead of a plan with multiple statements in it, each of these plans is for an individual statement, which is why you must pass in the start and end offsets when you call the function. That changes the result set a little. You get fewer rows back, but you also get a lot less duplication, and we don’t have to cast anything in the WHERE clause. Let’s check it out:

SELECT  DB_NAME(detqp.dbid), 
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, 
                  (CASE deqs.statement_end_offset 
                     WHEN -1 THEN DATALENGTH(dest.text) 
                     ELSE deqs.statement_end_offset 
                   END - deqs.statement_start_offset) / 2 + 1) AS StatementText, 
        CAST(detqp.query_plan AS XML), 
        deqs.execution_count, 
        deqs.total_elapsed_time, 
        deqs.total_logical_reads, 
        deqs.total_logical_writes 
FROM    sys.dm_exec_query_stats AS deqs 
        CROSS APPLY sys.dm_exec_text_query_plan(deqs.plan_handle, 
                                                deqs.statement_start_offset, 
                                                deqs.statement_end_offset) AS detqp 
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest 
WHERE   detqp.query_plan LIKE '%StatementOptmEarlyAbortReason="TimeOut"%';


Performance on my system dropped from 30 seconds on average to 8 seconds on average. That’s a win by any measure. If you worked out a way to eliminate that wildcard LIKE search, it would be even better. Note the CAST(detqp.query_plan AS XML) in the SELECT list above. To be able to click on the query_plan column and see a pretty graphical execution plan, I just have to CAST the text to XML, but that’s not adding to the overhead of the query.

If you’re looking to search within your query plans, you’re still likely to be better off using XQuery to get sophisticated searches on the data, but for really simple stuff, the sys.dm_exec_text_query_plan approach may offer you a faster alternative.
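For the XQuery route, here’s a sketch of the same timeout search using the exist() method against the plan XML, which avoids shredding every attribute out with value():

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT  deqp.query_plan ,
        deqs.execution_count
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        --exist() returns 1 when the plan contains a timed-out statement
WHERE   deqp.query_plan.exist(N'//StmtSimple[@StatementOptmEarlyAbortReason="TimeOut"]') = 1;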

Mar 26 2012

Execution Plan for a User Defined Function

When you execute a multi-statement user-defined function you may see an execution plan that looks something like this:

[Image: the execution plan for a query that calls the UDF]

It appears as if the cost of the UDF is free. This is especially true if you use the UDF in a query with other objects, such as joining it to actual tables. Since the optimizer always assumes a multi-statement UDF returns a single row for its statistics estimates, it frequently displays a low cost.

But you know that there’s more going on there, right? It’s a multi-statement UDF because it’s doing a lot of work, but that is not reflected in the execution plan… or is it?
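For reference, a query along these lines is the sort of thing that produces the plan above; this is my reconstruction for illustration, not necessarily the exact query behind that screen shot (dbo.ufnGetContactInformation ships with AdventureWorks2008R2):

SELECT  e.BusinessEntityID ,
        ci.* --columns from the multi-statement UDF
FROM    HumanResources.Employee AS e
        CROSS APPLY dbo.ufnGetContactInformation(e.BusinessEntityID) AS ci;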

What if we went after the cache? Let’s run this little query:

SELECT  deqp.query_plan,
        dest.text,
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,
                  (deqs.statement_end_offset - deqs.statement_start_offset)
                  / 2 + 1) AS actualstatement
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   deqp.objectid = OBJECT_ID('dbo.ufnGetContactInformation');

This query doesn’t do anything really fancy. I’m using sys.dm_exec_query_stats, which holds aggregated performance metrics for any query currently in cache, because it has the plan_handle that lets me look up execution plans in sys.dm_exec_query_plan. It just so happens that you also get the objectid there in the plan, so we can look for the specific plan that corresponds to our UDF (I’m using an example from AdventureWorks2008R2). The results from the query look like this:

[Image: the query results, one row per statement executed from the UDF]

Each row represents a statement that has been executed from the UDF. There’s just a single plan for the UDF, and it looks like this:

[Image: the UDF’s actual execution plan]

Without even drilling down into all the details of what’s going on in the plan, I think we can agree that this UDF is not zero cost; it may in fact be extremely expensive, depending on what’s going on inside that execution plan.

Why is this useful? Because it lets you look behind that simplistic… lie that is shown as the plan for the execution of a UDF, and see what the UDF is actually doing. This may make it possible to improve the performance of the UDF, since you’ll be able to identify missing indexes and other information thanks to your direct access to what’s going on inside it.

I had demoed this during my 24 Hours of PASS presentation and was asked to put the code up on my blog, so here you go. I hope it proves helpful.

May 25 2011

Oh ****!

Did you ever accidentally close SQL Server Management Studio? And, in closing SSMS, did you get the prompt that says “Save changes to the following items?” And did you, completely unthinkingly, with a query you had just been working on, hit Cancel? Yeah, me neither. What kind of idiot does that…

OK. I confess. I just did that. Silly thing it was, but I had just spent at least half an hour working on a query, and now it was gone… or was it? I had just run the query and had been looking at the results when I closed SSMS. Initially, I panicked and started thinking about how I could get the data back (somewhere there’s a recovery file, I’ve heard). Then it occurred to me: I had just been writing queries against the cache using the DMOs. Why not just pull my query back using the DMOs I had just been using? Worked like a charm.

SELECT  dest.text
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   deqs.last_execution_time > '5/19/2011 11:00'
        AND dest.text LIKE 'WITH%';

My query was much more complicated, but this was all I needed. I was able to filter out the junk in cache by only selecting stuff with an execution time within about 15 minutes of when I had closed SSMS, and I supplied the start of the query, a CTE. That was all I needed. I got back my query. Took a little formatting work, but it was immediately available and all was right with the world.

Just posting this in case you hit the same situation. You too can get your query back, easily and quickly. Then again, maybe I’m the only one stupid enough to do that.

May 23 2011

SQL Azure Query Tuning

SQL Azure is still SQL Server at the end of the day. This means it is entirely possible to write queries against SQL Azure that really… what’s a good word… stink. So what do you do? It’s all in the cloud. You couldn’t possibly tune the queries, right? Wrong. Many of the same tools that you have available to you, such as execution plans and dynamic management objects, are still available in SQL Azure.

Let’s talk DMOs for a second. First off, don’t make the mistake I did of trying to run these outside the context of a specific database on SQL Azure. You’ll get extremely inconsistent results, trust me on this. Anyway, I did a quick run-down of some of the most used DMOs for performance tuning, the sys.dm_exec_* set. Here’s a complete listing of those objects and whether or not they’re available to you in SQL Azure:

SELECT  *
FROM    sys.dm_exec_requests AS der
--available

SELECT  *
FROM    sys.dm_exec_requests AS der
        CROSS APPLY sys.dm_exec_query_plan(der.plan_handle)
--available

SELECT  *
FROM    sys.dm_exec_requests der
        CROSS APPLY sys.dm_exec_sql_text(der.sql_handle)
--available

SELECT  *
FROM    sys.dm_exec_query_stats AS deqs
--available

SELECT  *
FROM    sys.dm_exec_cached_plans AS decp
--invalid object

SELECT  *
FROM    sys.dm_exec_connections AS dec
--available

SELECT  *
FROM    sys.dm_exec_cursors AS dec
--invalid object

SELECT  *
FROM    sys.dm_exec_requests AS der
        CROSS APPLY sys.dm_exec_plan_attributes(der.plan_handle) AS depa
--invalid object

SELECT  *
FROM    sys.dm_exec_procedure_stats AS deps
--invalid object

SELECT  *
FROM    sys.dm_exec_query_memory_grants AS deqmg
--invalid object

SELECT  *
FROM    sys.dm_exec_query_optimizer_info AS deqoi
--invalid object

SELECT  *
FROM    sys.dm_exec_query_resource_semaphores AS deqrs
--invalid object

SELECT  *
FROM    sys.dm_exec_sessions AS des
--available

SELECT  *
FROM    sys.dm_exec_requests AS der
        CROSS APPLY sys.dm_exec_text_query_plan(der.plan_handle, 0, -1) AS detqp
--available

SELECT  *
FROM    sys.dm_exec_trigger_stats AS dets
--invalid object

SELECT  *
FROM    sys.dm_exec_xml_handles(@@SPID)
--invalid object

The good news: most everything you need is available, so you’re really going to be able to go to town on using DMOs as part of your query tuning. The bad news: “most everything” doesn’t include sys.dm_exec_query_optimizer_info. That does take away a tool. It’s not a vital one, but it’s one that allows you to understand some of what’s happening on the system. I’m not going to cry and rend my clothing because I don’t have it, but I am somewhat disappointed.

This is great news! The tools you’re learning and using (and you are using DMOs, right?) will continue to be available in SQL Azure.

Apr 06 2011

SQL University: Index Usage

Greetings. Welcome once more to the Miskatonic University branch of SQL University. Does anyone know where to buy some camping equipment, cheap? I’ve been tagged to go on an expedition to Antarctica and I need some cold weather gear a bit more substantial than my LL Bean boots. Evidently the last expedition found some caves in some mountains down there. Sounds like the perfect place to get away from all the crazy stuff that goes on here at Miskatonic. I mean, what could happen?

Anyway, our last several talks have all been about indexes and indexing. One of the things that we haven’t talked about is how to tell if, how, or when your indexes are being used. Starting with SQL Server 2005, and continuing through 2008 and R2, there has been a wonderfully easy way to do just this through Dynamic Management Objects (DMOs), specifically three of them: sys.dm_db_index_usage_stats, sys.dm_db_index_operational_stats(), and sys.dm_db_index_physical_stats().

sys.dm_db_index_usage_stats

The first, sys.dm_db_index_usage_stats, is a fascinating Dynamic Management View (DMV). It returns information showing how many times various operations have occurred on an index and when each one last happened. The information accumulates from the last time the server was started, or from when a database is attached or restored; if a database is detached or closed, the data for that database is removed from the DMV. The data is broken down so that user information shows separately from system information, meaning you can see the number of times a user has updated the index as opposed to when the system updated it as part of maintenance. You can use this to get an idea of which indexes are used within your system and which ones are not.

Since it’s a DMV, it’s incredibly easy to use. Here’s a query that shows the number of times user accesses have occurred on the SalesOrderDetail clustered index:

SELECT  ddius.user_lookups,
        ddius.user_scans,
        ddius.user_seeks,
        ddius.user_updates,
        ddius.index_id
FROM    sys.dm_db_index_usage_stats AS ddius
WHERE   database_id = DB_ID(N'AdventureWorks2008R2')
        AND object_id = OBJECT_ID('AdventureWorks2008R2.Sales.SalesOrderDetail')
        AND index_id = 1;

You can combine this with other queries and other DMOs to put together interesting information (there’s a sketch of that below). But the key is to remember that you can only rely on this information to a certain degree. You can’t simply assume that it will tell you precisely which indexes have been used and which have not, because the data resets every time the server restarts. Unless your server and the database you’re interested in have been continuously online long enough to cover every periodic process, say a query that only runs every 18 months, an index that looks unused here may still be needed. So using this alone to clean up your system is taking a chance. It can still help you clean up your system; just do so with the knowledge of what the data actually represents.
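Here’s that sketch: the classic “possibly unused index” pattern, with a LEFT JOIN from sys.indexes so that indexes with no usage rows at all still show up (run it in the database you care about):

SELECT  OBJECT_NAME(i.object_id) AS TableName ,
        i.name AS IndexName ,
        ddius.user_seeks ,
        ddius.user_scans ,
        ddius.user_lookups ,
        ddius.user_updates
FROM    sys.indexes AS i
        LEFT JOIN sys.dm_db_index_usage_stats AS ddius
            ON ddius.object_id = i.object_id
               AND ddius.index_id = i.index_id
               AND ddius.database_id = DB_ID()
WHERE   OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
        AND i.index_id > 0 --skip heaps
        AND (ddius.index_id IS NULL --never touched since the counters reset
             OR ddius.user_seeks + ddius.user_scans + ddius.user_lookups = 0);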

sys.dm_db_index_operational_stats

The next object is sys.dm_db_index_operational_stats(). This is a Dynamic Management Function (DMF), which means you must pass it parameters. With this DMF you can see current locking activity for every table or index. The call looks something like this:

SELECT  *
FROM    sys.dm_db_index_operational_stats(DB_ID('AdventureWorks2008R2'),
                                           OBJECT_ID('Sales.SalesOrderDetail'),
                                           NULL, --IndexID
                                           NULL --PartitionID
                                           ) AS ddios;

The way I have it configured, I’m looking at all indexes for the table Sales.SalesOrderDetail. If I substituted NULL for the OBJECT_ID, I could see all access to tables within the database, and if I also substituted NULL for the DB_ID, I could see all active access across the system. Conversely, if I wanted to drill down, I would supply the IndexID value so that I only see the activity of a given index, and the same thing with the PartitionID.
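For example, a sketch of drilling down to just the clustered index (index_id 1), picking out a few of the lock and latch columns instead of everything:

SELECT  ddios.index_id ,
        ddios.partition_number ,
        ddios.row_lock_count ,
        ddios.page_lock_count ,
        ddios.page_latch_wait_count
FROM    sys.dm_db_index_operational_stats(DB_ID('AdventureWorks2008R2'),
                                           OBJECT_ID('Sales.SalesOrderDetail'),
                                           1, --IndexID: the clustered index
                                           NULL --PartitionID: all partitions
                                           ) AS ddios;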

But to really see it at work, let’s get a transaction going. I can do it very simply. If I open a second connection to the database and run this query:

BEGIN TRAN

SELECT  *
FROM    Sales.SalesOrderDetail AS sod
WHERE   sod.SalesOrderID = 42;

And then go back and run the original query. Now I get three rows, one for each of the three indexes on the table:

[Image: sys.dm_db_index_operational_stats output, one row per index]

This returns all sorts of information about the locks and waits that are currently occurring. In this instance you can see that a single page lock was taken out and that there were a pair of page latch waits, all on the first index, index_id = 1, the clustered index. This is what you would expect for a query that is performing a seek on the clustered index.

sys.dm_db_index_physical_stats

The last object is sys.dm_db_index_physical_stats(). This gives you information about your indexes: how deep they are, how many pages they have, how fragmented those pages are, and so on. It’s extremely useful for determining when to defragment your indexes, but it also helps you understand what is being accessed and when, why the optimizer chose a given index, and more. The key to this particular DMF is that it can be very expensive to run. It has three modes: LIMITED, SAMPLED, and DETAILED. You’re going to get varying amounts of data back, but DETAILED mode will scan the whole index, and that can lead to blocking issues on a production system. Be cautious. Calling this DMF is not that different from the others:

SELECT  *
FROM    sys.dm_db_index_physical_stats(DB_ID('AdventureWorks2008R2'),
                                        OBJECT_ID('Sales.SalesOrderDetail'),
                                        NULL, NULL, 'LIMITED') AS ddips;

Running the query in this manner shows general information about the index and its level of fragmentation. If you’re interested in understanding more about the index, such as the average amount of space left on a page, you can move to SAMPLED or DETAILED. SAMPLED and DETAILED return the same data, but the SAMPLED information is taken from a 1% sample of the index. Although, if the index is less than 10,000 pages in size, DETAILED is used instead of SAMPLED, so that’s something to be aware of.
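As a sketch of that, switching the mode to DETAILED fills in columns such as avg_page_space_used_percent that come back NULL under LIMITED; given the cost warnings above, try this on a test system first:

SELECT  ddips.index_id ,
        ddips.index_level ,
        ddips.avg_fragmentation_in_percent ,
        ddips.avg_page_space_used_percent , --NULL in LIMITED mode
        ddips.page_count
FROM    sys.dm_db_index_physical_stats(DB_ID('AdventureWorks2008R2'),
                                        OBJECT_ID('Sales.SalesOrderDetail'),
                                        NULL, NULL, 'DETAILED') AS ddips;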

Running the earlier LIMITED query would return a data set similar to this one:

[Image: sys.dm_db_index_physical_stats output from a LIMITED scan]

This is all the data returned from a LIMITED scan, which only hits the tree structure above the leaf or the PFS & IAM pages of a heap. You can see the index types, if they store data outside of the row, the depth, average fragmentation in percentages, fragment counts, fragment size in pages and page count.

Conclusion

This is just an overview of what’s possible with these various DMOs. You can combine them with other DMOs to arrive at useful information about your systems and their performance.

Thanks for stopping by Miskatonic University. Please stop by the gift shop on your way out. Just remember to keep those meteor shards they sell there out of your water supply.

Jan 05 2011

Encryption and the Performance DMOs

Ever wonder what you can see in the performance-oriented DMOs when stored procedures are encrypted? Me neither. But I did get that question during my DMO presentation at the PASS Summit, and I did not have an answer. I did get an answer from Johan Bijnens (twitter) in the audience, which I repeated without entirely knowing what I was saying. I decided that I ought to actually know the answer to that question, so here’s a little experiment.

I’m going to create a simple stored procedure:

CREATE PROCEDURE dbo.GetSalesDetails (@SalesOrderId INT)
AS
SELECT  soh.AccountNumber,
        sod.LineTotal
FROM    Sales.SalesOrderHeader AS soh
        JOIN Sales.SalesOrderDetail AS sod
            ON soh.SalesOrderID = sod.SalesOrderID
WHERE   soh.SalesOrderID = @SalesOrderId;

When I create this procedure and run it, you can see the general performance of the query being run by pulling data from the sys.dm_exec_procedure_stats DMO like this:

SELECT  deps.type_desc,
        deps.last_execution_time,
        deps.execution_count,
        deps.total_logical_reads,
        dest.encrypted AS EncryptedText,
        dest.text,
        deqp.query_plan,
        deqp.encrypted AS EncryptedPlan
FROM    sys.dm_exec_procedure_stats AS deps
        CROSS APPLY sys.dm_exec_sql_text(deps.sql_handle) AS dest
        CROSS APPLY sys.dm_exec_query_plan(deps.plan_handle) AS deqp
WHERE   dest.text LIKE 'CREATE PROCEDURE dbo.GetSalesDetails%';

Now, to modify the procedure so that it’s encrypted, I’m going to recreate it with a slight modification:

CREATE PROCEDURE dbo.GetSalesDetails (@SalesOrderId INT)
WITH ENCRYPTION…

Now, if I execute the procedure and rerun the SELECT statement against the DMO, I won’t get any data. Why? Because of the WHERE clause: the text of the procedure is no longer available through sys.dm_exec_sql_text. Encryption has worked. I can’t see the SQL and I can’t see the execution plan. I will, however, see values in the EncryptedText and EncryptedPlan columns, showing that despite the encryption, rows for the procedure in question do exist in the appropriate DMOs.
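To see those rows without relying on the text, you can filter on the object instead; here’s a sketch, since sys.dm_exec_procedure_stats carries object_id and database_id:

SELECT  deps.type_desc,
        deps.execution_count,
        deps.total_logical_reads,
        dest.encrypted AS EncryptedText, --should now be 1
        deqp.encrypted AS EncryptedPlan --should now be 1
FROM    sys.dm_exec_procedure_stats AS deps
        CROSS APPLY sys.dm_exec_sql_text(deps.sql_handle) AS dest
        CROSS APPLY sys.dm_exec_query_plan(deps.plan_handle) AS deqp
WHERE   deps.database_id = DB_ID()
        AND deps.object_id = OBJECT_ID('dbo.GetSalesDetails');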

There’s the answer to the question.

EDIT: Fixed the spelling of Johan’s name. Sorry Johan!

Oct 04 2010

How to Tell if Execution Plans are Reused

I try to watch the search phrases that point people to the blog because sometimes you get a sense of what problems people are running into. The latest question or phrase I’ve seen a lot lately is along the lines of “how do you know if an execution plan is being reused?”

Since compiling an execution plan can be an extremely expensive operation, it’s worth your time to understand how well a given plan is getting reused. If you’ve seen me present, you’ve probably heard me talk about the application that had a query with an 86-table join. Recompiles on that thing were frequent and extremely costly. The only good news was that they were recompiles; if we weren’t getting plan reuse at all, it would have been an even worse system than it was.

There are a number of ways you can see if a plan is being reused. One of the easiest is to hit the DMOs that look into the plan cache. You can look at either sys.dm_exec_procedure_stats or sys.dm_exec_query_stats. These DMOs are somewhat different and somewhat the same. Actually, they sort of cross each other. From sys.dm_exec_procedure_stats you get aggregate information about stored procedures, and from sys.dm_exec_query_stats you get aggregate information about queries, which may be run within stored procedures. So either or both could be useful, depending on what you’re attempting to measure. However, both return a very useful counter: execution_count. Quite simply, that tells you that the plan, whether for the statement or the procedure, is being reused. To get maximum gain out of using either of these, you’ll want to combine them with other DMOs. Something along these lines can show you the plans for procedures that have been running against a system, ordered by the number of times they’ve been reused:

SELECT  deps.execution_count ,
        OBJECT_NAME(deps.object_id, deps.database_id) AS [Procedure] ,
        deqp.query_plan
FROM    sys.dm_exec_procedure_stats AS deps
        CROSS APPLY sys.dm_exec_query_plan(deps.plan_handle) AS deqp
ORDER BY deps.execution_count DESC;

The only thing wrong with using the DMO like this is that you can only see what’s currently in cache. This means no history, depending on how volatile the cache is on your system.

Another way to tell if a plan is being reused is to set up a server-side trace and capture the SP:CacheMiss and SP:CacheHit events. This is pretty straightforward: if the query was not in cache, it generates a miss event; if it was there, it generates a hit. But the trace has to be running in order for you to know whether you had a hit or a miss. The good news is that if it is running, you’ve got historical information, since it captures the events as they occur.

You can also catch recompile events using SQL:StmtRecompile. Yes, you can use SP:Recompile if you’re only ever dealing with procedures, but if you’ve got any kind of ad hoc querying or triggers going on in the system, you’ll miss recompile events. Basically, because recompiles have happened at the statement level since 2005, BOL recommends only using SQL:StmtRecompile.

That’s pretty much it. Using these methods in the appropriate place will let you know if the plan is being reused or not.

Mar 18 2010

Undocumented Virtual Column: %%lockres%%

One of my development teams needed a mechanism for identifying the value of a key that was part of a lock (don’t ask). I’d never tried doing that before. Obviously, if you hit the DMV sys.dm_tran_locks, you can see the hash of the key in the resource_description column, but how to pull the value back? After some research, I first found this excellent article by the late, great Ken Henderson (I really wish he was still around). The article outlined, among other things, the use of an undocumented “virtual” column called %%lockres%%. Some more searching then uncovered this great article by James Rowland-Jones, AKA Claypole. He described how, in a very high volume system, he used %%lockres%% to identify the source of a deadlock as the internal mechanism that SQL Server uses to manage locks, the hash of the key. Oh, and he opened an incident on Connect, which seems to be closed, but vote on it anyway, I did. %%lockres%% is also covered in Kalen Delaney’s excellent book on SQL Server 2008 Internals, and it even warrants a bit of discussion in Professional SQL Server 2008, but that was written by James Rowland-Jones, so I’m not sure it counts.

In the meantime, while I was investigating this stuff, evidently the development team was looking into it on their own. They came to the same set of resources and decided to use the virtual column as part of their real-time, transactional application. Yeah, an undocumented “virtual” column going into a major application. Since I would probably be unable to do anything about this, I decided to at least look into how this thing behaves so I can be aware of what types of problems I might run into.

First, a simple query:

SELECT  a.City
        --,%%lockres%%
FROM    Person.Address AS a
WHERE   a.AddressID = 432;

If you run this query and take a look at the execution plan, you’ll see a nice clean clustered index seek, just as you would suspect. If you take away the comment and run it again, the execution plan is identical. On the version of AdventureWorks2008 currently installed on my machine, I get two page reads, regardless of whether or not I include %%lockres%%. With the comment removed, the query also returns the hash of the primary key: (b0004e040bc2). This looks pretty painless, free even.

If we want to see %%lockres%% in action, it’s not too difficult:

BEGIN TRAN
UPDATE  Person.Address
SET     City = 'dude'
WHERE   AddressID = 432;
--ROLLBACK TRAN

Obviously this will put a key lock on that row in the table. If I just select against sys.dm_tran_locks, the data returned looks like this:

resource_type   resource_description   resource_associated_entity_id   request_mode
KEY             (b0004e040bc2)         72057594043564032               X

The original request from the development team was for a way to get the key value back when you know that a table is locked, as is the case here. I wrote this simple query to make that happen:

SELECT  a.AddressID
FROM    Person.Address (NOLOCK) AS a
        JOIN sys.dm_tran_locks AS dtl
            ON a.%%lockres%% = dtl.resource_description
WHERE   dtl.resource_type = 'KEY';

This query works and returns our key value of 432 just as you would want. But, take a look at the execution plan:

Yes, that’s a clustered index (or table, same thing) scan, followed by a Sort, followed by a merge join, processing 19,614 rows to return one. But hey, it was only 341 reads. To say the least, I’m not excited about seeing this in a production system. This was explicitly cautioned against in Kalen Delaney’s book. While it appears that the remote scan operator, which is how the DMV is accessed in this case, is 59% of the operation, that’s the estimated cost, and as has been pointed out before, estimated cost isn’t the best measure of real cost in the system.

The development team went off and developed their own query. They had said they were looking for the key value, but evidently they were looking for who was holding the lock on a particular key value:

SELECT  s.nt_user_name
FROM    sys.dm_tran_locks AS l
        INNER JOIN sys.dm_exec_sessions AS s
            ON l.request_session_id = s.session_id
        INNER JOIN sys.partitions AS p
            ON l.resource_associated_entity_id = p.hobt_id
WHERE   OBJECT_NAME(p.object_id) = 'Address'
        AND l.resource_description IN (SELECT  %%lockres%%
                                       FROM    Person.Address (NOLOCK) AS a
                                       WHERE   a.AddressID = 432);

I actually had to adjust their query just a bit to get it to work correctly, but basically they had the right idea. Here’s the final execution plan:

This still wasn’t a terribly inspiring thing to think about running in a production system, although it only took one scan and seven reads. Whether or not putting this in a transactional system is a good idea, it certainly adds yet another tool, albeit an undocumented one, to the tool belt.