Feb 27 2009

Dissecting SQL Server Execution Plans Chapter 3

The third chapter has been published over at Simple Talk. So if you’re interesting the book, you can go to Red Gate to get a free e-book copy of it, or you can see chapters one and two and now three over at Simple-Talk. And for those who just feel the need to kill a tree, I’m told we should have a print version real soon now and available from Amazon.

Feb 26 2009

Shrinking Databases

Tim Ford put up a hilarious blog post outlining an instance where shrinking databases is acceptable. I’m pretty sure the development method he describes is well documented and in use in more than one location around the world, including somewhere in my company. Read it.

Feb 25 2009

SQL Server 2008 Management and Administration

I finally wrote up my review of the book SQL Server 2008 Management and Administration. Todd Robinson, the technical editor, sent me a copy a while back. It’s a good book. I’ve already found it useful several times.  Worth a read. Go pick up a copy.

Feb 25 2009

Three Kinds of Execution Plans

You read that correctly, three kinds of execution plans. You may have thought that all you had to deal with are estimated and actual, but there is one more. The estimated plan is the plan that comes out of the optimizer. It’s based on statistics and indexes and known objects within the system. The actual plan is the plan that was used to execute the query and will show all the actual number of rows processed, etc. It might be different than the estimated plan because the stastics were off or for any number of other reasons. Those were the ones you knew about. There is also the plan that gets stored in the plan cache, the compiled plan.

I lied. The compiled plan and the estimated plan are the same thing, but I wasn’t really that aware of it until I started working with the sys.dm_exec_query_plan DMV. One day I went to look at the statistics of a plan after I executed it, prompted by a discussion at SQL Server Central. Instead I found an estimated plan. I flipped and began running all sorts of tests until I realized that, of course, the plan stored is an estimated plan. The number of rows changes over and over again as different data comes into the system or different parameters change the data accessed by any given execution plan. The system isn’t going to update the plan in place over & over. It just makes sense. But I had to see it in action.

First, I created a dummy table in AdventureWorks, just to play with. Some of these scripts are modified from Dissecting SQL Server Execution Plans (I’m lazy):

-- query to create the test table
IF EXISTS ( SELECT *
FROM sys.objects AS o
WHERE o.[object_id] = OBJECT_ID(N'dbo.[NewOrders]')
AND o.[type] IN (N'U') )
DROP TABLE dbo.[NewOrders]
GO
SELECT *
INTO dbo.NewOrders
FROM Sales.SalesOrderDetail AS sod
GO
CREATE INDEX IX_NewOrders_ProductID ON dbo.NewOrders (ProductID)
GO

Then comes the meat. First I’m clearing out the cache so that I can see exactly what gets stored there. I run a simple select statement and capture the estimated execution plan. I’m using SET SHOWPLAN_XML ON so that I can capture execution plans while running multiple scripts. Then I check the cache to see if anything is there. After that, because I want to see differences in my plans, I update my test data and the statistics on the data so that the exact same query produces a different execution plan. Here’s the code:

DBCC FREEPROCCACHE()
GO
--estimated plan
SET SHOWPLAN_XML ON
GO
SELECT nwo.OrderQty
,nwo.CarrierTrackingNumber
FROM dbo.NewOrders AS nwo
WHERE nwo.ProductID = 897
GO
SET SHOWPLAN_XML OFF
GO

–what’s in cache now?’
SELECT deqp.query_plan
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
GO

BEGIN TRAN
UPDATE dbo.NewOrders
SET ProductID = 897
WHERE ProductID BETWEEN 1 AND 10000
GO
UPDATE STATISTICS dbo.NewOrders
GO
–actual plan
SET STATISTICS XML ON
GO
SELECT nwo.OrderQty
,nwo.CarrierTrackingNumber
FROM dbo.NewOrders AS nwo
WHERE nwo.ProductID = 897
ROLLBACK TRAN
GO
SET STATISTICS XML OFF
GO

–and the cache again
SELECT deqp.query_plan
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp

The estimated plan looks like this:
fig12
The actual plan looks like this:
fig21
And the plan out of the cache looks like this:
fig31
You can see the differences as the plans are generated from different points in the script, but the final storage is an estimated plan, even though, in this case, it doesn’t look like the original estimated plan.

Gail Shaw over at SQL In Wild has a much better description of what’s going on than I’m going to attempt here. I wanted to show it to you in action.

Feb 23 2009

Public Speaking

A great post on the how’s, why’s, etc., of presenting by Paul Randal. I found it because of Database Weekly. Some of the discussion are things I’ve read about in other places with more more or less detail. But please, don’t think I’m knocking it in any way. It’s a unique look at presenting from someone who does quite a lot of it. There were a number of topics and ideas that I hope I can apply going forward.

I understand that PASS is going to open the gates for abstracts soon. I think I’ve got three ready this year, although one of them is a bit rocky. Still, armed with a more knowledge than I had, thanks to Paul Randal’s help, I think I can do better.

Feb 19 2009

Networking

Andy Warren has just posted the last entry in a very interesting set of posts about building and working your own network. I think they’re all worth a read, especially if you’ve been thinking about blogging, Twitter, LinkedIn or just volunteering at your local users group.

Feb 18 2009

Refining the Missing Index Data

In my previous post I showed how you could determine if a query had missing index information, but it was difficult to get meaningful data out because I didn’t know XQuery well enough. I spent some more time refining the query and here are the results. This new query will show some basic performance information gathered from the aggregate data in sys.dm_exec_query_stats. It combines this with the full data pulled from the Missing Indexes element in the XML of the execution plan. I’ve got it listing all the recommended columns and grouping. So this means that the performance data is repeated in order to allow for the full listing of groups & columns. It seems to work well.

A couple of interesting points. My purpose is to provide a short-cut method of identifying possible indexing needs for systems that run lots of dynamic or ad hoc queries. In my own environment I’m mainly talking about either Microsoft CRM or nHibernate, but the rules would apply to almost any other client-side query engine. I wanted to list the database name as part fo the query. DBID is included with the sys.dm_exec_query_plan, but unfortunately, when a query is prepared or ad hoc, this value is null. Luckily enough the data is available in the execution plan itself. That’s going to be something to watch for in the future when using these execution plans.

Here’s the query:
WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS sp)
SELECT p.query_plan.value(N'(sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Database)[1]', 'NVARCHAR(256)') AS DatabaseName
,s.sql_handle
,s.total_elapsed_time
,s.last_execution_time
,s.execution_count
,s.total_logical_writes
,s.total_logical_reads
,s.min_elapsed_time
,s.max_elapsed_time
,p.query_plan
,p.query_plan.value(N'(sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Table)[1]', 'NVARCHAR(256)') AS TableName
,p.query_plan.value(N'(/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/@Schema)[1]', 'NVARCHAR(256)') AS SchemaName
,p.query_plan.value(N'(/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/@Impact)[1]', 'DECIMAL(6,4)') AS ProjectedImpact
,ColumnGroup.value('./@Usage', 'NVARCHAR(256)') AS ColumnGroupUsage
,ColumnGroupColumn.value('./@Name', 'NVARCHAR(256)') AS ColumnName
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) AS p
CROSS APPLY p.query_plan.nodes('/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex/sp:ColumnGroup') AS t1 (ColumnGroup)
CROSS APPLY t1.ColumnGroup.nodes('./sp:Column') AS t2 (ColumnGroupColumn)
WHERE p.query_plan.exist(N'/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan//sp:MissingIndexes') = 1
ORDER BY s.total_elapsed_time DESC

Feb 12 2009

Missing Index Information and Query Stats

So the goal was to find a way to pull information from sys.dm_exec_query_stats so that we could identify poor performing procedures that were in cache at the moment and combine it with missing index information from sys.dm_db_missing_index_details. We’re combining these because we’re working with a Microsoft Dynamics CRM database that is almost all ad hoc queries and lots of them are against tables with missing indexes. The hope was to identify necessary indexes merely by looking at the longest running queries.

Unfortunately there is no way to combine data from the missing indexes set of DMV’s and all the execution DMV’s that show query stats, execution plan, etc. None of the missing index tables has a plan handle or a plan hash column that would allow you to combine that data with the query data to identify which queries would directly benefit from the index if it were created.

But, if you look at the query plans in sys.dm_exec_query_plan, you can see the missing index information there. What to do? XQuery.

Since the query_plan is stored as XML, simply writing a small XQuery exist() function will do the trick:

SELECT  TOP 10 *
FROM sys.dm_exec_query_stats s
   CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) AS p
WHERE  p.query_plan.exist(
‘declare default element namespace “http://schemas.microsoft.com/sqlserver/2004/07/showplan”;
/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple/QueryPlan//MissingIndexes’) = 1
ORDER BY s.total_elapsed_time DESC

This is a pretty simple example, and yes, in reality you would not want to use SELECT *.  You would want to specify those columns that you were really interested in. Also, this allows you to get performance information from queries that show a MissingIndexes element in the XML of the showplan,  but I haven’t pulled the Missing Index data out and displayed it. That can be done, but I’ll leave it as homework for you for now (and because I’m still having a hard time with XQuery).

A quick nod to Tim Ford for the advice on the DMV’s. Please finish that book soon.

Feb 11 2009

PASS Chapter Spotlight

The Southern New England SQL Server Users Group was picked for the February spotlight on the PASS web site. This was largely based on the work we did helping Adam Machanic run the New England Data Camp. I’ve said it before and I’ll say it again, we helped. Adam did 90% of the work. It’s still good to be singled out. Thanks PASS and thanks to the volunteers at SNESSUG.

Feb 11 2009

PASS Editorial Board: Blog Directory

Andy Warren and the PASS Editorial Board have posted their first delivery, a blog directory of SQL Server blogs. That’s as complete a list as you’re likely to see, anywhere, on blogs related to SQL Server. Well done everyone.

This is just the beginning of a new and growing set of services from the PASS organization. If you’re not a member, get on over there and join up. It’s free. While you’re there, get hooked into the  PassPort networking service.

Now I need to get more work done on my part of the deliverables.