Mar 18 2014

Finding Mistakes

Ever had that moment where you start getting errors from code that you've tested a million times? I had one of those recently. I had this little bit of code for pulling information directly from query plans in cache:

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
QueryPlans AS
(
SELECT  RelOp.pln.value(N'@PhysicalOp', N'varchar(50)') AS OperatorName,
        RelOp.pln.value(N'@NodeId', N'integer') AS NodeId,
        RelOp.pln.value(N'@EstimateCPU', N'decimal(10,9)') AS CPUCost,
        RelOp.pln.value(N'@EstimateIO', N'decimal(10,9)') AS IOCost,
        dest.text
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY deqp.query_plan.nodes(N'//RelOp') RelOp (pln)
)

SELECT  qp.OperatorName,
        qp.NodeId,
        qp.CPUCost,
        qp.IOCost,
        qp.CPUCost + qp.IOCost AS EstimatedCost
FROM    QueryPlans AS qp
WHERE   qp.text = 'some query or other in cache'
ORDER BY EstimatedCost DESC;

I’ve probably run this… I don’t know how many times. But… I’m suddenly getting an error:

Msg 8114, Level 16, State 5, Line 7
Error converting data type nvarchar to numeric.

What the hell? There is nowhere this should be occurring. I dig through the query over and over and I can't figure it out. Until… I finally notice that one of the operators in the plan has its CPUCost value stored as a float:

EstimateCPU="1e-006"

Ummmm, since when? Since forever. I’ve just been lucky with my code. I’d just never hit a sufficiently small cost in the plans before. I hadn’t bothered to look for the actual data type in use in the schema definition, although it’s right there:

<xsd:attribute name="EstimateCPU" type="xsd:double" use="required"/>
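If you want to see the failure in isolation, here's a minimal repro (my sketch, not part of the original script):

-- Converting scientific notation to decimal fails...
SELECT CONVERT(DECIMAL(10,9), N'1e-006');
-- Msg 8114: Error converting data type nvarchar to numeric.

-- ...while converting it to float works fine.
SELECT CONVERT(FLOAT, N'1e-006');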


[Image: Samuel L. Jackson in The Long Kiss Goodnight]

I never did one thing right in my life, you know that? Not one. That takes skill.

What did I do wrong? I was looking at the data output from the queries and the plans rather than looking at the structure to know what to expect. It's the kind of thing I would never do with T-SQL; I would always look to the table structure to know what data type a given column was. But in this case, with the XML, I looked at the data and made an assumption. And we all know what that means. It makes an ass out of you and mption.

Or, in this case, me and mption.

Anyway, the corrected query is a pretty trivial change:

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
QueryPlans AS
(
SELECT  RelOp.pln.value(N'@PhysicalOp', N'varchar(50)') AS OperatorName,
        RelOp.pln.value(N'@NodeId', N'integer') AS NodeId,
        RelOp.pln.value(N'@EstimateCPU', N'float') AS CPUCost,
        RelOp.pln.value(N'@EstimateIO', N'float') AS IOCost,
        dest.text
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY deqp.query_plan.nodes(N'//RelOp') RelOp (pln)
)

SELECT  qp.OperatorName,
        qp.NodeId,
        qp.CPUCost,
        qp.IOCost,
        qp.CPUCost + qp.IOCost AS EstimatedCost
FROM    QueryPlans AS qp
WHERE   qp.text = 'SELECT * FROM HumanResources.vEmployee AS ve'
ORDER BY EstimatedCost DESC;

But I do feel bad if anyone has been using this the way I showed it. 'Cause, yeah, I've demonstrated with this code in the past. Oops. It does make a great point, though: especially when working with a public XML schema like this, it pays to go and look at that schema the same way you would a table, in order to ensure that you're using the correct data types.

Feb 12 2014

SQL Server 2014 and the New Cardinality Estimator

Cardinality, essentially the number of rows the optimizer expects an operation to process, is a calculation predicated on the statistics available for the columns in question. The statistics used are generally either the values from the histogram or the density. Prior to SQL Server 2014, and going all the way back to SQL Server 7.0 (in the Dark Ages when we had to walk uphill to our cubicles through 15 feet of snow, battling Oracle DBAs and Fenris the whole way), there's been one cardinality estimator (although you can modify its behavior somewhat with a trace flag in 2008R2 and 2012). Not any more. For really complex, edge-case queries, there's a possibility that you may run into a regression from this.

You control whether or not you get the new cardinality estimator by setting the Compatibility Level of the database to SQL Server 2014 (120 for the picky amongst us). Since this could lead to regression issues, you're going to pretty quickly want to know if your execution plan is using the new Cardinality Estimation Model, right? It's not tricky: just look at the properties of the first operator in the plan (I told you to use that first operator). You'll find one value there that will tell you what you need to know:

CardinalityEstimationModelVersion

Just check this value (which you can also get from the XML behind the graphical plan) to see what calculations the optimizer used to arrive at the plan you’re observing.
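If you'd rather check a whole batch of plans at once, you can pull that same value out of the cache; here's a quick sketch (assuming SQL Server 2014, where the attribute lives on the StmtSimple element of the plan XML):

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT  dest.text,
        deqp.query_plan.value(N'(//StmtSimple/@CardinalityEstimationModelVersion)[1]',
                              N'int') AS CEModelVersion
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest;

A value of 120 means the plan came from the new estimator; 70 means the old one.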

Sep 16 2013

Why the Lazy Spool Is Bad

First thing, there are no bad operators, just bad parents, uh, I mean query writers, or database designers, or ORM tools. Why do I say this? Because all the operators within a query execution plan serve a purpose. They are there to fulfill a task. Depending on where and when you see them, they’re doing exactly what you ask of them. The issues come up because you’re asking them to do a task that they may not be well suited for. This comes from inappropriate structures and inappropriate code. Lazy spools are not really bad (that was just link bait). In fact, depending on the query, what’s being done, how you’re retrieving data, what data is being retrieved, the lazy spool is actually awesome. But, it’s good to know what these things are doing because, like magic, the lazy spool comes with a price.

The spool operators within SQL Server are indications of the need to store data for reuse. That's it. A spool just gathers up information and puts it into a temp table so that the data can be used again. Key word and trick phrase in that last sentence: temp table. Again, depending on the needs of the query, temporary storage is a great solution. BUT, you are creating more objects in tempdb, with the additional overhead to your disks and everything else associated with that. So if your system is already under load in this area, then, depending on the query in question, you may need to make adjustments in order to eliminate the spool. The use of tempdb is the price you pay for the lazy spool.

A lazy spool is a non-blocking operator. That means it doesn't stop the flow of data while it gets loaded; instead, it acts more or less as a pass-through, letting data flow through it as needed by the downstream operations. That makes a lot of sense when you consider that the lazy spool is loaded in a "lazy" fashion, that is, as requested. The eager spool, for example, stops all operations until it completes loading its temporary object, loading in an "eager" fashion. Once data is loaded into the lazy spool, it can be used by other operators within the plan.

Let's see this in action. I'm going to call a recursive CTE query from AdventureWorks2012:

EXEC dbo.uspGetEmployeeManagers @BusinessEntityID = 9

This query results in the following execution plan.

[Image: Lazy — the full execution plan]

Reading plans that involve spools is where you really see the need to understand both the logical and physical processing order within a plan. The reason I say that is that if you just read this from right to left, the physical order, you're likely to think that the spool is being loaded all the way over there on the right. Instead, it's being loaded by Node ID 5, which I show below.

[Image: LazyDetail — Node ID 5 loading the spool]

This shows how the logical processing order, left to right, like reading a book, has to be taken into account when dealing with query plans. Now, in this case, the spool is how the optimizer is dealing with the recursive nature of the query. It loads data into a spool operator in a lazy fashion, and that data is there when needed later within the operations of the query. In short, the lazy spool in this case is doing a good thing and helping you out.

So, why is the lazy spool bad? Well, it isn't. It's just not free.
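If you want to see that price, SET STATISTICS IO will show it to you. A quick sketch (my addition), using the same procedure as above:

SET STATISTICS IO ON;

EXEC dbo.uspGetEmployeeManagers @BusinessEntityID = 9;

Look for the 'Worktable' entry in the Messages tab; its logical reads are the lazy spool's temporary object being written to and read from in tempdb.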

For more on spools, you can read an older post of mine here, excellent information from Fabiano Amorim here, and an example of forcing a spool to improve performance by Paul White here.

Want to talk more query tuning? Let's get together in Dallas on November 1st, 2013, where I'll be doing an all-day pre-conference seminar before SQL Saturday 255.

Sep 11 2013

Execution Plan Cost Estimates

It's been emphasized over and over that the costs of operations within an execution plan, and the estimated costs of the plans themselves, are, in fact, estimates. But it goes further than that. The estimated values are based on statistics, or the lack thereof, and statistics themselves are also estimates. This means that the costs you're seeing are extrapolations based on extrapolations. So, you should just ignore those values and move on, right? Wrong.

In order to understand how the optimizer is choosing to put together an execution plan for your query, so that you can use that understanding to make intelligent choices about modifying the query or the structure of your database, you must use the values you have at hand. However, you must also understand where and how those values were derived in order to determine how much faith you can put in them (because you are simply placing faith in those numbers). Above all else, this means you must understand your data as presented by the statistics, and what those statistics represent in terms of defining your data.

For a more detailed discussion of what exactly statistics represent, see my article in Simple-Talk.

Next, you need to understand what the operator itself is doing. In the majority of cases, this means just reading the description; they're usually pretty clear. But sometimes they're not. And even when the description of an operator is clear, you need to understand why it's doing what it's doing, not simply what it is. This means drilling down into the properties of the operator (NOTE: not the tool tip) in an attempt to understand what's going on. Boogle or Ging will be a friend here; you can search up descriptions of operators to assist your understanding. You also need to know when something doesn't even have statistics, such as a table variable, so that the optimizer assumes it only has 1 row, because that's going to radically affect the cost estimates displayed for you, even in an actual execution plan. That's right: the costs in an actual plan are still just estimates.
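As a quick illustration of that last point, here's a sketch of my own (it should run against any database on a 2013-era instance):

DECLARE @TableVar TABLE (ID INT PRIMARY KEY);

INSERT INTO @TableVar
SELECT TOP (5000) object_id
FROM sys.all_objects;

-- No statistics exist on the table variable, so the optimizer
-- assumes 1 row for the scan in the following SELECT.
SELECT tv.ID
FROM @TableVar AS tv;

Compare the Estimated Number of Rows with the Actual Number of Rows on that last scan and you'll see exactly the kind of gap I'm talking about.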

I wish I could tell you to rely on these numbers, but you can't. I wish I could tell you to ignore these numbers, but you can't. These are the only numbers you get that show you what's happening to the query inside the optimizer, so you must use them. Just use them with the full knowledge that they are calculations based on other calculations based on extrapolations. In short, a bit of a guess.

Want to talk more about query tuning, execution plans, the optimizer, and statistics? Come see me at the all-day pre-conference seminar before the 2013 SQL Saturday in Dallas this November.

Mar 25 2013

Execution Plans in Azure SQL Database

Microsoft has stated pretty clearly that they're putting code on Azure first, ahead of the desktop, which makes one wonder when we're going to start to see some of this cool stuff within SSMS. What cool stuff, you ask? Well, let me explain. Let's start with a query:

SELECT	m.MovieName,
		msd.MovieStageName,
		st.StageTypeDesc
FROM	dbo.Movie AS m
		JOIN dbo.MovieStage AS ms
		ON m.MovieId = ms.MovieID
		JOIN dbo.MovieStageDefinition AS msd
		ON ms.MovieStageDefinitionId = msd.MovieStageDefinitionId
		JOIN dbo.StageType AS st
		ON msd.StageTypeId = st.StageTypeId
WHERE	m.MovieId = 42;

When I run this on Windows Azure SQL Database (WASD) I get the following execution plan:

[Image: Azure Execution Plan]

Kind of weird, kind of useful, right? The first new thing I can do is zoom in, using that slider bar you see in the lower left, and then the graphical part of the plan looks like this:

[Image: Azure Execution Plan Zoomed In]

The icons have shifted into the corner and you get the costs of every operator. You also get text describing what each operator does. You can scroll around to see the other operators. Pretty cool stuff. Yeah, I know you can zoom in and out in SSMS, but it doesn't modify the icons in any way, making a zoomed-out view all but useless. WASD gets better. Notice the tool bar on the left. I have it here in full size so you can see everything:

[Image: AzureToolBar]

Now things get fun. First, at the top, there is a little arrow pointing to the left. You can hide this tool bar by clicking anywhere within it that's not on one of the icons; the arrow is indicating the possibility of hiding the toolbar. Once it's hidden, the arrow changes to point to the right, and clicking again expands the toolbar. Immediately below that arrow, you see the really fun stuff: Sort by. You'll notice that Total is currently selected. This shows me the total estimated cost for each operator within the plan. But I can modify the display of the plan so that I see only estimated CPU cost or estimated IO cost. The following graphic shows the same plan with CPU selected. See how the estimated cost percentages have all shifted to show the different highlights?

[Image: AzureExecPlanSorted]

This is a great way to rearrange the view of the plan, bringing out different information. The only beef I have with it is that it doesn't really sort the operators so much as change the display. I understand what they mean, but I think the label is misleading. Anyway, useful, but I've got more. Moving down the tool bar, we find the great part: Find by. Currently, in the plans above, you can see that None is selected. But what if I select a different operator, such as Seek:

[Image: AzureExecPlanFind]

Now, each instance of the operator type I selected is highlighted in blue, but the totals for estimated costs are still displayed in red, so I haven't lost any of the display I've set up so far. This is wonderful stuff. As you can see, you only get a few operator types: Warnings, Scan, Seek, Merge Join, Hash Match, Nested Loops, and Sort. But it's a great way to begin to explore the plan in ways that we currently can't within SSMS.

[Image: AzureIcons]

Let's also note the little icons in the upper right of the plan, displayed to the left in full size. Clicking on these completely changes what's presented to us from the execution plan. First, the little grid-looking icon results in this execution plan:

[Image: AzureExecPlanGrid]

What we have are the operator nodes, listed in Node ID order, showing some of the interesting information from the properties in a grid format. And yeah, because it's a grid, I can click on any of the column headers and get a different sort order for the data. For large plans, this can be a quick and easy way to find the highest estimated cost or the largest number of actual rows. The grid isn't a text plan; it's just the same XML data that makes up the execution plan, laid out slightly differently. Great stuff indeed. What if we click on the little bullet-list-looking icon?

[Image: AzureExecPlanBullets]

Now we have a sort of nested display, almost like human-readable XML, and the little blue icons next to the operator names allow you to expand and collapse the layout. I haven't decided how excited I am by this layout because I'm just not sure how I would use it. You can't re-sort the columns, even though this looks like a grid. It's just a way to simplify the display with the collapsible nesting. Still, another tool in the toolbox.

Microsoft is doing some really cool stuff out there on WASD, making the portal functional and useful. Let's hope that some of this translates down to the core product.

If you’re interested in learning more about this kind of thing, you should take a look at the pre-conference seminar at TechEd 2013 that I’ll be putting on with Thomas LaRock (b|t) and Dandy Weyn (b|t) in New Orleans and Madrid, Spain. It’s called How to be a Successful DBA in the Changing World of Cloud and On-Premise Data. We’ll spend all day getting you deep into the guts of Azure SQL Database showing how to administer it, tune it, grow it, and make it work. You don’t want to miss it. Click here for New Orleans or here for Madrid to register for this and TechEd 2013.

I promise you, with Tom, Dandy, and me all talking at once, this will be informative and entertaining.

Jan 30 2013

Azure SQL Database Execution Plan Differences

I've been exploring execution plans in Azure SQL Databases a lot lately. I'm getting a presentation together for some upcoming SQL Saturday events (the first one is SQL Saturday #177, Silicon Valley). At the bottom of this previous post, I mentioned that there were clearly differences in the optimizer, because queries against empty databases were generating different plans. I've since loaded the data into my database, both SQL Server and SQL Database, so I can compare real behaviors. Doing so, I found a fun difference, even though I was running the query and generating the plan from SQL Server Management Studio in both cases.

Here’s the property sheet from the SELECT operator for the query run against SQL Server:

[Image: AzureSelectProperties]

And here’s the property sheet from the SELECT operator for the query run against SQL Database:

[Image: AzureSSMSSelectProperties]

Two differences jump out. The smaller of the two is that, for the same data set, there are differences in the estimated values. I noted that before, but it's fun to see it coming up again. By the way, if a good estimate could be defined as "closest without going over," then the SQL Database optimizer, with its estimate of 1.9, beats the SQL Server value of 2.1. Just sayin'. Anyway, the big thing that you may have noticed was the NonParallelPlanReason with a value of CouldNotGenerateValidParallelPlan. Oooh, cool! Not that I would have expected a parallel plan for an estimated cost of .0149347. Assuming SQL Database uses the default 'cost threshold for parallelism' of 5, that's pretty far off (and I hope they don't still use that ancient value, nor should you). Some enterprising individual could try to suss out how much cost is needed to get a parallel plan (I don't have time for that right now). More to come on execution plans and query tuning in Azure SQL Databases.
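By the way, on a regular SQL Server instance you can check that threshold directly; as far as I know, SQL Database doesn't let you see or change it. A quick sketch:

SELECT  c.name,
        c.value_in_use
FROM    sys.configurations AS c
WHERE   c.name = 'cost threshold for parallelism';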

Jan 23 2013

Saving Execution Plans on Azure SQL Database

In my previous post showing how to get to execution plans in the Database Management Portal, I showed that it's pretty easy to put a query in a query window and get the execution plan. This allows you to understand query behavior in order to tune your T-SQL or your data structures, all through the Azure interface. But what happens if you want to share an execution plan with a friend, post it to an online forum, save it for later comparison as part of troubleshooting bad parameter sniffing, track behaviors over time as statistics change, or any other purpose that I can't think of at the moment?

To answer this question, let me first tell you how you would do these things in SQL Server Management Studio (SSMS). First, and most obviously, when you look at the plan in SSMS, you can just right click on it and select "Save Execution Plan As…" from the context menu. This will allow you to create a .sqlplan file on your OS. From there, do what you want. The file saved is XML, but the .sqlplan format is read by SSMS as a graphical plan. You can look at the underlying XML any time you want by right clicking and selecting the appropriate context menu option.

And Azure SQL Database?

Well, things are a little different. Let's start with the query. After you execute the query to get an actual plan, or call up the estimated plan, you'll see the plan in the Portal.

[Image: SaveQueryPlanAndQuery]

You can see the query, the plan, and options with the plan (to be explored in a later post). If you right click on the plan as you would within SSMS, the context menu is singularly unhelpful, since it will only have one option: "Silverlight" (yes, we're still running Silverlight in this part of Azure; don't shoot the messenger). So I guess there's no way to save that plan, right? Wrong. Look up at the top of the image above. You'll see a button that looks like a 3.5" floppy disk (kids, get granddad to explain what that is) with the phrase "Save As" underneath it. Click on that and you can save your query. Ah, but there's a drop down under file types, and there you'll see "SQL Plan File."

[Image: SaveSaveAs]

Provide a file name and you can save the execution plan down to your machine. Easy.

Ah, but there is a catch. Let’s say we query the Dynamic Management Objects to pull plans out of the cache. This query will run on both SQL Server and Azure SQL Database:

SELECT	deqp.query_plan
FROM	sys.dm_exec_query_stats AS deqs
		CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
		CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE	dest.text LIKE 'SELECT	m.MovieName,
		msd.MovieStageName,
		st.StageTypeDesc%';

The results, when run from SSMS, will look something like this:

[Image: SaveURL]

That's an embedded URL that you can click, which will then open the execution plan from the cache. But when the same query is run from the SQL Database Management Portal, the results look like this:

[Image: SaveNoURL]


No options for a clickable interface here to open the plans. You can try copying the output, pasting it into Notepad, and saving it as a .sqlplan file. Of course, that's only going to work for small plans, since the result set here is not unlimited. You could build an app to query the cache and output to a file. But, in general, querying the plan cache directly from the SQL Database Management Portal gives you very few options for displaying your query plans.

Since this is such an important part of query tuning, here’s hoping that we get further enhancements to the Management Portal in the future.

Oct 22 2012

Clustered Indexes Have Statistics Too

It may seem obvious, but I've heard more than one person suggest to me that statistics on a clustered index just don't matter; that if the clustered index can satisfy a given query, it's going to get selected. That just didn't make any sense to me, but I hadn't seen anyone set up a test that shows how it might work one way or the other. So, here you go.

First, I'm going to create a table and load it up with data. I'm intentionally using strings because I don't want the ease with which integers are managed within indexes to confuse the issue. I also went for one column that would have a very attractive set of statistics and one that would have a very ugly set. Also, because we're only dealing with two columns at any given juncture, either a clustered or a non-clustered index would be a covering index. Finally, I didn't mark the clustered index as unique because I wanted the non-selective clustered index and the highly selective clustered index to both have to deal with that extra bit of processing. Here's how I set up the table and the data:

CREATE TABLE dbo.IndexTest (
    SelectiveString VARCHAR(50),
    NonSelectiveString VARCHAR(2));

WITH Nums
AS (SELECT TOP (100000)
        ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS n
    FROM master.sys.all_columns AS ac
        CROSS JOIN master.sys.all_columns AS ac2)
INSERT INTO dbo.IndexTest
    (SelectiveString,
     NonSelectiveString)
SELECT n,
       CASE WHEN n % 3 = 0 THEN 'ab'
            WHEN n % 5 = 0 THEN 'ac'
            WHEN n % 7 = 0 THEN 'bd'
            ELSE 'aa'
       END
FROM Nums;

From there I created the first two indexes:

CREATE CLUSTERED INDEX ClusteredSelective ON dbo.IndexTest
(SelectiveString);

CREATE NONCLUSTERED INDEX NonClusteredNonSelective ON dbo.IndexTest
(NonSelectiveString);

Then I ran each of these queries, both of which are actually going after fairly selective bits of data, although only relatively speaking in the case of the second query:

SELECT * FROM dbo.IndexTest AS it
WHERE SelectiveString = '2323';

SELECT * FROM dbo.IndexTest AS it
WHERE NonSelectiveString = 'aa';

This resulted in the following two execution plans:

As you can see, the clustered index was used in the first query. That makes sense because we're querying against the clustered key and it's a very highly selective key. The second query, despite being against a fairly non-selective key, 48,000 rows out of 100,000, used the non-clustered index. If I drop the non-clustered index and use just the cluster for the second query, the number of reads goes from 110 to 299, despite the fact that the same data is being returned. Clearly, there's a huge advantage to how data is ordered. Also, clearly, the fact that the statistics suggest the cluster can't immediately satisfy the query makes the optimizer choose other options. But what happens if we change the indexes like this:

CREATE NONCLUSTERED INDEX NonClusteredSelective ON dbo.IndexTest
(SelectiveString);

CREATE CLUSTERED INDEX ClusteredNonSelective ON dbo.IndexTest
(NonSelectiveString);

Then, when I rerun my queries, I get these execution plans:

At least to my mind, it's pretty clear. The statistics for the cluster clearly help the optimizer decide whether that index is useful. Yeah, if I drop the nonclustered indexes and then run the queries, the clustered index is always used, but that's not because the cluster is selective or not; it's because the cluster is the table.

I'm not sure where this concept that the statistics of a clustered index don't matter came from, but, from these tests, it seems that they do.
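And if you want to see those statistics for yourself, every index, clustered or not, gets a statistics object you can inspect. A quick sketch against the test table above:

-- The histogram behind the clustered index:
DBCC SHOW_STATISTICS ('dbo.IndexTest', ClusteredNonSelective);

-- Every statistics object on the table, the clustered index included:
SELECT  s.name,
        s.stats_id,
        s.auto_created
FROM    sys.stats AS s
WHERE   s.object_id = OBJECT_ID('dbo.IndexTest');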

And remember, in just a couple of weeks I'll be doing 7 hours of query performance tuning instruction at the PASS Summit. You can sign up, as far as I know, right up to the day of the event. The name of the session is Query Performance Tuning: Start to Finish. I cover gathering metrics, understanding the optimizer, reading execution plans, and tuning queries. It's a beginner-to-intermediate course. It should be a lot of fun. Go here to register. I hope to see you there.

Aug 01 2012

Execution Plans … In the Cloud!

If you're moving to a fully fledged SQL database as part of Azure, you may never even touch SQL Server Management Studio (SSMS). Instead, you may do all the management of your SQL database through the Management Portal. You can actually do everything you need to through this tool, including writing and testing T-SQL statements. Believe it or not, it also includes execution plans. But if you're used to seeing this from a query:

[Image: the plan as displayed in SSMS]

Then you may be a little bit surprised at this:

[Image: the same plan as displayed in the Management Portal]

Believe it or not, those are identical plans. Yeah, the graphics are different, radically so. But the plans, those are the same.

I have a small test database that I've loaded onto both my local server and my Azure SQL database. It has identical structures and, thanks to SQL Data Generator, identical data sets. This means that when I run a query, assuming the statistics are the same, I should get identical plans.

For this post, I’m just going to scratch the surface and take a look at plans for this query:

SELECT * FROM dbo.Movie AS m
WHERE m.MovieId = 4138;

As you can see, both plans include the SELECT operator and a Clustered Index Seek operator. But, as I'm constantly saying, just looking at the graphical part of the plan is not enough. You have to look at the properties. Let's start with the SELECT operator. Here are the properties from SSMS:

[Image: SELECT operator properties in SSMS]

For comparison purposes, here’s what we get from the Management Portal. I’m going to post it up in pieces so that we can see it all.

When you click on an operator within the portal, you'll see a small set of properties, almost like the tool tip in SSMS. There's also a button there, "View More." Clicking on that expands out the full set of properties. Here's the first screen shot:

[Image: SELECT operator properties in the Management Portal, first screen]

Obviously the order is different. But there are also some data differences. For example, the NodeID. In a plan in SSMS, the nodes are numbered in physical processing order, starting at zero. It's one way to tell how the query was processed. In the Management Portal, you can see that the first node has an ID of 1 and the second an ID of 2; this follows the logical processing order. Also, the SELECT operator is given a NodeID here, which it does not have, even in the XML, within SSMS. The other information is pretty standard and exists in both. I'll keep scrolling down:

[Image: Management Portal properties, continued]

Here there are more differences. At the top is Statement Id. This is not displayed within the properties of an SSMS plan, but it is available within the XML. It just shows which statement you're working with if you have multiple T-SQL statements. Other pieces are the same, including exactly the same subtree cost (which means the statistics are identical and the cost is estimated by the optimizer in both Azure and my SQL Server 2012 instance in the same way). Both plans were TRIVIAL and used simple parameterization. The Optimizer Hardware Dependent Properties show differences, but that's to be expected. However, it does give you a glimpse of what is exposed to SQL Server within the Azure system. For example, the max degree of parallelism for this system is 3. I've got an EstimatedAvailableMemoryGrant of 4500062. Compare that with the 208838 on my 16GB laptop and you can see how much bigger the hardware is for the Azure system. Let me scroll down a little more:

[Image: Management Portal properties, continued]

The last major difference is more visible in this screen capture than in the last one. The "Reason for NonParallel Plan" is not visible in the 2012 properties; it's not even in the XML. This is new information exposed about this query and the execution plan generated.

With this one operator we can see that these plans, while similar, are not identical. I’ll post more about these in the coming weeks.
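By the way, if you'd rather get at these properties without clicking through property sheets, they're right there in the plan XML. Here's a sketch that pulls the memory grant estimate out of the cache (assuming the 2012-and-later showplan schema, where OptimizerHardwareDependentProperties shows up):

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT  dest.text,
        deqp.query_plan.value(
            N'(//OptimizerHardwareDependentProperties/@EstimatedAvailableMemoryGrant)[1]',
            N'bigint') AS EstimatedAvailableMemoryGrant
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest;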

For more details on query tuning and execution plans, including Azure, please attend my pre-conference seminar at the PASS Summit 2012.

Jul 02 2012

Querying Information from the Plan Cache, Simplified

One of the great things about the Dynamic Management Objects (DMOs) that expose the information in the plan cache is that, by their very nature, they can be queried. The plans exposed are in XML format, so you can run XQuery against them to pull out interesting information.

For example, what if you wanted to see all the plans in cache that had a Timeout as the reason for early termination from the optimizer? It'd be a great way to see which of your plans were less than reliable. You could do so like this:

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
QueryPlans AS
(
SELECT  RelOp.pln.value(N'@StatementOptmEarlyAbortReason', N'varchar(50)') AS TerminationReason, 
        RelOp.pln.value(N'@StatementOptmLevel', N'varchar(50)') AS OptimizationLevel, 
        --dest.text, 
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, 
                  (deqs.statement_end_offset - deqs.statement_start_offset) 
                  / 2 + 1) AS StatementText, 
        deqp.query_plan, 
        deqp.dbid, 
        deqs.execution_count, 
        deqs.total_elapsed_time, 
        deqs.total_logical_reads, 
        deqs.total_logical_writes 
FROM    sys.dm_exec_query_stats AS deqs 
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest 
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp 
        CROSS APPLY deqp.query_plan.nodes(N'//StmtSimple') RelOp (pln) 
WHERE   deqs.statement_end_offset > -1        
)   
SELECT  DB_NAME(qp.dbid), 
        * 
FROM    QueryPlans AS qp 
WHERE   qp.TerminationReason = 'Timeout' 
ORDER BY qp.execution_count DESC ;


I posted a similar version of this query once before (although I think that one is a little broken). It works fine… But…

This query takes 25 seconds, and a big chunk of that is parsing the XML. What if, for a simple query like this, where I'm not doing a lot of conversion and processing with the XML, we ignored the XML and went instead to something like this:

SELECT  DB_NAME(deqp.dbid), 
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, 
                  (CASE deqs.statement_end_offset 
                     WHEN -1 THEN DATALENGTH(dest.text) 
                     ELSE deqs.statement_end_offset 
                   END - deqs.statement_start_offset) / 2 + 1) AS StatementText, 
        deqs.statement_end_offset, 
        deqs.statement_start_offset, 
        deqp.query_plan, 
        deqs.execution_count, 
        deqs.total_elapsed_time, 
        deqs.total_logical_reads, 
        deqs.total_logical_writes 
FROM    sys.dm_exec_query_stats AS deqs 
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp 
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest 
WHERE   CAST(deqp.query_plan AS NVARCHAR(MAX)) LIKE '%StatementOptmEarlyAbortReason="TimeOut"%';


Now we're no longer hooked into parsing the XML. But, surprisingly, performance is not much better, and it was sometimes worse in my tests. That probably has something to do with performing a function on a column: the CAST of the query_plan column from XML to NVARCHAR(MAX). What can you do?

Well, there is one other place where execution plans are kept: sys.dm_exec_text_query_plan. Things are a little different in there. Instead of a plan with multiple statements in it, each of these plans is for an individual statement. This is why you must pass in the start and end offsets when you call the function. That changes the result sets a little. You get fewer rows back, but you also get a lot less duplication, and we don't have to cast anything in the WHERE clause. Let's check it out:

SELECT  DB_NAME(detqp.dbid), 
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, 
                  (CASE deqs.statement_end_offset 
                     WHEN -1 THEN DATALENGTH(dest.text) 
                     ELSE deqs.statement_end_offset 
                   END - deqs.statement_start_offset) / 2 + 1) AS StatementText, 
        CAST(detqp.query_plan AS XML), 
        deqs.execution_count, 
        deqs.total_elapsed_time, 
        deqs.total_logical_reads, 
        deqs.total_logical_writes 
FROM    sys.dm_exec_query_stats AS deqs 
        CROSS APPLY sys.dm_exec_text_query_plan(deqs.plan_handle, 
                                                deqs.statement_start_offset, 
                                                deqs.statement_end_offset) AS detqp 
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest 
WHERE   detqp.query_plan LIKE '%StatementOptmEarlyAbortReason="TimeOut"%';


Performance on my system dropped from 30 seconds on average to 8 seconds on average. That's a win by any measure. If you worked out a way to eliminate that wildcard LIKE search, it would be even better. Note the CAST(detqp.query_plan AS XML) in the SELECT list above: to be able to click on the query_plan column and see a pretty graphical execution plan, I just have to CAST the text back to XML, and that's not adding to the overhead of the query.

If you're looking to search within your query plans, you're still likely to be better off using XQuery to run sophisticated searches against the data, but for really simple stuff, sys.dm_exec_text_query_plan may offer you a faster alternative.
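For example, here's a sketch of the sophisticated XQuery side of that trade-off, using exist() to find every cached plan that contains a clustered index scan:

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT  dest.text,
        deqp.query_plan
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   deqp.query_plan.exist(N'//RelOp[@PhysicalOp = "Clustered Index Scan"]') = 1;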