May 31 2012

Never, Ever Use Clustered Indexes

This whole concept of the clustered index as a foundational structure within SQL Server is just plain nuts. Sure, I get the concept that if a table has a clustered index, then that index actually becomes the table. When you create a clustered index on a table, the data is now stored at the leaf level of the balanced tree (B-tree) page distribution for that index, and I understand that retrieving the data using a seek on that index is going to be extremely fast because no additional reads are necessary, unlike what would happen with a non-clustered index on a heap table.

Yes, I get that if I store my data in a heap, the only way to access the data is through the Index Allocation Map (IAM) pages that define extents, and this means that I don’t get the doubly linked list of pages that occurs within clustered indexes. I know that having to read the IAM leads to additional reads for a heap in order to find the locations of the data on the disk.

I realize that updating or deleting a row in a clustered index is helped by being able to use the index itself to find the exact row that needs to be modified or removed. I’ve also seen the tests that show that clustered indexes work faster on inserts in the overwhelming majority of situations within SQL Server. But I still want you to stop using clustered indexes on all your tables within SQL Server. Why? Because that’s how Oracle databases are mostly designed.

I hope you’ve figured out by now that I’m joking about tossing out clustered indexes within your SQL Server databases. I do believe that, unless you have a very thoroughly tested exception, every table within SQL Server should have a clustered index for some of the reasons that I’ve listed above, as well as several others. But Oracle DBAs design their systems differently.
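If you want to see which tables in a database are currently heaps, here is a minimal sketch using the catalog views. It assumes you run it in the database you care about. The table and column names in the commented-out CREATE INDEX are hypothetical, purely for illustration.

```sql
-- List every user table that is stored as a heap (no clustered index).
-- In sys.indexes, a heap is represented by a row with type = 0.
SELECT  SCHEMA_NAME(t.schema_id) AS SchemaName,
        t.name AS TableName
FROM    sys.tables AS t
        JOIN sys.indexes AS i
            ON i.object_id = t.object_id
WHERE   i.type = 0
ORDER BY SchemaName,
        TableName ;

-- Hypothetical fix for one of the tables returned above:
-- CREATE CLUSTERED INDEX CIX_MyTable_MyKey ON dbo.MyTable (MyKey) ;
```

Which column makes a good clustering key is a design decision for each table, so treat the list as a starting point for review, not as a to-do list to script blindly.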

When I see a vendor that makes a product that is exactly the same on Oracle, SQL Server, and possibly DB2 or MySQL, I have to ask myself, just how well is that system going to perform? When I hear someone tell me to design the system using lowest common denominator T-SQL because “we don’t want to be locked into a particular vendor,” I have to wonder, again, how are we going to make this system perform? Because, if Oracle likes heaps, but SQL Server likes clusters, how do you design for both? I’d say you can’t.

In fact, I’d argue that you need to design precisely for specific relational database management systems because, let’s face it, they don’t implement the fundamentals in the same way. If you mess up the fundamentals, you’ve just messed up your entire design.

Jun 20 2011

Make the Optimizer Work Harder

One of my favorite indicators for whether or not you have a good execution plan is when you see the “Reason for Early Termination” property in the TSQL operator like this:

[image: the “Reason for Early Termination” property]

The optimizer considered this particular plan “Good Enough,” which is what you want to see. When you see “Timeout” as the reason, that’s an indication that the plan you have may be sub-optimal. The question is, can you make the optimizer spend more time on your queries? Well, actually, the question is, should you make the optimizer spend more time on queries? During my session on SQL Cruise I answered the original phrasing of that question with a no. As usual when I present in front of people smarter than I am, I was wrong. Brent Ozar (blog|twitter) pointed out that there was a trace flag for forcing the optimizer to spend more time on queries, 2301.

According to Microsoft you can set this trace flag on your system or per user session. Either way, it doesn’t simply make the optimizer spend more time. In fact, what it does is turn on a whole new set of possible optimizations, which causes the optimizer to spend more time. What optimizations, you ask? Here’s an excellent article by Ian Jose (blog) outlining exactly what you’re enabling by turning on this trace flag.

Do I recommend that you enable this trace flag if you’re looking at Timeout as the early terminator for your execution plan? Nope. Not at all. I recommend you spend time tuning that query. Break it down into smaller pieces. Not that I like hints, but see if a query hint will solve the issue. If none of those approaches work, I’d at least consider testing trace flag 2301. But even before you do that, I’d validate that any of the additional optimizations outlined by Ian Jose are applicable to your issue. If you are not facing those specific situations, setting this trace flag could hurt your performance.
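If you do get as far as testing it, here is a minimal sketch of turning the flag on for just your own session, on a test system. It assumes you have the permissions DBCC TRACEON requires, and the query placeholder in the middle is yours to fill in.

```sql
-- Enable trace flag 2301 for the current session only.
DBCC TRACEON (2301) ;

-- Confirm the flag is active for this session.
DBCC TRACESTATUS (2301) ;

-- ... run the problem query here and compare the plan and timings ...

-- Turn the flag back off when the test is done.
DBCC TRACEOFF (2301) ;
```

Adding -1 as a second argument, DBCC TRACEON (2301, -1), sets the flag globally instead. Given the warning above, keep it scoped to the session while testing.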

There’s surprisingly little documentation on this out there. The one story on it I found comes from Brent (which is evidently why he knew about it). He turned it on, which solved a problem, and then had to turn it back off because it created others.

This particular trace flag definitely sounds like you need to apply primum non nocere as your guiding principle.

May 30 2011

Optimizer Timeouts with XQuery

I was looking at performance of a database and I noticed a few of the plans were very large and timing out in the optimizer. This made me wonder, just how many of them were timing out?

This sounds like a job for XQuery!

There’s really nothing to it. Once you start plucking stuff out of the execution plans using XQuery, it’s kind of hard to stop. So here’s my little bit of code.

[sourcecode language="sql"]WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),  QueryPlans
AS  ( 
-- Pull the optimizer's termination reason and optimization level out of each cached plan
SELECT  RelOp.pln.value(N'@StatementOptmEarlyAbortReason', N'varchar(50)') AS TerminationReason,
        RelOp.pln.value(N'@StatementOptmLevel', N'varchar(50)') AS OptimizationLevel,
        --dest.text,
        -- Offsets are in bytes of Unicode text, hence the division by 2
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,
                  (deqs.statement_end_offset - deqs.statement_start_offset)
                  / 2 + 1) AS StatementText,
        deqp.query_plan,
        deqp.dbid,
        deqs.execution_count,
        deqs.total_elapsed_time,
        deqs.total_logical_reads,
        deqs.total_logical_writes
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY deqp.query_plan.nodes(N'//StmtSimple') RelOp (pln)
WHERE   deqs.statement_end_offset > -1        
)   
SELECT  DB_NAME(qp.dbid),
        *
FROM    QueryPlans AS qp
-- 13 was the ID of the database I was investigating; substitute your own
WHERE   (qp.dbid = 13 OR qp.dbid IS NULL)
        AND qp.OptimizationLevel = 'FULL'
ORDER BY qp.execution_count DESC ;
[/sourcecode]

Now for the mandatory warning: XQuery like this can be somewhat processor intensive. I wouldn’t suggest running this on a production system unless you put more filters in place to trim the data down a bit.
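As one possible way of trimming things down, here is a sketch that only shreds the XML for a limited number of cached plans and then counts statements by termination reason, which gets at the original question of how many are timing out. The cutoff of 50 plans and the ordering by total elapsed time are arbitrary assumptions on my part. Adjust both to suit your system.

```sql
WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
TopPlans
AS  (
-- Limit the expensive XML work to the 50 plans with the most elapsed time
SELECT TOP (50)
        deqs.plan_handle
FROM    sys.dm_exec_query_stats AS deqs
GROUP BY deqs.plan_handle
ORDER BY SUM(deqs.total_elapsed_time) DESC
),
Reasons
AS  (
-- XML methods are not allowed in GROUP BY, so extract the value here first
SELECT  Stmt.pln.value(N'@StatementOptmEarlyAbortReason', N'varchar(50)') AS TerminationReason
FROM    TopPlans AS tp
        CROSS APPLY sys.dm_exec_query_plan(tp.plan_handle) AS deqp
        CROSS APPLY deqp.query_plan.nodes(N'//StmtSimple') AS Stmt (pln)
)
SELECT  ISNULL(r.TerminationReason, 'None recorded') AS TerminationReason,
        COUNT(*) AS StatementCount
FROM    Reasons AS r
GROUP BY r.TerminationReason
ORDER BY StatementCount DESC ;
```

Statements where the optimizer recorded no early abort reason come back as NULL from the XML, which is why they are relabeled in the final SELECT.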

Jul 17 2009

Execution Plan Compile Termination

Recently I’ve been seeing a lot of people with bad execution plans, desperately trying to tune them, but they were unable to explain why they had such bad plans. More often than not these were larger queries, with a number of derived tables, CTEs, CROSS APPLY, etc. In most cases the statistics appeared to be fine (this is usually checked by comparing estimated & actual rows within the operations in the execution plan) and the plans themselves didn’t look crazy, but the execution plans were, flat out, not good.

If you’re looking at a plan and it doesn’t make much sense, one possibility that most people don’t check is that SQL Server didn’t have enough time to complete optimization. The optimizer is a pretty amazing bit of code. The scary volume of work it does in frighteningly short periods of time is quite awesome. However, it doesn’t always get done. To quickly check this situation you need to determine the “Reason For Early Termination Of Statement Optimization.” In the graphical execution plans this is found by going to the final operator in the plan, usually an INSERT, UPDATE, DELETE, or SELECT operator. Most people know to hover the mouse over the operator & get interesting little facts, but most people forget about checking the properties. Right click on this, or any other operator, and select Properties from the context menu. This will open the properties window, which looks something like this:

[image: Properties window]

Right near the bottom you can see that this execution plan is showing an early termination reason of “Time Out.” That means that the optimizer didn’t finish optimizing. When the optimizer is running, it tries a whole series of different joins, join orders, filters, what have you, all through some mathematical process that I’m not even remotely qualified to describe. But if it runs out of time, it goes with the last one it had. The last one it had might be the worst possible plan for this query, but because it hit a time out…

You can also find this right in the XML for the execution plan. In the Batch/Statements/StmtSimple element, find the StatementOptmEarlyAbortReason attribute. It will show the same thing, “Time Out,” when the optimizer has run out of time.
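To make that concrete, here is a small self-contained sketch that reads the attribute out of a plan held in an XML variable. The plan fragment is a hand-written, stripped-down sample for illustration, not a real plan, and it assumes SQL Server 2008 or later for the DECLARE with an initial value. Note that in the XML itself the value is written as a single word, TimeOut.

```sql
-- Hand-written sample fragment; a real plan has far more content.
DECLARE @plan xml = N'<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementText="SELECT 1" StatementOptmLevel="FULL" StatementOptmEarlyAbortReason="TimeOut" />
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>' ;

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT  Stmt.pln.value(N'@StatementText', N'varchar(200)') AS StatementText,
        Stmt.pln.value(N'@StatementOptmLevel', N'varchar(50)') AS OptimizationLevel,
        Stmt.pln.value(N'@StatementOptmEarlyAbortReason', N'varchar(50)') AS TerminationReason
FROM    @plan.nodes(N'/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS Stmt (pln) ;
```

With a real plan you would load @plan from a saved .sqlplan file or from sys.dm_exec_query_plan instead of a literal.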

You can get other early abort reasons, but the best one is “GoodEnoughPlanFound.” That means that the optimizer was able to complete its job. That doesn’t mean you have a good plan, but it means that you don’t have some interim pile of junk that is doing you more harm than good.

What can you do to fix this? Write simpler queries.

OK. Maybe that’s not helpful. Try to simplify the query. You may not be able to. If not, you might try rearranging the query, changing the join order. Yes, as everyone says, join order doesn’t matter, but the full statement is “Join order doesn’t normally matter.” Sometimes it can make a real difference. You may also have to supply some query hints, join hints, etc., to try to nudge the optimizer along.
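As an example of the kind of nudge I mean, here is a sketch of a query hint that tells the optimizer to keep the join order exactly as written, which also removes join reordering from the list of things it has to explore. The tables and columns are hypothetical, and whether the written order is actually a good one is something you would have to test.

```sql
-- Hypothetical tables, for illustration only.
SELECT  c.CustomerName,
        o.OrderDate,
        od.Quantity
FROM    dbo.Customers AS c
        JOIN dbo.Orders AS o
            ON o.CustomerID = c.CustomerID
        JOIN dbo.OrderDetails AS od
            ON od.OrderID = o.OrderID
WHERE   c.CustomerID = 42
OPTION  (FORCE ORDER) ;
```

Like any hint, this takes a choice away from the optimizer for good, so check the early termination reason and the actual performance again after adding it.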

Apr 01 2008

Performance Studio

I just found out about some new functionality coming out in SQL Server 2008 called Performance Studio. It’s actually largely a framework around which you can build performance monitoring routines for an entire enterprise. This sounds terrific. I’m going to dig into it a bit and make it my presentation for the Heroes {Community} Launch event at SNESSUG next week. Here’s a Technet webcast on the topic. Here’s a very nice blog entry over at SQLTeam (I suppose I should add them to my blog roll) discussing the function of the Data Collector, the foundation for this new framework. Performance Studio only works with 2008 systems though, so that’s something to take into account. Although I see an interview with Brad McGehee that says it’s not enterprise ready. Another something to take into account.