I almost forgot to tell you about the Database Administration Virtual Chapter meeting next week, March 26th, 2014. I’ll be doing a talk about query tuning in Windows Azure SQL Database. It’s a talk I’ve given before (it was in the top 10 at the PASS Summit last year). Come find out why you’ll need to tune queries in WASD, the tools you get, and the glorious fact that you’ll actually be actively saving your business money by tuning queries! Click here now to register.
It really is so much easier to just throw hardware at badly performing databases. Just buy a bigger, faster server with more and faster disks and you can put off doing tuning work for another 6-9 months, easily. But, for most of us, sooner or later, our performance problems get so big or, we just don’t have any more money to spend, and we’re stuck. We have to tune the queries. And frankly, query tuning is a pain in the nether regions.
But, after you’ve tuned queries 20 or 30 times, you start to recognize the patterns and it gets easier (never easy, just not as hard). But, if you haven’t done it 20 or 30 times, what do you do? My suggestion, talk to someone who has done it 30 times (or even a couple of hundred times), like me for example.
I have an all day session on tuning queries. It goes from understanding how the optimizer works (which will automatically lead you to write better queries), to how to gather performance metrics (so you know where the pain points are located), to reading execution plans (you need to know what has gone wrong with the query) to various mechanisms for fixing the query. This information is applicable to systems from SQL Server 2005 to SQL Server 2014 (sorry everyone still on 2000, it’s time to upgrade). The session is based on the books I’ve written about query tuning and execution plans, plus years and years of doing lots of query tuning.
Right now I’ve got two events scheduled. Before SQL Saturday #286 in Louisville, KY, I’ll be putting on this precon. We’re limited to seating, so don’t wait. You can go here to register. Then we can get together the next day at the SQL Saturday event to get some more education from all the great speakers there. Next, before SQL Saturday #302 in Albany, NY (their first one, ever), I’ll be hosting this. You can register by clicking here. Don’t miss the early bird special. Again, the next day will be filled with learning at the SQL Saturday event.
I’m working on taking this to other locations and venues. If you’re interested, please get in touch. I’ll do what I can to come to you.
If you have a particularly thorny query, bring it along with an actual execution plan. If we have time at the end of the day, I’ll take a look and makes suggestions, live (uh, please, no sensitive patient data or anything like that).
Let’s get together and talk query tuning.
Ever had that moment where you start getting errors from code that you’ve tested a million times? I had that one recently. I had this little bit of code for pulling information directly from query plans in cache:
WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'), QueryPlans AS ( SELECT RelOp.pln.value(N'@PhysicalOp', N'varchar(50)') AS OperatorName, RelOp.pln.value(N'@NodeId',N'integer') AS NodeId, RelOp.pln.value(N'@EstimateCPU', N'decimal(10,9)') AS CPUCost, RelOp.pln.value(N'@EstimateIO', N'decimal(10,9)') AS IOCost, dest.text FROM sys.dm_exec_query_stats AS deqs CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp CROSS APPLY deqp.query_plan.nodes(N'//RelOp') RelOp (pln) ) SELECT qp.OperatorName, qp.NodeId, qp.CPUCost, qp.IOCost, qp.CPUCost + qp.IOCost AS EstimatedCost FROM QueryPlans AS qp WHERE qp.text = 'some query or other in cache' ORDER BY EstimatedCost DESC;
I’ve probably run this… I don’t know how many times. But… I’m suddenly getting an error:
Msg 8114, Level 16, State 5, Line 7
Error converting data type nvarchar to numeric.
What the hell? There is no where this should be occurring. I dig through the query over and over and I can’t figure it out. Until… I finally notice that one of the operators in the plan has the CPUCost value stored as a float:
Ummmm, since when? Since forever. I’ve just been lucky with my code. I’d just never hit a sufficiently small cost in the plans before. I hadn’t bothered to look for the actual data type in use in the schema definition, although it’s right there:
<xsd:attribute name=”EstimateCPU” type=”xsd:double” use=”required”/>
What did I do wrong? I was looking at the data output from the queries and in the plans rather than looking at the structure to know what to expect. It’s the kind of thing I would never do with T-SQL. I would always look to the table structure to know what data type a given column was. But in this case, with the XML, I looked at the data and made an assumption. And we all knows what that means. It makes an ass out of you and mption.
Or, in this case, me and mption.
Anyway, the corrected query is a pretty trivial change:
WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'), QueryPlans AS ( SELECT RelOp.pln.value(N'@PhysicalOp', N'varchar(50)') AS OperatorName, RelOp.pln.value(N'@NodeId',N'integer') AS NodeId, RelOp.pln.value(N'@EstimateCPU', N'float') AS CPUCost, RelOp.pln.value(N'@EstimateIO', N'float') AS IOCost, dest.text FROM sys.dm_exec_query_stats AS deqs CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp CROSS APPLY deqp.query_plan.nodes(N'//RelOp') RelOp (pln) ) SELECT qp.OperatorName, qp.NodeId, qp.CPUCost, qp.IOCost, qp.CPUCost + qp.IOCost AS EstimatedCost FROM QueryPlans AS qp WHERE qp.text = 'SELECT * FROM HumanResources.vEmployee AS ve' ORDER BY EstimatedCost DESC;
But I do feel bad if anyone has been using this the way I showed it. ‘Cause, yeah, I’ve demonstrated with this code in the past. Oops. However, great point. Especially when working with a public XML schema like this, it pays to go and look at that schema the same way you would a table in order to ensure that you’re using the correct data types.
SQL Saturday’s are awesome! Let’s get that clear up front. The organizers of SQL Saturday events are glorious individuals. Let’s get that clear too.
I want to be up front about those things because, well, I’m going to be critical.
First though, I want to establish my bona fides for what I’m about to say. I helped organize two SQL Saturday events and two other local events before those. I also help Red Gate Software run half-day seminars all over the country. So, I have some idea what goes into the organizational side of these things. I’ve presented at eleven SQL Saturday events in just the last year. I’m on the schedule for, I think, 6 more between now and August. So, I think I have some idea what it’s like to be a speaker at the events. And, I work for a vendor who puts money and swag up at the events in order to get some advertising. Which gives me some ideas behind what makes the vendors happy too. Further, I’m one of the community. I attend the sessions, talk to the sponsors, take part in the after events, the whole magilla. I don’t think any of this makes me an expert or makes my voice more important than anyone else, but it all comes together to show that I’m not utterly clueless in my opinions (which, I know the adage, opinions are like certain body parts, everyone has one, and they all stink).
Organizers, I’ve seen this issue a lot and it’s just getting worse. This issue is going to hurt you with, in no particular order; speakers, sponsors and the community. What are you doing? You’re trying to have it all.
You want sponsors, right? The sponsors frequently ask for one thing… please, please, please, let us do a presentation so we can show off how wicked awesome our products are to the most motivated people in the region (yeah, the people who are giving up a Saturday to learn technology for their jobs are the best people in the area where that SQL Saturday is taking place). So, you agree to letting the sponsors have a talk… ooh, but when to schedule it?
You want BIG NAME speakers, right? Although I’m absolutely convinced that big name speakers don’t really draw people to your event. Good sessions, usually defined by good session titles, draw people to your event (and good communication on your part through various venues and… well, that’s a different discussion I’ll leave for Karla). But, the belief is there, so people try to get Brent Kline and Kendal Ford and Jes Misner to come speak at their event. BUT, you also want to meet the needs of the local community so you can grow new speakers, so you’re going to take in a bunch of new people too… ah, but how do you schedule that?
And you’re committed to your community too, right? And one of the best ways to show your commitment to your community is to host a panel at lunch. The most common panel is Women in Technology, but I’ve also seen or heard about panels on educating young people, charities, user groups, and all sorts of things. Great stuff really and a big part of why the SQL Family is so wonderful. We really do try to help each other out. We really do care, and those panels give people a chance to communicate what they’ve done to others who may want to contribute in the same way. Ah… but when can we schedule this panel?
By now, I’ll bet many of you know what I’m about to say. But, before I say it, let me point out one more thing. SQL Saturday’s are all day affairs. And if anyone goes to the entire thing, they’re in the building from 8AM to 4PM (or so), so, we’re going to feed them something at mid-day. That really bites into our schedule too.
When can we put all this together? A WIT panel, sponsor talks, new speakers, experienced speakers and lunch…. Hey, hold on. Let’s put it all at lunch. That’s just a gaping hole in the schedule begging to be filled.
And there lies the problem. Putting all this together, all at the same time, hurts something. And, putting it all at lunch, pretty much hurts all of it. It’s hard to get your food and then find your way to a room to eat it in, or, conversely attend the session you want and get your food later, or, try to eat and then go into a session half way through. You can’t do it all. And then, when you think about the audience mix you just created, you’re hurting new speakers because people may skip their session to attend the sponsor session or the WIT panel. The WIT panel is going to suffer because you scheduled an experienced, known, speaker at lunch because you just ran out of room to put them anywhere else. And the sponsors… I’ll be blunt. We want eyeballs. And you just gave them alternatives, and we know they already have alternatives with our competitors doing a session at the same time, but did you have to clean out everyone for the WIT panel too?
In short, organizers, you need to start to pare it down. Don’t try to do it all. You want to support sponsors at lunch? Cool, do that. Schedule the WIT panel to 1/2 hour before the prize drawing (I’ve seen that done, it worked well). You want to have sessions at lunch? Fine. Don’t schedule the sponsors for then. Extend the day and have sponsor sessions before or after lunch. Want to get eyeballs to the local speaker or the big name speaker? Cool, but leave the sponsors out of it. Can’t work out how to fit ALL this in? Then don’t. Don’t even try. Give up on some of it. Pick and choose to make your event yours. But don’t try to cram so much stuff in that you basically make it difficult for the speakers and the community and the sponsors and the attendees.
No, I don’t mean the use of sp_updatestats is not smart. It’s a fine, quick mechanism for getting statistics updated in your system. But the procedure itself is not smart. I keep seeing stuff like “sp_updatestats knows which statistics need to be updated” and similar statements.
Wanna know how I know? It’s tricky. Ready? I looked at the query. It’s there, in full, at the bottom of the article (2014 CTP2 version, just in case yours is slightly different, like, for example, no Hekaton logic). Let’s focus on just this bit:
if ((@ind_rowmodctr <> 0) or ((@is_ver_current is not null) and (@is_ver_current = 0)))
The most interesting part is right at the front, @ind_rowmodctr <> 0. That value is loaded with the cursor and comes from sys.sysindexes and the rowmodctr column there. In short, we can know that the “smart” aspect of sp_updatestats is that it assumes if there are any modifications, then updating statistics is good to go. We can argue for hours over how exactly you determine whether or not statistics are sufficiently out of date to warrant an update, but I’d be willing to bet that the sophisticated answers are not going to include just finding everything that’s been touched.
Now, don’t get me wrong. I’m not implying, suggesting or stating that sp_updatestats shouldn’t be used. It should. It’s fine. Just be very clear about what it does and how it does it.
Just a reminder, I’m putting on an all day seminar on query tuning in Louisville on June 20th, 2014. Seats are going fast, so please sign up early.
USE [master] GO /****** Object: StoredProcedure [sys].[sp_updatestats] Script Date: 3/6/2014 8:09:58 PM ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO ALTER procedure [sys].[sp_updatestats] @resample char(8)='NO' as declare @dbsid varbinary(85) select @dbsid = owner_sid from sys.databases where name = db_name() -- Check the user sysadmin if not is_srvrolemember('sysadmin') = 1 and suser_sid() <> @dbsid begin raiserror(15247,-1,-1) return (1) end -- cannot execute against R/O databases if DATABASEPROPERTYEX(db_name(), 'Updateability')=N'READ_ONLY' begin raiserror(15635,-1,-1,N'sp_updatestats') return (1) end if upper(@resample)<>'RESAMPLE' and upper(@resample)<>'NO' begin raiserror(14138, -1, -1, @resample) return (1) end -- required so it can update stats on ICC/IVs set ansi_warnings on set ansi_padding on set arithabort on set concat_null_yields_null on set numeric_roundabort off declare @exec_stmt nvarchar(4000) -- "UPDATE STATISTICS [sysname].[sysname] [sysname] WITH RESAMPLE NORECOMPUTE" declare @exec_stmt_head nvarchar(4000) -- "UPDATE STATISTICS [sysname].[sysname] " declare @options nvarchar(100) -- "RESAMPLE NORECOMPUTE" declare @index_names cursor declare @ind_name sysname declare @ind_id int declare @ind_rowmodctr int declare @updated_count int declare @skipped_count int declare @sch_id int declare @schema_name sysname declare @table_name sysname declare @table_id int declare @table_type char(2) declare @schema_table_name nvarchar(640) -- assuming sysname is 128 chars, 5x that, so it's > 128*4+4 declare @compatlvl tinyint -- Note that we looked up from sys.objects$ instead sys.objects since some internal tables are not visible in sys.objects declare ms_crs_tnames cursor local fast_forward read_only for select name, object_id, schema_id, type from sys.objects$ o where o.type = 'U' or o.type = 'IT' open ms_crs_tnames fetch next from ms_crs_tnames into @table_name, @table_id, @sch_id, @table_type -- determine compatibility level select @compatlvl = cmptlevel from sys.sysdatabases where name = db_name() while (@@fetch_status <> -1) -- fetch successful begin -- generate fully qualified quoted name select @schema_name = schema_name(@sch_id) select @schema_table_name = quotename(@schema_name, '[') +'.'+ quotename(rtrim(@table_name), '[') -- check for table with disabled clustered index if (1 = isnull((select is_disabled from sys.indexes where object_id = @table_id and index_id = 1), 0)) begin -- raiserror('Table ''%s'': cannot perform the operation on the table because its clustered index is disabled', -1, -1, @tablename) raiserror(15654, -1, -1, @schema_table_name) end else begin -- filter out local temp tables if ((@@fetch_status <> -2) and (substring(@table_name, 1, 1) <> '#')) begin -- reset counters for this table select @updated_count = 0 select @skipped_count = 0 -- print status message --raiserror('Updating %s', -1, -1, @schema_table_name) raiserror(15650, -1, -1, @schema_table_name) -- initial statement preparation: UPDATE STATISTICS [schema].[name] select @exec_stmt_head = 'UPDATE STATISTICS ' + @schema_table_name + ' ' -- using another cursor to iterate through -- indices and stats (user and auto-created) -- Hekaton indexes do not appear in sys.sysindexes so we need to use sys.stats instead -- Hekaton indexes do not support rowmodctr so we just return 1 which will force update stats -- Note that OBJECTPROPERTY returns NULL on type="IT" tables, thus we only call it on type='U' tables if ((@table_type = 'U') and (1 = OBJECTPROPERTY(@table_id, 'TableIsMemoryOptimized'))) -- Hekaton tables begin set @index_names = cursor local fast_forward read_only for select name, stats_id, 1 as rowmodctr from sys.stats where object_id = @table_id and indexproperty(object_id, name, 'ishypothetical') = 0 order by stats_id end else begin set @index_names = cursor local fast_forward read_only for select name, indid, rowmodctr from sys.sysindexes where id = @table_id and indid > 0 and indexproperty(id, name, 'ishypothetical') = 0 and indexproperty(id, name, 'iscolumnstore') = 0 order by indid end open @index_names fetch @index_names into @ind_name, @ind_id, @ind_rowmodctr -- if there are no stats, skip update if @@fetch_status < 0 --raiserror(' %d indexes/statistics have been updated, %d did not require update.', -1, -1, @updated_count, @skipped_count) raiserror(15651, -1, -1, @updated_count, @skipped_count) else begin while @@fetch_status >= 0 begin -- create quoted index name declare @ind_name_quoted nvarchar(258) select @ind_name_quoted = quotename(@ind_name, '[') -- reset options select @options = '' declare @is_ver_current bit select @is_ver_current = stats_ver_current(@table_id, @ind_id) -- note that <> 0 should work against old and new rowmodctr logic (when it is always > 0) -- also, force a refresh if the stats blob version is not current if ((@ind_rowmodctr <> 0) or ((@is_ver_current is not null) and (@is_ver_current = 0))) begin select @exec_stmt = @exec_stmt_head + @ind_name_quoted -- Add FULLSCAN for hekaton tables -- Note that OBJECTPROPERTY returns NULL on type="IT" tables, thus we only call it on type='U' tables if ((@table_type = 'U') and (1 = OBJECTPROPERTY(@table_id, 'TableIsMemoryOptimized'))) -- Hekaton tables select @options = 'FULLSCAN' -- add resample if needed else if (upper(@resample)='RESAMPLE') select @options = 'RESAMPLE ' if (@compatlvl >= 90) -- put norecompute if local properties are set to AUTOSTATS = OFF -- note that ind name is unique within the object if ((select no_recompute from sys.stats where object_id = @table_id and name = @ind_name) = 1) begin if (len(@options) > 0) select @options = @options + ', NORECOMPUTE' else select @options = 'NORECOMPUTE' end if (len(@options) > 0) select @exec_stmt = @exec_stmt + ' WITH ' + @options --print @exec_stmt exec (@exec_stmt) --raiserror(' %s has been updated...', -1, -1, @ind_name_quoted) raiserror(15652, -1, -1, @ind_name_quoted) select @updated_count = @updated_count + 1 end else begin --raiserror(' %s, update is not necessary...', -1, -1, @ind_name_quoted) raiserror(15653, -1, -1, @ind_name_quoted) select @skipped_count = @skipped_count + 1 end fetch @index_names into @ind_name, @ind_id, @ind_rowmodctr end --raiserror(' %d index(es)/statistic(s) have been updated, %d did not require update/disabled.', -1, -1, @updated_count, @skipped_count) raiserror(15651, -1, -1, @updated_count, @skipped_count) end deallocate @index_names end end print ' ' fetch next from ms_crs_tnames into @table_name, @table_id, @sch_id, @table_type end raiserror(15005,-1,-1) deallocate ms_crs_tnames return(0) -- sp_updatestats
This never gets easier. I was able to attend a bunch of sessions in the last month from a number of speakers that I’d never seen before. A lot of them were good, very good. In fact, I’d go so far as to say I think the general level of speakers within the SQL Server community is improving. Which means we’ll all need to up our games. I also saw several that I’ve seen before because I always learn from them. In short, my cup runneth over. Anyway, the person I picked this month, well, I’d never seen him present before. But, I have hung out with him. He’s got this incredible, fast, sharp wit and he’ll protect you from dangerous objects in orange. I’m picking Mark Vaillancourt (b|t) and his session Danger: The Art and Science of Presenting.
I’m so glad I went to this session. If I learned nothing else, it’s that I have a gut because I present. You see, a fight or flight emotion, which takes place in front of people, tends to spike insulin, leading to increased fat storage. If I just stopped presenting, I’d be skinny again… well, skinnier. But, let’s focus on Mark’s session. You see, he wasn’t kidding when he said Science in the title. There was tons and tons of discussion around the science of presenting, mostly focused on how you, as a presenter, are dealing with this situation, methods you can use to control your breathing, your focus and all that sort of stuff. Yes, he talked somewhat about good practices on slides & such, the Art aspects. But the main focus was on the science and it was fascinating. Instead of the approach so many people take towards a session on presentations, “Here’s how you do it,” Mark gave us a lot of detail on how the process of presentation works. You can figure out how on your own from there. I love the breakdown on the topics between self perception (figuring out how you’re doing), self expression (figuring out how to get across what you want to get across), interpersonal (how to deal with your audience), stress management (figure that one out), and decision making (literally, thinking on your feet as you present). Mark chatted up the crowd before the session. He had amazing delivery, which you’d have to in order to make a session like this work, and his wit was on display throughout, although it was very controlled and focused and never overwhelmed the presentation. I really got a lot out of attending the session.
My feedback has two points. Mark really didn’t repeat the questions. It was a very small room, but people can be so quiet and if they’re talking away from you, you may not hear what they say. This one is so hard to do (it’s one of my own goals to improve my presentations), but it’s vital. Second, Mark’s funny. He put bits of humor in the slides with pictures and titles and he cracked jokes as he presented. Again, this in no way hurt the delivery. It enhanced it in every possible way. But, for a lot of his humor, he stopped and quickly explained what the joke meant. I’d say, have less obscure jokes that don’t need explanation, or just let them stand. Those who get them, great. Those who don’t oh well. It’s not going to enhance the joke to the people who get it and the people who don’t won’t laugh at the explanation. But that’s all I’ve got. It really was a great presentation.
Mark posts upcoming presentations on his blog, but I don’t see any listed currently. He’s not listed on Lanyrd (and if someone has a better place to record upcoming events so we can all share them, I’m open. Until you supply me with an alternative, we should be using Lanyrd, and no, I’m not getting cut backs or anything). Anyway, congrats Mark. All the cash, gifts, slaves and other associated paraphernalia traditionally associated with this award are winging their way towards you now.
I spend quite a bit of time writing about query tuning on this blog. I’ve written (re-written and am actively re-writing) books on query tuning. But what I like most is talking about query tuning. I love giving sessions at various events on different aspects of query tuning, but, what I like the most is spending a whole day, trying to do a complete brain dump to get as much information out there as possible. Sound attractive? Then I’ve got a great deal for you. Come to Louisville on June 20th, 2014. We will talk query tuning at length. You have a specific question? Let’s get it answered. Then, the next day, we can all go to SQL Saturday 286 there in Louisville to get more learning and some serious networking. What’s not to like?
I am terribly jazzed to be involved with this amazing event, SQL Intersection. It’s featuring some truly amazing speakers presenting on important topics. It’s being held here on the East Coast, right near the Mouse, the Duck and Dog. This is one of those conferences you need to get to. Check out the lineup. That is some of the smartest, most capable people I know. I’m quite humbled to be on the list with them, so I’ll do my level best to deliver good content. Look at the sessions. While I don’t know precisely when SQL Server 2014 is coming out, I’m sure it’s real soon, so this will be a great place to get a leg-up on understanding what this new set of technology offers, or just learn more about SQL Server in general, Azure, SSRS and SSIS.
Click here now to register for this special event.
Deservedly so, I got called out for a bit of attitude I displayed in a recent blog post: Time for a Quick Rant. Steve Hood took the general attitude of “Do this or I will beat you” to task in his blog post The Approachable DBA.
Granted, my little rant was primarily done tongue wedged immovably in cheek. But I was reflecting an attitude that the gods know I’m guilty of and that I think way too many DBAs are guilty of. Actually, I think developers are just as guilty. And sysadmins, san admins, support desk people, QA, the report writing team, those people supporting the data warehouse certainly, the SharePoint team, and that poor lady who got stuck being the Deployment manager.
That attitude? I don’t think you heard me the first couple of (eight thousand) times I said this, so I’ll say it a little louder… with emphasis… at length… weapon in hand (figuratively, of course).
I have never, ever, put my hands on someone in anger in the workplace. I don’t threaten people in the workplace either (there was that one time, but that guy had it coming). But I am guilty of the extended and repeated rant. While it might actually be cathartic for me, it doesn’t help anyone else at work. More importantly, it doesn’t help you with your leadership position at work.
Leadership? Most of you just held up a sign to ward off evil. “I don’t want to move into management.” No. That’s not what I mean. I mean leadership, not management. Why do we go off on these rants and tears? Speaking for everyone (’cause I can, my blog, my rules), we want to be able to positively influence the decisions made within our company and we’re right (most of the time, or, well, often enough), so when we have to deal with concerns like, “Hey, that restore you ran left off a row” we launch into what is practically a canned recording of “Restores are a bit-by-bit copy of the database at a given moment in time. I can’t possibly have missed a row. How many times do I have to tell you? Look, here’s how it works…” And we’re off.
Where was I? Oh yeah, leadership. It’s not about management. It’s about having influence, setting direction, getting things going the right way. In order to really do this, you can’t just bark at people and figuratively shove them around. You must be approachable. In order for them to hear you, they must listen to you. If all you are is shouts and threats and rants, you will not be able to lead.
I think Steve is right. There has to be a level of approachability that we have in order to establish the trust we need to put ourselves in the place we need to be to make a positive impact. Just don’t ask for access to production again, or I’m getting the hose out.
In case you don’t know, this query:
UPDATE dbo.Test1 SET C2 = 2 WHERE C1 LIKE '%33%';
Will run quite a bit slower than this query:
UPDATE dbo.Test1 SET C2 = 1 WHERE C1 LIKE '333%';
Or this one:
UPDATE dbo.Test1 SET C2 = 1 WHERE C1 = '333';
That’s because the second two queries have arguments in the filter criteria that allow SQL Server to use the statistics in an index to look for specific matching values and then use the balanced tree, B-Tree, of the index to retrieve specific rows. The argument in the first query requires a full scan against the index because there is no way to know what values might match or any path through the index to simply retrieve them.
But, what if we do this:
UPDATE dbo.test1 SET C2 = CASE WHEN C1 LIKE '19%' THEN 3 WHEN C1 LIKE '25%' THEN 2 WHEN C1 LIKE '37%' THEN 1 END;
We’re avoiding that nasty wild card search, right? So the optimizer should just be able to immediately find those values and retrieve them… Whoa! Hold up there pardner. Let’s set up a full test:
IF (SELECT OBJECT_ID('Test1') ) IS NOT NULL DROP TABLE dbo.Test1; GO CREATE TABLE dbo.Test1 (C1 VARCHAR(50),C2 INT, C3 INT IDENTITY); SELECT TOP 1500 IDENTITY( INT,1,1 ) AS n INTO #Nums FROM Master.dbo.SysColumns sC1, Master.dbo.SysColumns sC2; INSERT INTO dbo.Test1 (C1,C2) SELECT n, n FROM #Nums; DROP TABLE #Nums; CREATE CLUSTERED INDEX i1 ON dbo.Test1 (C1) ; UPDATE dbo.test1 SET C2 = CASE WHEN C1 LIKE '%42%' THEN 3 WHEN C1 LIKE '%24%' THEN 2 WHEN C1 LIKE '%36%' THEN 1 END DBCC FREEPROCCACHE() UPDATE dbo.test1 SET C2 = CASE WHEN C1 LIKE '19%' THEN 33 WHEN C1 LIKE '25%' THEN 222 WHEN C1 LIKE '37%' THEN 11 WHEN C1 LIKE '22%' THEN 5 END
I added the extra CASE evaluation in the second query in order to get a different query hash value.
Here are the execution plans from the two queries:
They’re pretty identical. Well, except for me forcing a difference in the hash values, they’re identical except for the details in the Compute Scalar operator. So what’s going on? Shouldn’t that second query use the index to retrieve the values? After all, it avoided that nasty comparison operator, right? Well, yes, but… we introduced a function on the columns. What function you ask? The CASE statement itself.
This means you can’t use a CASE statement in this manner because it does result in bypassing the index and statistics in the same way as using functions against the columns do.