Mar 19 2014

Query Tuning Near You

It really is so much easier to just throw hardware at badly performing databases. Just buy a bigger, faster server with more and faster disks and you can put off doing tuning work for another 6-9 months, easily. But, for most of us, sooner or later, the performance problems get too big, or we just don’t have any more money to spend, and we’re stuck. We have to tune the queries. And frankly, query tuning is a pain in the nether regions.

But, after you’ve tuned queries 20 or 30 times, you start to recognize the patterns and it gets easier (never easy, just not as hard). If you haven’t done it 20 or 30 times, though, what do you do? My suggestion: talk to someone who has done it 30 times (or even a couple of hundred times), like me, for example.

I have an all-day session on tuning queries. It goes from understanding how the optimizer works (which will automatically lead you to write better queries), to how to gather performance metrics (so you know where the pain points are located), to reading execution plans (you need to know what has gone wrong with the query), to various mechanisms for fixing the query. This information is applicable to systems from SQL Server 2005 to SQL Server 2014 (sorry, everyone still on 2000, it’s time to upgrade). The session is based on the books I’ve written about query tuning and execution plans, plus years and years of doing lots of query tuning.

Right now I’ve got two events scheduled. Before SQL Saturday #286 in Louisville, KY, I’ll be putting on this precon. Seating is limited, so don’t wait. You can go here to register. Then we can get together the next day at the SQL Saturday event to get some more education from all the great speakers there. Next, before SQL Saturday #302 in Albany, NY (their first one, ever), I’ll be hosting this. You can register by clicking here. Don’t miss the early bird special. Again, the next day will be filled with learning at the SQL Saturday event.

I’m working on taking this to other locations and venues. If you’re interested, please get in touch. I’ll do what I can to come to you.

If you have a particularly thorny query, bring it along with an actual execution plan. If we have time at the end of the day, I’ll take a look and make suggestions, live (uh, please, no sensitive patient data or anything like that).

Let’s get together and talk query tuning.

Mar 18 2014

Finding Mistakes

Ever had that moment where you start getting errors from code that you’ve tested a million times? I had that one recently. I had this little bit of code for pulling information directly from query plans in cache:

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
QueryPlans AS
(
SELECT RelOp.pln.value(N'@PhysicalOp', N'varchar(50)') AS OperatorName,
RelOp.pln.value(N'@NodeId',N'integer') AS NodeId,
RelOp.pln.value(N'@EstimateCPU', N'decimal(10,9)') AS CPUCost,
RelOp.pln.value(N'@EstimateIO', N'decimal(10,9)') AS IOCost,
dest.text
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
CROSS APPLY deqp.query_plan.nodes(N'//RelOp') RelOp (pln)
)

SELECT  qp.OperatorName,
        qp.NodeId,
        qp.CPUCost,
        qp.IOCost,
        qp.CPUCost + qp.IOCost AS EstimatedCost
FROM    QueryPlans AS qp
WHERE   qp.text = 'some query or other in cache'
ORDER BY EstimatedCost DESC;

I’ve probably run this… I don’t know how many times. But… I’m suddenly getting an error:

Msg 8114, Level 16, State 5, Line 7
Error converting data type nvarchar to numeric.

What the hell? There is nowhere this should be occurring. I dig through the query over and over and I can’t figure it out. Until… I finally notice that one of the operators in the plan has the CPUCost value stored as a float:

EstimateCPU="1e-006"

Ummmm, since when? Since forever. I’ve just been lucky with my code. I’d just never hit a sufficiently small cost in the plans before. I hadn’t bothered to look for the actual data type in use in the schema definition, although it’s right there:

<xsd:attribute name="EstimateCPU" type="xsd:double" use="required"/>
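
The failure is easy to reproduce in isolation; a quick sketch using the exact value from the plan:

-- fails with the same Msg 8114: scientific notation strings won't convert to decimal
SELECT CAST(N'1e-006' AS DECIMAL(10,9));

-- works: float is the type the schema actually declares
SELECT CAST(N'1e-006' AS FLOAT);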

 

[Image: Samuel L. Jackson in The Long Kiss Goodnight]

I never did one thing right in my life, you know that? Not one. That takes skill.

What did I do wrong? I was looking at the data output from the queries and in the plans rather than looking at the structure to know what to expect. It’s the kind of thing I would never do with T-SQL. I would always look to the table structure to know what data type a given column was. But in this case, with the XML, I looked at the data and made an assumption. And we all know what that means. It makes an ass out of you and mption.

Or, in this case, me and mption.

Anyway, the corrected query is a pretty trivial change:

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
QueryPlans AS
(
SELECT RelOp.pln.value(N'@PhysicalOp', N'varchar(50)') AS OperatorName,
RelOp.pln.value(N'@NodeId',N'integer') AS NodeId,
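-- float matches the xsd:double declared in the showplan schema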
RelOp.pln.value(N'@EstimateCPU', N'float') AS CPUCost,
RelOp.pln.value(N'@EstimateIO', N'float') AS IOCost,
dest.text
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
CROSS APPLY deqp.query_plan.nodes(N'//RelOp') RelOp (pln)
)

SELECT  qp.OperatorName,
        qp.NodeId,
        qp.CPUCost,
        qp.IOCost,
        qp.CPUCost + qp.IOCost AS EstimatedCost
FROM    QueryPlans AS qp
WHERE   qp.text = 'SELECT * FROM HumanResources.vEmployee AS ve'
ORDER BY EstimatedCost DESC;

But I do feel bad if anyone has been using this the way I showed it. ’Cause, yeah, I’ve demonstrated with this code in the past. Oops. Still, it makes a great point: especially when working with a public XML schema like this, it pays to go and look at that schema the same way you would a table, in order to ensure that you’re using the correct data types.

Mar 17 2014

SQL Saturday: You can’t have it all.

SQL Saturdays are awesome! Let’s get that clear up front. The organizers of SQL Saturday events are glorious individuals. Let’s get that clear too.

I want to be up front about those things because, well, I’m going to be critical.

First though, I want to establish my bona fides for what I’m about to say. I helped organize two SQL Saturday events and two other local events before those. I also help Red Gate Software run half-day seminars all over the country. So, I have some idea what goes into the organizational side of these things. I’ve presented at eleven SQL Saturday events in just the last year. I’m on the schedule for, I think, six more between now and August. So, I think I have some idea what it’s like to be a speaker at these events. And, I work for a vendor who puts money and swag up at the events in order to get some advertising, which gives me some idea of what makes the vendors happy too. Further, I’m one of the community. I attend the sessions, talk to the sponsors, take part in the after events, the whole magilla. I don’t think any of this makes me an expert or makes my voice more important than anyone else’s, but it all comes together to show that I’m not utterly clueless in my opinions (and yes, I know the adage: opinions are like certain body parts, everyone has one, and they all stink).

Organizers, I’ve seen this issue a lot and it’s just getting worse. This issue is going to hurt you with, in no particular order; speakers, sponsors and the community. What are you doing? You’re trying to have it all.

You want sponsors, right? The sponsors frequently ask for one thing… please, please, please, let us do a presentation so we can show off how wicked awesome our products are to the most motivated people in the region (yeah, the people who are giving up a Saturday to learn technology for their jobs are the best people in the area where that SQL Saturday is taking place). So, you agree to let the sponsors have a talk… ooh, but when to schedule it?

You want BIG NAME speakers, right? Although I’m absolutely convinced that big name speakers don’t really draw people to your event. Good sessions, usually defined by good session titles, draw people to your event (and good communication on your part through various venues and… well, that’s a different discussion I’ll leave for Karla). But, the belief is there, so people try to get Brent Kline and Kendal Ford and Jes Misner to come speak at their event. BUT, you also want to meet the needs of the local community so you can grow new speakers, so you’re going to take in a bunch of new people too… ah, but how do you schedule that?

And you’re committed to your community too, right? And one of the best ways to show your commitment to your community is to host a panel at lunch. The most common panel is Women in Technology, but I’ve also seen or heard about panels on educating young people, charities, user groups, and all sorts of things. Great stuff really and a big part of why the SQL Family is so wonderful. We really do try to help each other out. We really do care, and those panels give people a chance to communicate what they’ve done to others who may want to contribute in the same way. Ah… but when can we schedule this panel?

By now, I’ll bet many of you know what I’m about to say. But, before I say it, let me point out one more thing. SQL Saturdays are all-day affairs. If anyone attends the entire thing, they’re in the building from 8AM to 4PM (or so), so we’re going to feed them something at midday. That really bites into the schedule too.

When can we put all this together? A WIT panel, sponsor talks, new speakers, experienced speakers, and lunch… Hey, hold on. Let’s put it all at lunch. That’s just a gaping hole in the schedule begging to be filled.

And there lies the problem. Putting all this together, all at the same time, hurts something. And putting it all at lunch pretty much hurts all of it. It’s hard to get your food and then find your way to a room to eat it in, or, conversely, attend the session you want and get your food later, or try to eat and then walk into a session halfway through. You can’t do it all. And then think about the audience mix you just created. You’re hurting new speakers, because people may skip their session to attend the sponsor session or the WIT panel. The WIT panel is going to suffer because you scheduled an experienced, known speaker at lunch, since you ran out of room to put them anywhere else. And the sponsors… I’ll be blunt. We want eyeballs. And you just gave them alternatives. We know they already have alternatives, with our competitors doing a session at the same time, but did you have to clean out everyone for the WIT panel too?

In short, organizers, you need to start to pare it down. Don’t try to do it all. You want to support sponsors at lunch? Cool, do that. Schedule the WIT panel for the half hour before the prize drawing (I’ve seen that done; it worked well). You want to have sessions at lunch? Fine. Don’t schedule the sponsors for then. Extend the day and have sponsor sessions before or after lunch. Want to get eyeballs to the local speaker or the big-name speaker? Cool, but leave the sponsors out of it. Can’t work out how to fit ALL this in? Then don’t. Don’t even try. Give up on some of it. Pick and choose to make your event yours. But don’t try to cram in so much stuff that you make it difficult for the speakers and the community and the sponsors and the attendees.

Mar 11 2014

sp_updatestats Is Not Smart

No, I don’t mean the use of sp_updatestats is not smart. It’s a fine, quick mechanism for getting statistics updated in your system. But the procedure itself is not smart. I keep seeing stuff like “sp_updatestats knows which statistics need to be updated” and similar statements.

Nope.

Not true.

Wanna know how I know? It’s tricky. Ready? I looked at the query. It’s there, in full, at the bottom of the article (2014 CTP2 version, just in case yours is slightly different, like, for example, no Hekaton logic). Let’s focus on just this bit:

if ((@ind_rowmodctr <> 0) or ((@is_ver_current is not null) and (@is_ver_current = 0)))

The most interesting part is right at the front, @ind_rowmodctr <> 0. That value is loaded by the cursor and comes from the rowmodctr column in sys.sysindexes. In short, the “smart” aspect of sp_updatestats is that it assumes that if there have been any modifications at all, then updating the statistics is good to go. We can argue for hours over how exactly you determine whether or not statistics are sufficiently out of date to warrant an update, but I’d be willing to bet that the sophisticated answers are not going to include just finding everything that’s been touched.
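
If you want to eyeball that counter yourself, you can query it directly; a quick sketch (the table name here is just a placeholder):

-- rowmodctr is the value sp_updatestats compares to 0
SELECT  name,
        indid,
        rowmodctr
FROM    sys.sysindexes
WHERE   id = OBJECT_ID('dbo.MyTable') -- hypothetical table name
        AND indid > 0;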

Now, don’t get me wrong. I’m not implying, suggesting or stating that sp_updatestats shouldn’t be used. It should. It’s fine. Just be very clear about what it does and how it does it.
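
For reference, calling it is trivial. The only parameter, visible in the source below, is @resample:

-- default sampling for each statistics update
EXEC sys.sp_updatestats;

-- or reuse each statistic's most recent sample rate
EXEC sys.sp_updatestats @resample = 'RESAMPLE';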

 

Just a reminder: I’m putting on an all-day seminar on query tuning in Louisville on June 20th, 2014. Seats are going fast, so please sign up early.

sp_updatestats:

USE [master]
GO
/****** Object:  StoredProcedure [sys].[sp_updatestats]    Script Date: 3/6/2014 8:09:58 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER procedure [sys].[sp_updatestats]
	@resample char(8)='NO'
as

	declare @dbsid varbinary(85)

	select @dbsid = owner_sid
		from sys.databases
		where name = db_name()

	-- Check the user sysadmin
	if not is_srvrolemember('sysadmin') = 1 and suser_sid() <> @dbsid
	begin
		raiserror(15247,-1,-1)
		return (1)
	end
	-- cannot execute against R/O databases  
	if DATABASEPROPERTYEX(db_name(), 'Updateability')=N'READ_ONLY'
	begin
		raiserror(15635,-1,-1,N'sp_updatestats')
		return (1)
	end

	if upper(@resample)<>'RESAMPLE' and upper(@resample)<>'NO'
	begin
		raiserror(14138, -1, -1, @resample)
		return (1)
	end

	-- required so it can update stats on ICC/IVs
	set ansi_warnings on
	set ansi_padding on
	set arithabort on
	set concat_null_yields_null on
	set numeric_roundabort off

	declare @exec_stmt nvarchar(4000)		-- "UPDATE STATISTICS [sysname].[sysname] [sysname] WITH RESAMPLE NORECOMPUTE"
	declare @exec_stmt_head nvarchar(4000)	-- "UPDATE STATISTICS [sysname].[sysname] "
	declare @options nvarchar(100)			-- "RESAMPLE NORECOMPUTE"

	declare @index_names cursor

	declare @ind_name sysname
	declare @ind_id int
	declare @ind_rowmodctr int
	declare @updated_count int
	declare @skipped_count int

	declare @sch_id int
	declare @schema_name sysname
	declare @table_name sysname
	declare @table_id int
	declare @table_type char(2)
	declare @schema_table_name nvarchar(640) -- assuming sysname is 128 chars, 5x that, so it's > 128*4+4

	declare @compatlvl tinyint

	-- Note that we looked up from sys.objects$ instead sys.objects since some internal tables are not visible in sys.objects
	declare ms_crs_tnames cursor local fast_forward read_only for
		select name, object_id, schema_id, type from sys.objects$ o
		where o.type = 'U' or o.type = 'IT'
	open ms_crs_tnames
	fetch next from ms_crs_tnames into @table_name, @table_id, @sch_id, @table_type

	-- determine compatibility level
	select @compatlvl = cmptlevel from sys.sysdatabases where name = db_name()

	while (@@fetch_status <> -1) -- fetch successful
	begin
		-- generate fully qualified quoted name
		select @schema_name = schema_name(@sch_id)
		select @schema_table_name = quotename(@schema_name, '[') +'.'+ quotename(rtrim(@table_name), '[')

		-- check for table with disabled clustered index
		if (1 = isnull((select is_disabled from sys.indexes where object_id = @table_id and index_id = 1), 0))
		begin
			-- raiserror('Table ''%s'': cannot perform the operation on the table because its clustered index is disabled', -1, -1, @tablename)
			raiserror(15654, -1, -1, @schema_table_name)
		end
		else
		begin
			-- filter out local temp tables
			if ((@@fetch_status <> -2) and (substring(@table_name, 1, 1) <> '#'))
			begin
				-- reset counters for this table
				select @updated_count = 0
				select @skipped_count = 0

				-- print status message
				--raiserror('Updating %s', -1, -1, @schema_table_name)
				raiserror(15650, -1, -1, @schema_table_name)

				-- initial statement preparation: UPDATE STATISTICS [schema].[name]
				select @exec_stmt_head = 'UPDATE STATISTICS ' + @schema_table_name + ' '

				-- using another cursor to iterate through
				-- indices and stats (user and auto-created)
				-- Hekaton indexes do not appear in sys.sysindexes so we need to use sys.stats instead
				-- Hekaton indexes do not support rowmodctr so we just return 1 which will force update stats
				-- Note that OBJECTPROPERTY returns NULL on type="IT" tables, thus we only call it on type='U' tables
				if ((@table_type = 'U') and (1 = OBJECTPROPERTY(@table_id, 'TableIsMemoryOptimized')))	-- Hekaton tables
				begin
					set @index_names = cursor local fast_forward read_only for
						select name, stats_id, 1 as rowmodctr
						from sys.stats
						where object_id = @table_id and indexproperty(object_id, name, 'ishypothetical') = 0 
						order by stats_id
				end
				else
				begin
					set @index_names = cursor local fast_forward read_only for
						select name, indid, rowmodctr from sys.sysindexes
						where id = @table_id and indid > 0 and indexproperty(id, name, 'ishypothetical') = 0 
						and indexproperty(id, name, 'iscolumnstore') = 0
						order by indid
				end

				open @index_names
				fetch @index_names into @ind_name, @ind_id, @ind_rowmodctr

				-- if there are no stats, skip update
				if @@fetch_status < 0
					--raiserror('    %d indexes/statistics have been updated, %d did not require update.', -1, -1, @updated_count, @skipped_count)
					raiserror(15651, -1, -1, @updated_count, @skipped_count)
				else 
				begin
					while @@fetch_status >= 0
					begin
						-- create quoted index name
						declare @ind_name_quoted nvarchar(258)
						select @ind_name_quoted = quotename(@ind_name, '[')

						-- reset options
						select @options = ''

						declare @is_ver_current bit
						select @is_ver_current = stats_ver_current(@table_id, @ind_id)

						-- note that <> 0 should work against old and new rowmodctr logic (when it is always > 0)
						-- also, force a refresh if the stats blob version is not current
						if ((@ind_rowmodctr <> 0) or ((@is_ver_current is not null) and (@is_ver_current = 0)))
						begin
							select @exec_stmt = @exec_stmt_head + @ind_name_quoted

							-- Add FULLSCAN for hekaton tables
							-- Note that OBJECTPROPERTY returns NULL on type="IT" tables, thus we only call it on type='U' tables
							if ((@table_type = 'U') and (1 = OBJECTPROPERTY(@table_id, 'TableIsMemoryOptimized')))	-- Hekaton tables
								select @options = 'FULLSCAN'

							-- add resample if needed
							else if (upper(@resample)='RESAMPLE')
								select @options = 'RESAMPLE '

							if (@compatlvl >= 90)
								-- put norecompute if local properties are set to AUTOSTATS = OFF
								-- note that ind name is unique within the object
								if ((select no_recompute from sys.stats where object_id = @table_id and name = @ind_name) = 1)
								begin
									if (len(@options) > 0) select @options = @options + ', NORECOMPUTE'
									else select @options = 'NORECOMPUTE'
								end

							if (len(@options) > 0)
								select @exec_stmt = @exec_stmt + ' WITH ' + @options

							--print @exec_stmt
							exec (@exec_stmt)
							--raiserror('    %s has been updated...', -1, -1, @ind_name_quoted)
							raiserror(15652, -1, -1, @ind_name_quoted)
							select @updated_count = @updated_count + 1
						end
						else
						begin
							--raiserror('    %s, update is not necessary...', -1, -1, @ind_name_quoted)
							raiserror(15653, -1, -1, @ind_name_quoted)
							select @skipped_count = @skipped_count + 1
						end
						fetch @index_names into @ind_name, @ind_id, @ind_rowmodctr
					end
					--raiserror('    %d index(es)/statistic(s) have been updated, %d did not require update/disabled.', -1, -1, @updated_count, @skipped_count)
					raiserror(15651, -1, -1, @updated_count, @skipped_count)
				end
				deallocate @index_names
			end
		end
		print ' '
		fetch next from ms_crs_tnames into @table_name, @table_id, @sch_id, @table_type
	end
	raiserror(15005,-1,-1)
	deallocate ms_crs_tnames
	return(0) -- sp_updatestats

Mar 07 2014

Speaker of the Month, March 2014

This never gets easier. I was able to attend a bunch of sessions in the last month from a number of speakers that I’d never seen before. A lot of them were good, very good. In fact, I’d go so far as to say I think the general level of speakers within the SQL Server community is improving. Which means we’ll all need to up our games. I also saw several that I’ve seen before because I always learn from them. In short, my cup runneth over. Anyway, the person I picked this month, well, I’d never seen him present before. But, I have hung out with him. He’s got this incredible, fast, sharp wit and he’ll protect you from dangerous objects in orange. I’m picking Mark Vaillancourt (b|t) and his session Danger: The Art and Science of Presenting.

I’m so glad I went to this session. If I learned nothing else, it’s that I have a gut because I present. You see, a fight-or-flight response, which kicks in when you’re in front of people, tends to spike insulin, leading to increased fat storage. If I just stopped presenting, I’d be skinny again… well, skinnier. But, let’s focus on Mark’s session. You see, he wasn’t kidding when he said Science in the title. There was tons and tons of discussion around the science of presenting, mostly focused on how you, as a presenter, deal with the situation: methods you can use to control your breathing, your focus, and all that sort of stuff. Yes, he talked somewhat about good practices for slides and such, the Art aspects. But the main focus was on the science, and it was fascinating. Instead of the approach so many people take toward a session on presentations, “Here’s how you do it,” Mark gave us a lot of detail on how the process of presenting works; you can figure out the how on your own from there. I love the breakdown of the topics: self-perception (figuring out how you’re doing), self-expression (figuring out how to get across what you want to get across), interpersonal skills (how to deal with your audience), stress management (figure that one out), and decision making (literally, thinking on your feet as you present). Mark chatted up the crowd before the session. He had amazing delivery, which you’d have to have in order to make a session like this work, and his wit was on display throughout, although it was very controlled and focused and never overwhelmed the presentation. I really got a lot out of attending this session.

My feedback has two points. First, Mark really didn’t repeat the questions. It was a very small room, but people can be so quiet, and if they’re talking away from you, you may not hear what they say. This one is so hard to do (it’s one of my own goals for improving my presentations), but it’s vital. Second, Mark’s funny. He put bits of humor in the slides with pictures and titles and he cracked jokes as he presented. Again, this in no way hurt the delivery. It enhanced it in every possible way. But, for a lot of his humor, he stopped and quickly explained what the joke meant. I’d say, use less obscure jokes that don’t need explanation, or just let them stand. Those who get them, great. Those who don’t, oh well. The explanation doesn’t enhance the joke for the people who got it, and the people who didn’t won’t laugh at the explanation. But that’s all I’ve got. It really was a great presentation.

Mark posts upcoming presentations on his blog, but I don’t see any listed currently. He’s not listed on Lanyrd either (and if someone has a better place to record upcoming events so we can all share them, I’m open; until you supply me with an alternative, we should be using Lanyrd, and no, I’m not getting kickbacks or anything). Anyway, congrats Mark. All the cash, gifts, slaves and other associated paraphernalia traditionally associated with this award are winging their way towards you now.

Feb 27 2014

Let’s Talk Query Tuning

I spend quite a bit of time writing about query tuning on this blog. I’ve written (re-written and am actively re-writing) books on query tuning. But what I like most is talking about query tuning. I love giving sessions at various events on different aspects of query tuning, but, what I like the most is spending a whole day, trying to do a complete brain dump to get as much information out there as possible. Sound attractive? Then I’ve got a great deal for you. Come to Louisville on June 20th, 2014. We will talk query tuning at length. You have a specific question? Let’s get it answered. Then, the next day, we can all go to SQL Saturday 286 there in Louisville to get more learning and some serious networking. What’s not to like?

Feb 26 2014

SQL Intersection, Spring 2014

I am terribly jazzed to be involved with this amazing event, SQL Intersection. It features some truly amazing speakers presenting on important topics. It’s being held here on the East Coast, right near the Mouse, the Duck, and the Dog. This is one of those conferences you need to get to. Check out the lineup. That is some of the smartest, most capable people I know. I’m quite humbled to be on the list with them, so I’ll do my level best to deliver good content. Look at the sessions. While I don’t know precisely when SQL Server 2014 is coming out, I’m sure it’s real soon, so this will be a great place to get a leg up on understanding what this new set of technology offers, or to just learn more about SQL Server in general, Azure, SSRS, and SSIS.

Click here now to register for this special event.

Feb 25 2014

Approachable? Sometimes.

Deservedly so, I got called out for a bit of attitude I displayed in a recent blog post: Time for a Quick Rant. Steve Hood took the general attitude of “Do this or I will beat you” to task in his blog post The Approachable DBA.

Granted, my little rant was primarily done with tongue wedged immovably in cheek. But I was reflecting an attitude that, the gods know, I’m guilty of, and that I think way too many DBAs are guilty of. Actually, I think developers are just as guilty. And sysadmins, SAN admins, support desk people, QA, the report writing team, those people supporting the data warehouse certainly, the SharePoint team, and that poor lady who got stuck being the deployment manager.

That attitude? I don’t think you heard me the first couple of (eight thousand) times I said this, so I’ll say it a little louder… with emphasis… at length… weapon in hand (figuratively, of course).

I have never, ever, put my hands on someone in anger in the workplace. I don’t threaten people in the workplace either (there was that one time, but that guy had it coming). But I am guilty of the extended and repeated rant. While it might actually be cathartic for me, it doesn’t help anyone else at work. More importantly, it doesn’t help you with your leadership position at work.

Leadership? Most of you just held up a sign to ward off evil. “I don’t want to move into management.” No. That’s not what I mean. I mean leadership, not management. Why do we go off on these rants and tears? Speaking for everyone (’cause I can, my blog, my rules), we want to be able to positively influence the decisions made within our company and we’re right (most of the time, or, well, often enough), so when we have to deal with concerns like, “Hey, that restore you ran left off a row” we launch into what is practically a canned recording of “Restores are a bit-by-bit copy of the database at a given moment in time. I can’t possibly have missed a row. How many times do I have to tell you? Look, here’s how it works…” And we’re off.

Where was I? Oh yeah, leadership. It’s not about management. It’s about having influence, setting direction, getting things going the right way. In order to really do this, you can’t just bark at people and figuratively shove them around. You must be approachable. In order for them to hear you, they must listen to you. If all you are is shouts and threats and rants, you will not be able to lead.

I think Steve is right. There has to be a level of approachability that we have in order to establish the trust we need to put ourselves in the place we need to be to make a positive impact. Just don’t ask for access to production again, or I’m getting the hose out.

Feb 18 2014

The CASE Statement and Performance

In case you don’t know, this query:

UPDATE dbo.Test1
SET C2 = 2
WHERE C1 LIKE '%33%';

Will run quite a bit slower than this query:

UPDATE dbo.Test1
SET C2 = 1
WHERE C1 LIKE '333%';

Or this one:

UPDATE dbo.Test1
SET C2 = 1
WHERE C1 = '333';

That’s because the second two queries have arguments in the filter criteria that allow SQL Server to use the statistics in an index to look for specific matching values, and then use the balanced tree, the B-tree, of the index to retrieve specific rows. The argument in the first query requires a full scan against the index, because there is no way to know which values might match, and no path through the index to simply retrieve them.

But, what if we do this:

UPDATE dbo.test1
SET C2 =
CASE
WHEN C1 LIKE '19%' THEN 3
WHEN C1 LIKE '25%' THEN 2
WHEN C1 LIKE '37%' THEN 1
END;

We’re avoiding that nasty wildcard search, right? So the optimizer should just be able to immediately find those values and retrieve them… Whoa! Hold up there, pardner. Let’s set up a full test:

IF OBJECT_ID('dbo.Test1') IS NOT NULL
    DROP TABLE dbo.Test1;
GO
CREATE TABLE dbo.Test1 (C1 VARCHAR(50), C2 INT, C3 INT IDENTITY);

-- 1,500 sequential numbers to use as test data
SELECT TOP 1500
        IDENTITY( INT, 1, 1 ) AS n
INTO    #Nums
FROM    master.dbo.syscolumns AS sC1,
        master.dbo.syscolumns AS sC2;

-- load the numbers as both strings (C1) and integers (C2)
INSERT  INTO dbo.Test1
        (C1, C2)
        SELECT  n, n
        FROM    #Nums;

DROP TABLE #Nums;

-- the clustered index gives the optimizer statistics and a B-tree on C1
CREATE CLUSTERED INDEX i1 ON dbo.Test1 (C1);

-- version 1: leading wildcards in every CASE branch
UPDATE dbo.Test1
SET C2 =
CASE
WHEN C1 LIKE '%42%' THEN 3
WHEN C1 LIKE '%24%' THEN 2
WHEN C1 LIKE '%36%' THEN 1
END;

-- clear the cache so the second query compiles fresh
DBCC FREEPROCCACHE();

-- version 2: prefix matches only, no leading wildcards
UPDATE dbo.Test1
SET C2 =
CASE
WHEN C1 LIKE '19%' THEN 33
WHEN C1 LIKE '25%' THEN 222
WHEN C1 LIKE '37%' THEN 11
WHEN C1 LIKE '22%' THEN 5
END;

I added the extra CASE evaluation in the second query in order to get a different query hash value.

Here are the execution plans from the two queries:

[Image: the two execution plans, side by side]

They’re pretty much identical. Apart from the difference I forced in the hash values, the only variation is in the details of the Compute Scalar operator. So what’s going on? Shouldn’t that second query use the index to retrieve the values? After all, it avoided that nasty wildcard comparison, right? Well, yes, but… we introduced a function on the column. What function, you ask? The CASE statement itself.

This means you can’t use a CASE statement in this manner, because it results in bypassing the index and the statistics in the same way that using functions against the columns does.
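
If you actually need those prefix matches to use the index, one possible rewrite (a sketch, not the only approach) is to break the CASE apart into separate, sargable updates:

-- each UPDATE has a plain prefix LIKE in the WHERE clause,
-- so the optimizer can seek the clustered index on C1
UPDATE dbo.Test1 SET C2 = 33  WHERE C1 LIKE '19%';
UPDATE dbo.Test1 SET C2 = 222 WHERE C1 LIKE '25%';
UPDATE dbo.Test1 SET C2 = 11  WHERE C1 LIKE '37%';
UPDATE dbo.Test1 SET C2 = 5   WHERE C1 LIKE '22%';

Note the behavioral difference: the CASE version sets C2 to NULL on every non-matching row, while the separate updates leave those rows alone, so this only applies when that’s what you actually want.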

Feb 12 2014

SQL Server 2014 and the New Cardinality Estimator

Cardinality, basically the number of rows the optimizer estimates an operation will process, is a calculation predicated on the statistics available for the columns in question. The statistics used are generally either the values from the histogram or the density. Prior to SQL Server 2014, and going all the way back to SQL Server 7.0 (in the Dark Ages, when we had to walk uphill to our cubicles through 15 feet of snow, battling Oracle DBAs and Fenris the whole way), there’s been one cardinality estimator (although you can modify the behavior somewhat with a traceflag in 2008R2 and 2012). Not any more. For really complex, edge-case queries, there’s a possibility that you may run into a regression from this.
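
The new model comes along with the database’s compatibility level; a sketch of flipping it back and forth (the database name is a stand-in for yours):

-- 120 turns on the new cardinality estimator for this database
ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL = 120;

-- 110 (or lower) reverts to the old model
ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL = 110;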

That’s the whole control: set the Compatibility Level of the database to SQL Server 2014 (120 for the picky amongst us) and you get the new estimator, which is exactly how regressions can sneak in. So, you’re going to pretty quickly want to know if your execution plan is using the new Cardinality Estimation Model, right? It’s tricky. Just look at the properties of the first operator in the plan (I told you to use that first operator). You’ll find one value there that will tell you what you need to know:

[Image: the properties of the plan’s first operator, showing the cardinality estimation model version]

Just check this value (which you can also get from the XML behind the graphical plan) to see what calculations the optimizer used to arrive at the plan you’re observing.
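
You can also pull the value straight out of the plans in cache; a sketch (verify the attribute name against the showplan schema for your build):

WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT  dest.text,
        -- 70 = the old model, 120 = the new one
        deqp.query_plan.value(N'(//StmtSimple/@CardinalityEstimationModelVersion)[1]',
                              N'int') AS CEModelVersion
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp;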