Jan 26 2015

My First Board Meeting

Parents, you know that feeling you get after you’ve brought home that brand new baby (or in my case, babies) where you suddenly say to yourself, “Oh damn. What have I done?” Yeah, that’s my feeling after my first board meeting. But let me give you a run-through of how it was. Note, I can’t pre-publish any board decisions. I’m also not giving you a blow-by-blow account of the discussions (especially since there were no blows, at all). This is a story of the types of things we did, how it went, how I feel about it. This is my blog, my blog post, my thoughts and feelings. This does not represent an official PASS communication. We good?

While the board meeting itself is, for me, only a two-day affair, we tacked a day on at the beginning to take advantage of an offer from Microsoft to tour one of their data centers. Let’s put the entire day into one word: Nerdgasm.

On to the work of the board.

The first half of day one of the meeting was dedicated to going through an Insights communication styles seminar. I found this extremely useful, and I think it’s going to be even more useful over time. Understanding how my fellow board members and the HQ people communicate will help facilitate that communication and will help PASS. I’m glad the board chose to do this before I was involved. Communication can just be hard. Any kind of tools or practices that make it easier are very important.

We had a nice discussion around some points of governance that we’re going to take out to the community for feedback. Once we get a feel for how people think, we can make more informed decisions. We got an overview of global growth and its current status. It’s important to keep this in mind since it’s no longer a portfolio, but is folded into all the others. There was a long and useful discussion around SQL Rally. We got a review of the speaker selection process from the 2014 Summit. I feel like I have a better understanding of what’s going on, and I was able to contribute in areas where I had some knowledge. It was a good day.

The second day was primarily focused on discussions about the Business Analytics Conference. I can’t discuss much about this because there’s a whole bunch of stuff that’s going to have to be context driven and should absolutely come out of the mouths of the executives, not me. Suffice it to say, the discussion and information were extremely interesting, very helpful to me personally, and I really appreciate the incredible amount of work that has gone into this event. I will suggest that if you’re interested in PASS and its direction as an entity, you should be paying attention to this. It’s important.

Finally we got to the goals. Interestingly enough, I had misunderstood the purpose of the goals. I thought I was going in there to get approval, and man, I was loaded for bear. Come to find out, nah, it’s a chance for feedback from the board proper and then, “Have fun storming the castle.” It’s up to me to pick the goals I want to achieve for the Chapters and then work through the processes we have in place to get them accomplished. Well heck, that’s easy. Now, my plan was to share my goals right here and now, but I got such good feedback on them that I want to rework them a little, and then I’ll publish them here, there, everywhere. I’m honestly feeling a little better about accomplishing something useful for the Chapters than I was heading into the board meeting.

That’s about all I have. The minutes will be posted… on whatever schedule they get posted on (not sure), and I may do another blog post after the decisions and details get released. One point that I took away from everything, and it’s something that’s hard for me as a technologist: decisions get made and stuff gets set in motion. Sometimes it might be stuff that you certainly would have done differently, but you weren’t part of the team. Regardless, you need to look at it, understand it, and then figure out how best to add your own contributions in order to make the team successful, even if, given your druthers, you’d have done it completely differently. And no, that’s not some veiled reference to some fight or huge disagreement. It’s an overall feeling, and it relates directly back to the very first thing I said. This whole board thing is a bigger, tougher, more fascinating challenge than I realized going in. I’m actually a little more excited about it now because of that.

Jan 19 2015

Understand the True Source of Problems

There’s an old joke that goes, “Doctor, doctor, it hurts when I do this,” while the person in question swings their arm over their head. The doctor’s response is, “Don’t do that.” Problem solved, right? Well, maybe not. Let’s take a quick example from life. I do CrossFit (yeah, I’m one of those; pull up a chair and I’ll tell you all about my clean & jerk progress… kidding). I’ve been experiencing pain in my shoulder. “It hurts when I do this.” But I’m not going to stop. I’ve been working with my coach to identify where the pain is and what stretches and warm-ups I can do to get around it (assuming it’s not a real injury, and it isn’t). In short, we’re identifying the root cause and addressing the issue rather than just settling for a “don’t do that” style solution.

As is usual with one of my rambling intros, I’m going to tie this back to our work as data professionals. Hang on.

I recently ran into a set of requirements that a new DBA was told were “best practices” he had to comply with. I’m editing them in order to save anyone embarrassment (although, if you wrote these, you should be embarrassed). I think that each one represents a moment of “it hurts” followed by “don’t do that” which, as you’re going to see, is absolutely the wrong response. As crazy as they all are, this isn’t the first time I’ve seen them. This information is clearly coming from some fever-filled internet swamp where stupid mutates and grows. Let’s nip these in the bud.

1 – All queries against the production database must be executed by using RDP to connect to the production system. This is because connecting through SSMS will cause the server to crash.

I couldn’t even begin to guess what happened that caused a system outage because of a query from SSMS, but I suspect it was some sort of crazy, extreme outlier of a query that probably didn’t run much better while RDPed into the server, but somehow avoided a network issue (or six). Who knows. Suffice it to say, no. This is crazy. On multiple levels. The most important being that you’re giving people access to the operating system on production who really shouldn’t have it. And you can only have one person connecting to production at any one time. Two teams ready for a deployment? Tough. Oh, and your production system, in addition to all the work it’s doing for SQL Server, now has to handle all the work of SSMS and displaying the results of your queries. Was your server under stress when you started? It’s worse now. I really don’t know of anyone, including a very large number of remote DBAs, who doesn’t connect to SQL Server through SSMS running on their desktop.

2 – NOLOCK is required on every query. It makes things run faster.

The magic Turbo Button of SQL Server rears its ugly head, yet again. I suspect that if you went and looked at all the responses to questions on forums, blog posts, articles, and presentations, there are more places explaining exactly why this is bad than almost any topic except backups. One more time: using NOLOCK on all your queries (and if you’re putting it everywhere anyway, why not just use READ UNCOMMITTED on your connections? It’s easier) results in dirty reads. Yes, everyone says, “Oh, that means if someone is changing ‘dog’ to ‘cat’ I could see either ‘dog’ or ‘cat’. I don’t care.” Right. That’s true. It also means that when pages split and rearrange, you may miss rows or have rows duplicated. All the businesses I know love it when data is missing from the queries they’re running. I’m sure you wouldn’t mind if your bank used NOLOCK and told you that you didn’t have any money in it, right?
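Just to make that equivalence concrete, here’s a quick sketch against AdventureWorks. The table hint on every reference and the connection-level setting buy you exactly the same dirty reads:

-- Hinting every table reference...
SELECT soh.OrderDate,
       sod.LineTotal
FROM Sales.SalesOrderHeader AS soh WITH (NOLOCK)
JOIN Sales.SalesOrderDetail AS sod WITH (NOLOCK)
     ON sod.SalesOrderID = soh.SalesOrderID;

-- ...is the same dirty read as setting it once for the connection.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT soh.OrderDate,
       sod.LineTotal
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
     ON sod.SalesOrderID = soh.SalesOrderID;

Either way, you’re signing up for missing and duplicated rows under concurrent modification, not just ‘dog’ versus ‘cat’.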

3 – Transactions take the server offline. Never use them in queries.

Again, I’m having a hard time imagining what exactly occurred to make this seem like a good idea. You do know that every statement runs within a transaction, right? If you don’t start one explicitly, SQL Server wraps the statement in one for you. And your server is still online. Can you overuse transactions? Sure: your transactions can be overly large, your transactions can run too long, or you can have too many transactions. But no, just using a transaction will not bring the server down.
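If you’re nervous, try it yourself. Here’s a minimal sketch of an explicit transaction against AdventureWorks (the no-op update is just for demonstration); run it and note that the server remains stubbornly online:

BEGIN TRANSACTION;

-- A harmless, no-op update, just to hold a real transaction open for a moment.
UPDATE Sales.SalesOrderDetail
SET OrderQty = OrderQty
WHERE SalesOrderDetailID = 1;

-- Keep the work between BEGIN and COMMIT small and short-lived.
COMMIT TRANSACTION;

The problems come from scope and duration, not from the mere existence of the transaction.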

4 – Do not JOIN tables because it leads to performance problems.

I’m actually used to a slight variation on this: “Never join more than three tables.” This is just a holdover from the old days, when several things were going on. One, the old query optimizers just sucked, so choices between loop, merge, or hash joins were really bad. We also didn’t really know how to index our tables properly back in the day (well, I didn’t), so joining between tables hurt performance a lot. Our code was horrendous back then too, which didn’t help. But since at least SQL Server 2000, the optimizer is good and performance is fine. I saw an 86-table join (not one I wrote) run in under 200ms on SQL Server 2000 on ancient processors and small memory. Talk about completely busting that myth. This rule is crazy. Seriously crazy. You’re using a RELATIONAL storage engine and then saying that you can’t actually use relationships. If you don’t have or need relational data, certainly a very valid option, use a non-relational data storage engine. But if you have relational data and you’re using a relational engine, I strongly, very strongly, recommend you use all the relational tools. That includes JOINs, but also primary keys, foreign keys, enforced referential integrity, and all the rest.
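To sketch what I mean by using all the tools, here are two hypothetical tables (the names are made up for illustration) with the relationship declared, enforced, and then used:

CREATE TABLE dbo.OrderHeader
(OrderHeaderID INT NOT NULL PRIMARY KEY,
 OrderDate DATETIME NOT NULL);

CREATE TABLE dbo.OrderLine
(OrderLineID INT NOT NULL PRIMARY KEY,
 OrderHeaderID INT NOT NULL,
 LineTotal NUMERIC(38, 6) NOT NULL,
 CONSTRAINT fk_OrderLine_OrderHeader
     FOREIGN KEY (OrderHeaderID)
     REFERENCES dbo.OrderHeader (OrderHeaderID));

-- The declared, enforced relationship documents the design, protects the
-- data, and gives the optimizer information it can use.
SELECT oh.OrderDate,
       ol.LineTotal
FROM dbo.OrderHeader AS oh
JOIN dbo.OrderLine AS ol
     ON ol.OrderHeaderID = oh.OrderHeaderID;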

5 – Don’t use SSMS’s ‘Edit Top 200 Rows’ because it leaves locks on the table.

Here’s one I actually agree with, but not because of what they’re saying. I suspect someone must have found a situation where the UPDATE process from this window held a lock. The real reason I don’t like the Edit Top 200 Rows approach is, well, here’s a query from this menu choice in SSMS:

SELECT TOP (200) BusinessEntityID, CreditCardID, ModifiedDate FROM Sales.PersonCreditCard

Spot any issues? Apart from the missing semicolon and the unqualified column names? Yeah, a TOP without an ORDER BY. Which 200 rows are we getting? Yes. If you want to edit data, use an UPDATE command or a proper interface for the data, not this.
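If you need a specific 200 rows, say which 200, and if you need to change data, say exactly what you’re changing. A sketch (the literal values are just examples):

-- A deterministic version of the generated query.
SELECT TOP (200) pcc.BusinessEntityID,
       pcc.CreditCardID,
       pcc.ModifiedDate
FROM Sales.PersonCreditCard AS pcc
ORDER BY pcc.BusinessEntityID;

-- An explicit UPDATE instead of editing cells in a grid.
UPDATE Sales.PersonCreditCard
SET ModifiedDate = GETDATE()
WHERE BusinessEntityID = 293; -- example key value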

6 – Stored procedures should never be used. Prepared statements should never be used.

Plan cache should never be used. Oh, I thought that was the next statement. Look, we don’t have to get into the “stored procedures are a blessing from Odin” vs. “stored procedures are a frost giant trick” argument. We can say that using parameterized queries (stored procedures or prepared statements) leads to the reuse of plans from cache. Whereas writing nothing but ad hoc queries results in massive numbers of queries piled into the plan cache, each running through the full optimization process, chewing up your memory and your CPU time, and then never being used again. Can you think of a more efficient mechanism for hurting your performance than taking away these tools? Me neither. Parameterize your queries, somehow. Every ORM tool I’ve worked with or read about can do this (and they can all use stored procedures too, just saying). There is no good excuse for not using parameterized queries for most of your code. And yes, there are places where ad hoc or dynamic T-SQL makes sense. But I’d argue that they’re the exception, not the rule.
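Even without stored procedures, parameterization is right there. A minimal sketch using sp_executesql against AdventureWorks (the literal 776 is just an example value):

-- Parameterized: one plan in cache, reused for every @ProductID value.
EXEC sys.sp_executesql
    N'SELECT p.Name,
             p.ListPrice
      FROM Production.Product AS p
      WHERE p.ProductID = @ProductID;',
    N'@ProductID INT',
    @ProductID = 776;

-- The ad hoc equivalent compiles a new plan per literal value
-- (barring simple parameterization):
-- SELECT p.Name, p.ListPrice FROM Production.Product AS p WHERE p.ProductID = 776;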

Please, take the time to understand what went wrong when you hit a problem. Don’t just shy away from the thing associated with it. It’s like the Dark Ages when they executed inanimate objects after an accident. Surely we’re a little more sophisticated than that. Identify the problem, understand it, then apply appropriate solutions.

If you want to talk more about performance tuning, and you’re attending SQL Rally Nordic 2015 in Copenhagen, I’m doing a full day pre-conference seminar. Click here now to register.

Jan 16 2015

Two Weeks on the PASS Board

The one absolute promise I made about serving on the PASS Board is that I would let you know what I was doing there. Well, we’re two weeks in and I figured now was a good time for a report.

Next week is my first board meeting (and I will report back on how that goes). I’ll be presenting the goals I’ve worked up for the Chapters and seeing how many of them we can get approved and then how many of them we can deliver.

In the time I’ve had on the board so far, a big part of what I’ve been doing is learning about what has gone before. What kind of goals did we have for last year? Which of them were delivered? Which weren’t? Why? What’s our budget? That sort of thing. It’s been very interesting and enlightening. I’ll say this right here: send Wendy Pastrick (b|t) a note thanking her for her hard work for the PASS Chapters. She’s done some good stuff.

Another thing I’ve been doing is getting hooked into all the communications with the Chapter Leaders and the Regional Mentors. So far, most of that has just been along the lines of welcoming new Chapter Leaders: “Welcome to your new job, sucker, uh, I mean, valued volunteer…” Kidding. I ran a chapter for four years. It’s tough, but rewarding. We’re setting up calls with the Chapter Leaders and with the Regional Mentors so that I can talk to them directly and get a good sense of what the heck PASS is doing for them or needs to do for them.

I’ve also been getting connected with my cohorts at the management company. I won’t betray who they are because they haven’t said I could (although, pretty sure a certain “Wonder Woman” is going to be upset that I didn’t drop her name; you figure it out). These are the people who are doing most of the work managing the relationships between PASS and the Chapters and the Regional Mentors. I wouldn’t say I’m just a figurehead, but I don’t get paid to do what they get paid to do. We’ve had several calls in order to keep things going on the goals already set and to get going on the new goals. I even got to approve a thing. WHOOP!

I’ll post again after the board meeting, especially once I see which of the goals I’m pushing actually make it into the plan.

If you’re not seeing something you expected in these updates, or you have any other comments, suggestions, etc., post here or send me an email.

And no, I still haven’t changed everything.

Jan 14 2015

Common Table Expressions Are Not Tables

There’s power in naming things. Supposedly some types of magic are even based on knowing the correct names for things. The name for the T-SQL clause Common Table Expression (CTE) is actually pretty accurate. It’s an expression that looks like a table and can be used in common across the entire query (at least I think that’s what the common part refers to). But note, I didn’t say it was a table. It’s not. It’s an expression. If you look at the T-SQL definition at the link, it refers to a “temporary” result set. Now, to a lot of people, that means table. But it isn’t one. Let’s look at this in more detail.

Here’s a query that defines a simple CTE and then uses it to query the data in the next T-SQL statement:

WITH    x AS (SELECT    soh.OrderDate,
                        soh.SalesOrderID,
                        sod.LineTotal,
                        sod.ProductID
              FROM      Sales.SalesOrderHeader AS soh
              JOIN      Sales.SalesOrderDetail AS sod
                        ON sod.SalesOrderID = soh.SalesOrderID
              WHERE     soh.SalesOrderID BETWEEN 43683 AND 43883
             )
    SELECT  x.OrderDate,
            x.LineTotal,
            p.Name
    FROM    x
    JOIN    Production.Product AS p
            ON p.ProductID = x.ProductID;

If we run this, we get the following execution plan:

[Figure: CommonExecPlan, the execution plan for the CTE query]

Let’s not worry about the plan for the moment. Instead, I want to look at a couple more queries:

--Derived Table
SELECT  x.OrderDate,
        x.LineTotal,
        p.Name
FROM    (SELECT soh.OrderDate,
                soh.SalesOrderID,
                sod.LineTotal,
                sod.ProductID
         FROM   Sales.SalesOrderHeader AS soh
         JOIN   Sales.SalesOrderDetail AS sod
                ON sod.SalesOrderID = soh.SalesOrderID
         WHERE  soh.SalesOrderID BETWEEN 43683 AND 43883
        ) AS x
JOIN    Production.Product AS p
        ON p.ProductID = x.ProductID;

--Just a JOIN
SELECT  soh.OrderDate,
        sod.LineTotal,
        p.Name
FROM    Sales.SalesOrderHeader AS soh
JOIN    Sales.SalesOrderDetail AS sod
        ON sod.SalesOrderID = soh.SalesOrderID
JOIN    Production.Product AS p
        ON p.ProductID = sod.ProductID
WHERE   soh.SalesOrderID BETWEEN 43683 AND 43883;

Three nominally different queries. Yet, if you run these queries, all of them return exactly the same data and all of them have exactly the same execution plan. That’s right: the CTE didn’t change the way SQL Server resolved these queries in any way. Nor would it. That’s because the CTE is absolutely not a table. It’s an expression.

That name appears to hold magic. We see the word table and we think, “Tables, I know all about them,” and we proceed to treat our CTEs as if they were tables. But they’re not. And Microsoft’s definition is completely accurate, if maybe just a touch misleading. These are not tables we’re dealing with when we work with Common Table Expressions.

Let’s look at the difference:

--Table Variable
DECLARE @TVP AS TABLE
(OrderDate DATETIME,
SalesOrderID INT,
LineTotal NUMERIC(38,6),
ProductID INT,
INDEX tt_ci CLUSTERED (SalesOrderID),
INDEX tt_i NONCLUSTERED(ProductID));

INSERT @TVP
        (OrderDate,
         SalesOrderID,
         LineTotal,
         ProductID
        )
SELECT  soh.OrderDate,
        soh.SalesOrderID,
        sod.LineTotal,
        sod.ProductID
FROM    Sales.SalesOrderHeader AS soh
JOIN    Sales.SalesOrderDetail AS sod
        ON sod.SalesOrderID = soh.SalesOrderID
WHERE   soh.SalesOrderID BETWEEN 43683 AND 43883;


SELECT  x.OrderDate,
        x.LineTotal,
        p.Name
FROM    @TVP AS x
JOIN    Production.Product AS p
        ON p.ProductID = x.ProductID;



--Temporary Table
CREATE TABLE #TT
(OrderDate DATETIME,
SalesOrderID INT,
LineTotal NUMERIC(38,6),
ProductID INT,
INDEX tt_ci CLUSTERED (SalesOrderID),
INDEX tt_i NONCLUSTERED (ProductID));

INSERT #TT
        (OrderDate,
         SalesOrderID,
         LineTotal,
         ProductID
        )
SELECT  soh.OrderDate,
        soh.SalesOrderID,
        sod.LineTotal,
        sod.ProductID
FROM    Sales.SalesOrderHeader AS soh
JOIN    Sales.SalesOrderDetail AS sod
        ON sod.SalesOrderID = soh.SalesOrderID
WHERE   soh.SalesOrderID BETWEEN 43683 AND 43883;


SELECT  x.OrderDate,
        x.LineTotal,
        p.Name
FROM    #TT AS x
JOIN    Production.Product AS p
        ON p.ProductID = x.ProductID;

These two queries are using table objects, a table variable and a temporary table. The execution plans are decidedly different from the one above and from each other:

[Figure: TableExecPlans, the execution plans for the table variable and temporary table queries]

Interestingly enough, it chose to do a Merge Join instead of the Nested Loops and ordered the table scan to accomplish it. Yet that query runs faster and has fewer reads than all the others… as long as you don’t count the time to load the temporary table. Count that, and it just doesn’t work as well as the others at all. Speaking of execution times, the first three queries all had exactly 27 reads and, within some variation, ran in about 30ms on my system. Anyway, query tuning is not the point of the discussion.

You can see the differences in the execution plans that you get between an expression (the CTE) and actual temporary storage, either table variables or temporary tables. Yet you can also see that the CTE, an expression, just an expression, was resolved in exactly the same way as the derived table and the simple JOIN. In short, a common table expression is not a table. Absolutely not. It’s an expression.

On a side-note, if you’re thinking of attending SQL Rally Nordic 2015 in Copenhagen and you want to spend some time talking about query tuning, I’m doing a full day pre-conference seminar. Click here now to register.

Jan 09 2015

I Am Grateful For

A while back I wrote about saying “Thank you” to people. Last night I was listening to NPR on my ride home from a great New England SQL Server User Group meeting (yes, I listen to NPR). Two different reports came on during my ride that got me thinking again about my previous blog post. The first talked about how negative emotions directly impact your ability to make rational decisions. They’ve found that people actually spend more money on items when they’re depressed than they do when they’re happy. There’s a bunch of research around it, but I missed where it was published. It was a great discussion. They found that getting yourself into a positive mood directs your focus outwards rather than inwards. One of the best ways to get that positive emotion train running is to think about what you’re grateful for. More specifically, who you’re grateful for in your life.

Then, a couple of segments later, the author of the book Connected: The Surprising Power of Our Social Networks and How They Shape Our Lives (on my reading list, I’ll get back to you) was on, talking about how positive and negative emotions transmit themselves through our social networks, and not just to the people next to you, but to the people they’re next to and the people they’re next to. Three layers of separation.

I’m not the brightest fellow, but taking my “Thank you” post in combination with the two segments on NPR, I’ve decided to get off my behind and start a blog series, “I Am Grateful For…”, where I’ll single out someone that I’m personally grateful for and tell you why. I’m still trying to decide on frequency, but probably one a month for a little while. The purpose is purely selfish. I want to think better, so I’m going for that positive emotion. I want to be in a positive environment in order to maintain the improvements, so I’m going to spread that positive emotion.

Let’s get started.

I am grateful for Aaron Bertrand (b|t).

I consider Aaron a friend. We’ve known each other quite a few years, although I don’t remember where we met exactly (it had to have been at a SQL event somewhere). Aaron lives nearby in Rhode Island (and I forgive him for that), so we actually get to see each other occasionally in the real world. I’m ashamed to say that it’s Aaron who usually reaches out for these get-togethers. He’s great that way. He invited me down to his house to help raise money and awareness for ALS. Even though he works for a company in competition with mine, we get along great, and he regularly invites me to events his company is putting on. He’s also terribly smart and shares that through his blog all the time, stuff that I learn from and incorporate into what I do (no, I don’t steal it). I appreciate his voice when we get into discussions (you should hear him at an MVP meeting) because he always seems to come at things from such an informed place. Funny enough, he’s one of the meanest Canadians you’ll ever meet, but I think that makes him great too. He takes time away from a very young family to get out to events and share all the stuff he knows on a pretty frequent basis.

Thanks Aaron.

Jan 06 2015

Execution Plans, Performance Tuning and Rum

[Image: Captain Jack Sparrow]

This is me on the last cruise

In just a few more weeks I’ll be setting sail on the first of two SQL Cruise events this year. I’m honored beyond my ability to appropriately express it to be included as one of the Technical Leads for both the cruise in February (7-14, 2015, I think there’s a seat or two left) to the Caribbean and the one in June (14-21, 2015, definitely a couple of seats left) to the Mediterranean. Lest you think that this is just an excuse to hang out and drink, you ought to know a little about how sessions are presented on the cruise and the sessions I’m presenting.

Don’t mistake sessions on the boat for your typical one-hour session at a SQL Saturday event. These sessions are two hours long. That means two things. First, I can spend quite a bit more time on my topic, so I get to put out some real in-depth information rather than the high-level, glossed-over presentations you can do in one hour. Second, you get to ask questions, make comments, and directly interact in an environment where, even if I say we’ll have to talk about something after the session, I’m right there, so we will be talking about it after the session. Plus, you’ll still have the other Technical Leads there to help answer questions.

And that’s because the most important aspect of SQL Cruise is your access to both the Technical Leads and the other cruisers (including MVPs, MCMs, and just awesome and amazing people). This is the chance to sit down with people like Kevin Kline and pick his brain, without pressure, without the need to run to another session, and without any ability for Kevin to escape (ha, not that he’d try; he’s one of the most helpful and kind people I know). The same goes for all the other cruisers and Tech Leads.

I’m doing two sessions on the cruise. The first is called “Execution Plans, What Can You Do With Them.” Yeah, yeah, yeah, query tuning. Whatever. You’re right, it is one of the important aspects of what execution plans do. But in this session, instead of exploring an execution plan to try to improve performance, we’re going to explore execution plans to see what they tell us about how SQL Server works. What happens when you insert data? What happens when there’s a calculated column? How do the queries and the internals of the structures get resolved by SQL Server? Well, I can’t tell you everything about that, because the execution plan is still an abstraction layer. But there’s a ton of really interesting information available within these things. We’re going to dig around in them, see what we can see, and just talk about what an execution plan shows us and how we can explore it. I’ve been working obsessively on a bunch of what I think are interesting plans with fun little details to explore. And I’ll share all the scripts so you can explore them on your own.

Next, I’m doing a session on a performance checklist. I’m going to just flat out give away the stuff I go through and look at when setting up a server, designing a database, or consulting on performance problems. But more importantly, I want to discuss with you why I’m doing everything I’m doing. I’m counting on the fact that with all the MVPs and MCMs in the room, I’m going to be able to say a few things that they’re going to jump into the middle of. Oh yes, we’re going to have some excellent conversations during this session. I promise. I make a few suggestions that almost no one agrees with. We’re going to have fun. You won’t want to miss it.

Oh, yeah, and we get to do all this awesome learning, sharing, talking and laughing while basking in warm sunshine and sipping on the beverage of your choice (make mine a Mojito, no, wait, a Mulata Daisy, hang on, how about a Caipirinha? Yeah, I know, that’s not rum, it’s still good, and who knows, the rum might be gone).

SQL Cruise quite literally changes the life trajectory of people involved with it. The direct, extended, involvement between the people on the cruise energizes people to make positive changes in their lives. And we learn stuff. And we have fun. Don’t miss out.

Jan 02 2015

Speaker of the Month: January 2015

I love it that my first post of the new year gets to be Speaker of the Month. I’m really enjoying doing these because I’m getting to attend a lot more sessions at the events I go to in order to have choices. But please, don’t bug me. If I can attend your session, I will. If I can’t…

Anyway.

Speaker of the Month for January 2015 is William Wolf (b|t) and his session “Common Coding Mistakes and How to Mitigate Them” that was delivered at SQL Saturday DC.

This was a good session. It was informative. I really liked how Bill (I’m going to use that because it’s easier to type, and despite looking like the Demon Biker of the Apocalypse, he’s a bigger sweetheart than I am) kept referencing everything back to his own coding experiences. It’s a fantastic way to make a point and make it stick. Further, it lets the audience know where you’re coming from and why any particular point is important to you. Maybe they haven’t seen this problem much, or at all, and don’t think it’s important. But you’ve seen it burn down the house, so bringing your experience out is just a great way to communicate. I also liked how Bill emphasized each of the points he was trying to make. It wasn’t just a series of slides and examples; he was making a specific point with each one, and the code reflected that. He really followed the “this is what I’m going to tell you, now I’m telling you, this is what I told you” model. I also liked how he worked Metallica into the slides. It was a useful and informative session, well delivered.

A few points that I think may help make the session even better. First, stop using Profiler and trace events. Extended Events are here to stay. Learn them, love them, demo with them. Practice a little more with ZoomIt. You used it very well, but every once in a while it sort of surprised you. That’s just practice. Be careful about reading the slides to us, especially when you turn to look at them and read them. Doing that to emphasize a point, cool. Reading for us, not so cool.

Thanks. Great session. It showed a lot of hard work, especially the demos. I was really impressed and enjoyed the session.

Dec 19 2014

“Pretty Plans vs. Performance” or “Grant Gets Pwned”

If you get an execution plan that looks like this:

[Figure: prettyplan_uglyplan, a big, ugly execution plan]

I wouldn’t blame you for immediately thinking about query tuning. Especially if the code that generated it looks like this:

SELECT  soh.OrderDate,
        sod.OrderQty,
        sod.LineTotal
FROM    Sales.SalesOrderHeader AS soh
INNER JOIN Sales.SalesOrderDetail AS sod
        ON sod.SalesOrderID = soh.SalesOrderID
WHERE   soh.SalesOrderID IN (@p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9, @p10,
                             @p11, @p12, @p13, @p14, @p15, @p16, @p17, @p18,
                             @p19, @p20, @p21, @p22, @p23, @p24, @p25, @p26,
                             @p27, @p28, @p29, @p30, @p31, @p32, @p33, @p34,
                             @p35, @p36, @p37, @p38, @p39, @p40, @p41, @p42,
                             @p43, @p44, @p45, @p46, @p47, @p48, @p49, @p50,
                             @p51, @p52, @p53, @p54, @p55, @p56, @p57, @p58,
                             @p59, @p60, @p61, @p62, @p63, @p64, @p65, @p66,
                             @p67, @p68, @p69, @p70, @p71, @p72, @p73, @p74,
                             @p75, @p76, @p77, @p78, @p79, @p80, @p81, @p82,
                             @p83, @p84, @p85, @p86, @p87, @p88, @p89, @p90,
                             @p91, @p92, @p93, @p94, @p95, @p96, @p97, @p98,
                             @p99);

Let’s replace this with a table variable, maybe even one passed in as a parameter. The plan then looks like this:

[Figure: prettyplan_pretty1, the much simpler plan from the table variable]

Ah, much prettier. I’m happy now, all’s right with the world… But, just in case, let’s look at performance. The first query ran in about 2.2ms and had 599 reads. The second query ran in about 24ms and had 598 reads… crud.
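In case you want to play along at home, the table variable version was along these lines (a sketch; @OrderIDs is my name for it, and you’d load all 99 values):

DECLARE @OrderIDs AS TABLE (SalesOrderID INT NOT NULL PRIMARY KEY);

-- Load the same values that went into the IN clause.
INSERT @OrderIDs (SalesOrderID)
VALUES (@p1), (@p2), (@p3); -- ...and so on through @p99

SELECT soh.OrderDate,
       sod.OrderQty,
       sod.LineTotal
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
     ON sod.SalesOrderID = soh.SalesOrderID
JOIN @OrderIDs AS oid
     ON oid.SalesOrderID = soh.SalesOrderID;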

Well, let’s modify everything again. Instead of a table variable, we’ll use a temporary table and get some statistics into this puppy, which will clean things right up. Here’s the new plan:

[Figure: prettyplan_pretty2, the temporary table plan]

Looks pretty familiar, although there are slight differences in the cost estimates between this plan and the preceding one. But the run time is 85ms with 714 reads, AND I had to create the temporary table, which added time to the whole thing.

Doggone it, that other plan is heinous and ugly, and so is the query with its IN clause. Shouldn’t the cleaner, simpler execution plan be an indicator that we’re going to get better performance?

Nope.

The thing is, just because the execution plan is simple and easy to understand does not mean it’s going to perform well. You can’t simply look to an execution plan to understand performance. You have to measure the run times of the query, look at the resources it’s going to use in order to understand where waits are likely, look at its reads, and take all of this into account, along with understanding what the execution plan is doing, in order to make the appropriate choices for performance on your system.

UPDATE:

I kept working with this because I was convinced I could get faster performance. The main difference, as I saw it, was that the optimizer sorted the data in the IN clause, and I wasn’t explicitly sorting the data in any of my replacement queries. But nothing I did resulted in better execution times. And that was upsetting.

First, when you’re tuning a query, you’re going to look at the execution plans, as I did above. But when you want to measure the performance of queries, it’s a very good idea to turn off execution plans and just capture the query metrics. I knew this and was doing it, and you could see the results in the Extended Events session where I was capturing each statement for the SPID I was working within. I also had SET STATISTICS IO and SET STATISTICS TIME enabled for the query. Since each execution would cause those to fire as part of the statements, and those were making my Extended Events window messy, I decided to turn them off… WHOA! Query execution times radically changed.

In fact, my first attempt at tuning the query, substituting the table variable, was suddenly faster than the original. The fastest was when I pre-sorted the data in a temporary table (discounting the costs of sorting and inserting the data into the temp table, just for the moment). In fact, the prettiest plan was indeed the fastest.

Experimenting further, it was the STATISTICS IO that completely changed the execution times.
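If you want to reproduce the effect yourself, here’s a sketch of the two measurement approaches (the session name and the session_id value are placeholders; substitute your own):

-- Measuring with the observer watching:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- ...run the query under test here...
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

-- Compare against a lower-overhead Extended Events capture.
CREATE EVENT SESSION QueryTimings ON SERVER
ADD EVENT sqlserver.sql_statement_completed
    (WHERE (sqlserver.session_id = 54)) -- your SPID here
ADD TARGET package0.event_file (SET filename = N'QueryTimings');

ALTER EVENT SESSION QueryTimings ON SERVER STATE = START;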

In short, pay no attention to my original post above, instead, let the lesson be that I need to be very cautious about the Observer Effect.

Dec 08 2014

Speaker of the Month: December 2014

For the month of November I was at the PASS Summit and Live360, so I didn’t get to find new speakers to see (heck, I didn’t get to see any sessions at Summit this year). But that didn’t mean I didn’t get to see great speakers. I did (Rimma! Rimma! Rimma!). Live360 had some amazing presentations. I do like mixed events like Live360 because I enjoy crossing into dev sessions as well as database sessions.

My speaker of the month for December 2014 is Bradley Ball (b|t) and his session “Inside the Query Optimizer”.

Oh, before I go any further:

HEY BALLS!

Sorry, but Brad had us do that at the start of the session. I figured I’d just carry it on. Anyway, wow. What a great session. Brad’s delivery is just awesome. He’s light and funny, but hits the points he means to hit and hits them well. I really loved the partitioned view example to illustrate simplification within the optimizer. As with most of my speaker of the month sessions, I learned a few things. I also heard things that are driving me to do some research to further my understanding. That right there should be one of the criteria for what makes a great session. Everything worked. His jokes were funny without overwhelming the content. As already noted, the demos supported the content extremely well. It was great.

Improvements… Yeah, a couple, maybe. I loved the way Brad presented, but it had a feel as if he was speaking really fast. I don’t think he was, or maybe I just heard really fast, but you might try slowing down slightly. Again, more of a feeling than a real criticism. Brad also had more scripts and material than he had time for. As with my last winner, I can’t say this is a problem, but I’ve heard people complain about it. You might try hiding some of the extra material and only pulling it forward as needed.

If you want to learn about the optimizer, come see me speak… kidding! Seriously, you won’t go wrong attending this session. You’ll walk away with a smile on your face (Brad’s funny) and more knowledge in your head. It’s a winner.

Side note: The last two winners are both established speakers who do this all the time and are therefore awesome. Isn’t this supposed to be about helping build up new speakers?

Yes.

I’ll try to do better. But this will happen sometimes.

Dec 05 2014

Speaker of the Month: November 2014

I have been very remiss in my blogging of late. Apologies.

One thing I haven’t blogged about is Speaker of the Month. I’m behind. So, I’ll be posting two of them over the next few days in order to catch up.

First up, Speaker of the Month for November.

I went to quite a few events in October, but I actually didn’t get to see too many sessions. However, the few I saw were actually quite good. Speaker quality is just going up and up. I finally got to see someone present that I’ve known for years and years but had just never had the opportunity to sit in on a class with. Well, I sat in on one, and this is the result. My speaker of the month is Kathi Kellenberger (b) and her presentation, Writing Better T-SQL Queries with Window Functions.

Kathi is just great. If you’ve ever met her in person, she’s this quiet, unassuming woman. She’s like your favorite aunt, which may be why her nickname is Aunt Kathi. But when she’s presenting, she takes total charge of the room. You know who the presenter is. She spoke without a microphone, and I didn’t realize she could project like that while sitting. The content of the session was great. She mainly worked in the code and through examples, but she had just enough slides that there was some content documentation to guide people through the lessons. Awesome work. Her examples were extremely illustrative of the points she was making about window functions. I picked up a few things and was reminded of several things that weren’t at the front of my brain anymore. I loved it.

Everyone can improve a little. While Kathi presented a tour de force, I found a couple of things I didn’t like. She had a series of graphics about a guy on a set of stairs that just didn’t work for me. I don’t have a good suggestion on how to improve it, but it just didn’t hammer the point home for me. Kathi also had more material than she could cover in the time. Now, that’s a mixed bag. I’d say it’s better to have too much material rather than too little. But I’ve heard people complain that they didn’t get to see X, or Y, or Z because of time constraints when, in fact, the presenter intended that material as padding in case the session ran short. I guess, in order to avoid the complaints, maybe put the padding after the final slide so you can go to it, but only if needed.

However, as you can see, if those are my complaints (one slide graphic didn’t work, and she had too much great material), how amazing was this presentation? Exactly.

This is extremely useful material. I strongly recommend tracking down Kathi and checking out this session if you get the chance.