Aug 14 2014

A Full Day of Query Tuning

I’m excited to be able to say that I’ve been given the opportunity to put on a full day workshop at SQL Connections on Friday, September 19th, 2014. The title is “Query Performance Tuning in SQL Server 2014”, but I assure you we’re going to cover things that are applicable if you’re still working on SQL Server 2005. We’ll start the day covering the different mechanisms you have to capture query metrics. We’ll go over the dynamic management objects and extended events that are incredibly important in understanding which queries you need to tune. We’ll get an introduction to how the optimizer works and the importance that statistics, indexes and constraints play in helping the optimizer make the choices it makes. I promise, execution plans will be covered throughout the day, in great detail, because they are a fundamental part of how you’re going to understand the choices the optimizer made. Then, we’re going to go through common problems, how you can identify them, troubleshoot them, and solve them. We’ll clear up a lot of myths and knock out the easily fixed issues. Throughout the day we’ll be covering both older versions of SQL Server as well as SQL Server 2014. Then, to finish out the day, we’ll go over some of the new opportunities that are unique to SQL Server 2014, their use and their shortcomings.
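
To give a flavor of where we’ll start, here’s a minimal sketch (my own illustration, not the workshop script) of the kind of dynamic management object query that surfaces the most expensive statements sitting in cache:

-- A sketch only: the ten most expensive cached statements by total elapsed time.
SELECT TOP (10)
        SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
            ((CASE qs.statement_end_offset
                  WHEN -1 THEN DATALENGTH(st.text)
                  ELSE qs.statement_end_offset
              END - qs.statement_start_offset) / 2) + 1) AS StatementText,
        qs.execution_count,
        qs.total_elapsed_time,
        qs.total_logical_reads
FROM    sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;

An extended events session gives you the same kind of information over time, rather than just what happens to be in cache at the moment; we’ll look at both approaches.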

In short, I’m trying to take you from not knowing which queries you need to tune, to identifying those problematic queries, understanding what the problems are and how to solve them. You’re going to know where to go to get started reading execution plans. You’re going to walk away with a love and fascination for extended events. You’re going to get tools and methods that you can apply to your own code, your own applications, your own servers. And, this all takes place after an amazing week of learning at the IT/Dev Connections event in Vegas. Please click here now to register.

Aug 06 2014

Say Thank You

I was approached by a man at SQL Bits who only wanted to thank me for blogging. He said, “I know you probably never hear from anyone, but you should. Thanks for what you do.” I’m not mentioning his name because he promised me he’d write an editorial for publication, so I’m going to wait on him to get that done so he gets full credit then, not on my silly blog post.

But, the idea stuck with me.

I realized that I don’t do it. Oh yeah, I have an “Interesting Reading” set of links that I post to Twitter, Facebook and LinkedIn. Sharing is “thanking” in our modern vernacular, right? No, of course not. It’s not the same. I really do appreciate the help that I get on the internet. Some of you are youngsters and don’t remember what it was like before the World Wide Web, even before America Online, before CompuServe, before the Bulletin Board Systems. We used to have to learn everything from books. EVERYTHING. We either knew someone who knew answers, or we suffered in ignorance. We walked to school and back home through 8 feet of snow, uphill, both ways… wait, where was I? Seriously, we have it good today. Better still, the SQL Server community is awesome for its sharing (oversharing even). There is almost limitless, free, good information out there. The people who are generating it deserve your thanks.

So, if you read a blog post, and it was useful, take the time to post a quick thank you. Just do it.

For myself, I’m trying to think of a new blog series for thanking people.

In the meantime, real fast, in no order, no links, but very serious and heartfelt thanks to:

Kalen, Paul, Itzik, Brent, Tom, Tim, Adam, Wes, Benjamin, Kim, Don, Denny, Allan, Allen, Erin, Steve, Andy, Karen, Gail, Johnathan, and a bunch of people I’m not thinking of at the moment.

I’ve learned so much from you guys, for free. You’ve made a positive impact in my career. Thank you. I promise to try to do more to show that appreciation in the future.

Aug 01 2014

Speaker of the Month: August 2014

Speaker of the Month is now officially one year old.

I went back and reread my first post. The goal was a relentlessly positive experience. Reading back through the twelve posts, I feel like I hit that mark pretty well. Yes, I’ve always pointed out places where improvements can be made, but I think I’ve done it in a constructive and positive manner. Plus, I’m picking your session (if you get picked) as the best session that I saw that month, which is pretty darned positive in and of itself. I also promised it would be random and arbitrary. Mission accomplished.

I’ve considered wrapping this up. I did it for a year, just to see what the response would be. I didn’t have much more of a goal in mind than trying to help out both attendees and speakers by pointing out what I thought were some awesome sessions. Maybe pointing out some places where people could improve would be helpful. Since I’ve been blessed with getting to present a lot, the implication is, I know what I’m doing and maybe I could share some of that knowledge with others. Done. Does it really need to continue? The feedback you get on most blogs and most blog posts is largely a null set. You toss the information out there, it sinks into the pond, the ripples quickly fade. You don’t have a clue if the blog and the posts are helping or not. I’ve received some feedback on this, but, it’s actually been mixed. I’ve had people tell me because I’m pointing out areas of improvement that I’m not being positive, that I’m in fact being mean. Which… I really don’t know where to go on that. If I just say, “It’s awesome” and walk away, is this useful? Is it helpful? Is it constructive? Will it make a positive difference to the person I’m praising? Heck, will it do anything for anyone except the person I’m praising? I’d say no.

I’m going to go with the thought that this matters, even if it is only to the people who get picked. But I think that hearing from someone, anyone, what works in a session and what doesn’t could be useful to those who are just getting started. So, assuming it makes a small, positive, difference, I’ll keep the Speaker of the Month award going for one more year.

Speaker of the Month for August 2014 is someone I just recently had the chance to meet. We had exchanged a number of emails on a topic that I knew nothing about, so I was utterly unhelpful to them. But, by explaining the problem to me, they figured out their own solution, and then turned that into a presentation. Wow! That alone is amazing. Then I saw the presentation and was blown away. My Speaker of the Month is Derek Stanley and his presentation Remove the Linked Server 2 Hop Limitation by Implementing Kerberos from SQL Saturday #302, Albany.

This is absolutely not a topic I’m familiar with or good at. I know I’ve run into the situation a couple of times. Explanations for exactly what was happening kind of pinged off my head. After sitting through Derek’s session, I have a better understanding of the problem and, I have a bunch of great solutions (that I couldn’t implement if you threatened me with torture, I just don’t know systems that well). Derek presented everything in a very clear fashion. He walked us through the problem space so we could understand what it was he was fixing, then he walked us through a bunch of different solutions. His explanations were good, but his demos really made the presentation shine. They were great. Demos can frequently be summed up as “look, I can make this work” but they don’t always teach. Derek’s were so clear and pointed that even I could understand how his solutions were working (still couldn’t replicate them). He put the presentation together really well, told a clear, concise, useful story in a way that made it easy to understand. The presentation was absolutely a win.

Areas I’d like to see improved are fairly slim. I think, in my opinion, the slides were too wordy. I get the “slides are documentation” school of thought. I just don’t agree with it. That one may be a throwaway. The one area that sometimes got a little confusing was caused by naming the servers VM1, VM2, etc. I think some clearer names that somehow help illustrate the points might help the demos along some. One more word: ZoomIt.

In short, it was a good presentation with great demos. If you get the chance to see this one, I strongly recommend it. Derek, submit this to more events. If you’re organizing an event, accept this session. It’s good, you’ll be happy.

Derek, get a blog. Here are some articles he wrote for SQL Server Central. Here’s his LinkedIn profile.

Jul 23 2014

Challenge Accepted

There seemed to be some question whether my comfort level with my own masculinity would prevent me from wearing these:

Fluffy

Oh please! Couldn’t we be a little more challenging? Anyway, here’s the deal: you donate to a good cause, Doctors Without Borders. We hit 10K and I’ll sport those lovely rainbow whatever-they-are at the PASS Summit 2014. Sound good?

I’ll go one better. You double the goal, make it hit 20K, and I’ll present my session while wearing the rainbow whosimawatchits. BOOM!

BRING-IT

Jul 21 2014

Victims of Success

I took part in the PASS Summit 2014 selection committee this year because I was really curious about seeing how the sausage gets made. I’ve seen how actual sausage gets made and I still eat sausage.  Despite a few hiccups and communication issues, internal and external, I think the selection process for the Summit went really well this year. But, there was still some controversy. Being a naturally pushy person, I got involved in the controversy, for good or ill, and subsequently have had conversations with many people about the selection process (which, reiterating, I think went extremely well overall). But, the one thing that kept coming up over and over was a simple question:

How come I/PersonX didn’t get picked?

The easy answer is because you/PersonX had a horrible abstract. But you know what, in probably most cases, that’s not true. Good abstracts by good people didn’t get selected, so what the heck? I think the more complex answer does not go back to the selection committee or the selection criteria or the selection process. Do I think some improvements are possible there? Yes, and I’m putting my foot where my mouth is (or something) and joining the committees to try to make some tweaks to the system to make it better (and really, we need tweaks, I want to repeat, risking ad nauseam, that the process went well and worked great and I’m happy I took part and I think the outcome is pretty darned good). No, the real problem lies elsewhere: SQL Saturdays.

I’m not saying SQL Saturdays are themselves a problem. What I’m saying is that PASS took on the whole SQL Saturday concept for several reasons, one of which was for it to act as a farm team for speakers. This will be my 10th Summit. Looking back to 10 years ago, while I really loved the event, oh good god have the speakers improved. I remember sitting in sessions with people who were mumbling through their presentations so much that, even with a microphone, you couldn’t hear half of what they said. Slide decks that consisted of 8-12 pages of text (yes, worse than Paul Randal’s slides, kidding, don’t hit me Paul). Speakers who really, clearly, didn’t have a clue what they were talking about. It was kind of rocky back then. I learned my second year that you had to talk to people to find out, not just which sessions sounded good, but which speakers were going to present those sessions well enough that it would be worthwhile. Why were there so many weak presenters? Well, because there was almost nothing between speaking at local user groups and speaking at Summit (I made the leap that way). There were a few code camps around, a couple of other major events, various schools and technical courses, and Summit. I don’t know how the old abstract/speaker review process worked (and I apologize to whoever read my first abstract because I know now just how horrific it was and I’m so sorry I wasted your time), but I’m pretty sure they were desperate to get enough submissions that sounded coherent with a speaker attached that probably could get the job done. Not any more.

Now, people are getting lots of opportunities to present at SQL Saturday events all over the world. And it’s working. We’re growing speakers. We’re growing good speakers. Don’t believe me? Then you go to two or three events in a month, sit through 8-12 sessions, mostly by newer people, not Brent Ozar, not Denny Cherry, not Kim Tripp, and you review them, each, individually, then go back and try to pick the best one. Oh yeah, there’s going to be a few dogs in the bunch, but overall, you’re going to find a great bunch of presentations by a great bunch of speakers. Our farm system is working and working well. But there’s a catch.

Because we have upped the bar pretty radically on all the introductory level speakers (and if you’re thinking about presenting, don’t let that slow you down, everyone starts at zero and goes up), the competition at the top (and yes, I do consider the Summit the top in many ways, not all, see SQLBits) is becoming more and more fierce. That means my abstracts probably need quite a bit more polish than they’re getting (and so do yours) because there are a whole slew of speakers coming up who are writing killer abstracts. That means I need to really be concerned about the evaluations (despite the fact that I get dinged because the stage is low, the room is hot/cold, lunch didn’t have good vegetarian choices, England left the Cup early, all outside my control) because there are new speakers who are knocking it out of the park. In short, you/I/PersonX didn’t get picked because the competition has heated up in a major way.

In short, a sub-section of the community, defined by those who wish to speak, are victims of the success of the farm team system as represented by SQL Saturday. On the one hand, that sucks because I now need to work harder than ever on my abstracts, on the other, we’re going to see very few instances of really bad presentations at Summit. We’ve improved the brand and the community. It’s a good thing.

Jul 15 2014

Execution Plan Details

I wouldn’t say it’s common knowledge that you should look at execution plans when tuning queries, but it’s not exactly uncommon knowledge either. But, people tend to get focused on just looking at the graphical part of the plan and there’s just not enough information there. Let’s take a look at a query:

SELECT  pc.Name,
        ps.Name,
        v.Name,
        pr.ReviewerName,
        p.Name,
        v.ModifiedDate,
        p.Color
FROM    Production.Product AS p
        LEFT JOIN Production.ProductReview AS pr
        ON pr.ProductID = p.ProductID
        JOIN Production.ProductSubcategory AS ps
        ON p.ProductSubcategoryID = ps.ProductSubcategoryID
        JOIN Production.ProductCategory AS pc
        ON pc.ProductCategoryID = ps.ProductCategoryID
        JOIN Purchasing.ProductVendor AS pv
        JOIN Purchasing.Vendor AS v
        ON v.BusinessEntityID = pv.BusinessEntityID
        ON pv.ProductID = p.ProductID
WHERE   v.ModifiedDate = '2006-02-17 00:00:00.000'
AND p.Color LIKE 'Y%';

This generates an execution plan that looks like this:

PlanDetails

Neither v.ModifiedDate nor p.Color has an index. Yet, we only see a single scan in this plan, on the Purchasing.Vendor table. Why? Well, we can’t tell from the GUI directly, so you have to look to the tool tips or the properties. The tool tip in this case actually proves helpful, as do the properties:

PlanDetailsProperties

In short, the clustered index is used to seek out a number of rows and then a secondary predicate is placed on those few rows to further filter the results. You would never even guess at that using just the GUI alone. You have to look to the details of the execution plan to understand that. There are lots of other examples where you can only get the information from the properties because not everything is available in the tooltip. This can include things like Optimization Level, Reason For Early Termination, Scan Direction, and a bunch of other things. Just remember to drill down to the properties in order to better understand your execution plans.
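
If you want to pull a couple of those properties out of the plan cache in bulk, rather than clicking through plans one at a time, something along these lines works (a rough sketch; the attribute names come from the showplan XML schema):

-- A sketch: read plan-level properties straight out of the cached plan XML.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT  qs.execution_count,
        qp.query_plan.value('(//StmtSimple/@StatementOptmLevel)[1]',
            'VARCHAR(25)') AS OptimizationLevel,
        qp.query_plan.value('(//StmtSimple/@StatementOptmEarlyAbortReason)[1]',
            'VARCHAR(50)') AS ReasonForEarlyTermination,
        qp.query_plan
FROM    sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp;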

Why do you need information like this? Are we going to tune this query? Maybe we could help it by adding an index to the Vendor table. But, at least for this query with this date range, it appears we don’t need an index on the Product table. But, that may change depending on the number of rows returned from the Vendor table. A different data set can result in an entirely different execution plan with entirely different performance characteristics.
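
For what it’s worth, the index on the Vendor table I’m talking about would look something like this (an illustration, not a tested recommendation; the index name is my own):

-- Supports the ModifiedDate predicate; Name is included so the query
-- doesn't have to go back to the clustered index for it.
CREATE INDEX ix_Vendor_ModifiedDate
ON Purchasing.Vendor (ModifiedDate)
INCLUDE (Name);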

Query tuning is tough. That’s why I’ve put together a one-day seminar to get you started on it. There’s still room in Albany, on July 25th. You can register here. I’ll also be teaching at SQL Connections in September in Las Vegas. Go here to register for this great event. In Belgium in October, I’ll be doing an all day session on just execution plans at SQL Server Days. Go here to register for this event. When I did a similar session at the PASS Summit two years ago, it sold out. I’ll be presenting at Summit this year in Seattle in November. Go here to register. But do it early or you may be put on a waiting list.

Jul 11 2014

Speaker of the Month, July 2014

Another month, another bunch of great presentations. I almost don’t want to do this anymore. It’s hard. I sit through a presentation and I think, “Well, here’s the winner this month.” Then I go to another presentation and I think, “Well, fudge, now one of these people loses.” Then I go to a third and I’m simply blown away. And now I have to pick. Well, it’s hard. So let me do this: I’m going to declare two winners this month, but only review one of them. Hey, my blog, my rules.

First, I want to award speaker of the month for July 2014 to Wayne Sheffield(b|t) and his presentation Table Variables and Temp Tables that I saw at SQL Saturday 294.

What’s my measure? That I learned stuff and was entertained. Well, I pretty much guarantee, unless you too are an MCM, and none of you can ever become one, ever again, you’ll probably learn something from Wayne too. This was a great presentation. I liked the way he started off with a “What do you think” set of slides to kind of get the audience primed for the material that was coming. It was a good approach, much better in a lot of ways than simply giving an agenda. And the information, scads of it, just flowed out of the presentation. I picked up a number of things that I didn’t know about some of the innards of cardinality estimates on table variables (during recompiles). The demos were absolute perfection. They did an excellent job of making the points in an extremely clear manner. And Wayne told a joke that I immediately stole (sorry Wayne, it was that good), the very next week while presenting (I did give you attribution, afterwards). In short, informative, structured, educational, entertaining, a great presentation.

There were a few issues that Wayne and I talked about. He has too much information to give out in one hour. Because of this, Wayne tends not to stop or pause to ensure the audience got his point and just moves on. It’s frankly a mixed bag on this. I think he ought to pause to ensure people are understanding his concepts (because they’re great), but he has a lot to present, so he can’t waste a lot of time (because his stuff is great). Tough conundrum, but I’d suggest going with trimming some material and ensuring full understanding, but that’s one that people can honestly disagree on. While his jokes are funny when he tells them, the overall presentation style was just a little dry and slightly monotone. I think that’s just a question of a little practice more than anything. Maybe chatting up the crowd before you start to loosen yourself up will help.

That’s it. It was a great presentation with tons and tons of wonderful information. I’m quite happy I attended it. You should track it down too.

Second, and not second place, just my other awardee of the month, Louis Davidson(b|t) simply blew my socks off with his presentation on Database Design Fundamentals at SQL Saturday 286. Louis is an MVP, author and frequent presenter, and oh my god, now I know why. It’s like a switch flips and Louis, who is a pretty quiet and unassuming guy, turns into SUPER-PRESENTER! And a tsunami wall of information comes hurtling at you along with quips and experience and stories. It was one of the single greatest presentations I’ve seen in, well, forever. So if I didn’t bring up that fact here on the blog, I’d be doing him an utter disservice. But… my blog, my rules, he’s so good, I didn’t want to just hand him the title and be done. It wouldn’t be fair to all the other amazing presentations I saw to have them upstaged by a consummate professional like Louis.

Please people, help me out, do really horrible presentations, at least some of the time, so I don’t have to actually think about this stuff. I’m lazy and you’re making me work.

Jul 08 2014

Worst Day of a DBA’s Life

Red Gate Software is running a campaign around coping with the worst day of a DBA’s life. We’ve been posting some really fun stories with, I hope, a kernel of useful information inside each. Chances are, if your DBA career has been like mine, your worst days don’t involve explosions and car chases. But now they’re asking people to write up their own stories: what was the worst day in your life as a DBA? I know, I know, first world problems, right? Regardless, I have yet to put a company out of business or kill anyone with any errors I’ve made, although I’ve worked at places where either was possible. But the one day that just stands out, well, it started about three weeks ahead of the bad day.

I was working for an internet startup. I was very much still learning the ropes as a DBA, although I had helped design a system on SQL Server 7.0 that was collecting about 1GB of data a day. Back in those days, that was big data. But, frankly, I wasn’t doing the monitoring correctly. I was doing a lot of manual checks and manual maintenance, stuff I should have taken the time to automate. Live & learn, right? Anyway, because I was taking a lot of time out of each day to do maintenance and run checks on the systems, I wasn’t spending lots of time supporting the development teams. One day, one of the managers came in and said, “No more maintenance. Things should be fine. Spend time on development.” And he was serious. I argued and lost. So I started spending a lot of time doing development and let the maintenance slide. Fast forward three weeks: things had largely been stable, but I didn’t have monitoring in place, so I wasn’t noticing that we were running a little hot on transactions. The transaction log was bigger than it had been. And then disaster struck. The backup drive filled. I didn’t notice. Transaction log backups started failing. I didn’t have alerts. The log drive filled, all the way, and our 24/7, zero downtime web site went kablooey.

It was at 2 in the afternoon or something, so my fellow DBAs and I were right there. We immediately tried to back up the log, but it wouldn’t back up. We tried adding a log file. Didn’t work. Then we started getting creative. We tried every possible thing we could think of. Some of them failed quickly, so we tried something else. Some of them took hours to fail, making us think we had fixed the problem. It took us 48 hours of failed attempts before we finally admitted defeat, went to the last good backup and restored the database, losing about 12 hours’ worth of transactions. It was a nightmare.

The good news was, our directive to stop doing maintenance was cleared. We immediately went about setting up alerts and other things so that we wouldn’t get surprised like that ever again. It wasn’t fun, but it was a great learning experience. Plus, all the troubleshooting for 48 hours straight provided excellent camaraderie within the team. That said, I’d rather have avoided the whole thing, and could have with proper monitoring and alerting.

Jul 03 2014

Reflections on the 2014 PASS Summit Selection Process

Oh we are a bunch of high school kids at heart. Maybe high school never ends (and there’s a nightmare, god I hated high school). But, there’s been drama about the 2014 PASS Summit sessions and the Selection Committee’s work. I was on the committee. I worked as a part of the team responsible for rating sessions for the Azure track (said track is gone, more on that later). As self-serving a statement as this is, I think we did a good job. Further, I think the process worked. You can read the official explanation of the process here. Amy did great work and deserves your thanks. All the volunteers who reviewed over 900 submissions from more than 300 people, ON THEIR OWN TIME, FOR FREE, also deserve your thanks. The vitriol directed at the PASS organization over the outcome of this selection process is not directed only at the Board. It’s also directed at the volunteers. And, as a volunteer, that sucks.

The team I worked on rated, I forget, 50 sessions I think. We had to read through them and give them a score based on several criteria. We also had to write comments on each and every session. I was dinged by HQ for not writing a comment on a session that I gave 5s to on the ratings (so I commented something like “Can’t wait to see this at Summit”). We were only given 10 slots to fill, so that means 40 sessions got kicked to the curb. That’s a lot of people who didn’t get selected. And not getting selected sucks (yes, I do know, I’ve been rejected by a number of events this year, big ones, even ones I’ve spoken at previously, not whining, just pointing out that I don’t have a secret method for getting accepted). Our track actually got eliminated and the sessions that we selected were distributed to other tracks. Also, a couple of sessions we rated highly didn’t do so well when the speaker scores were applied, so there was some shift there (one thing PASS could improve, give us some indication of the secret sauce there, we know there is one, but a little understanding of how it’s applied would help). But overall, the sessions we rated highly got selected. Congratulations and well done to those speakers. Just look at the people presenting, many for the first time. That’s going to be an absolutely awesome event. And once more, thank the volunteers for doing all that work.

So, some of you are now thinking that, “Oh, Grant’s on the side of PASS” (well, actually, yes, I am, so should you be) “Grant has been told to be nice and play good and not be critical” (even though I’ve already made a criticism about the magic numbers and I was tweeting almost literally threatening messages this week) “Grant got selected so he’s being a <insert bad name here> about the whole thing” (I may or may not be a <insert bad name here> but I don’t agree that I’m being one about this) or, maybe you’re on the other side “OMG! He’s criticizing PASS in any regard. The HUMANITY! Have you, at long last, sir, no decency?” (no, not really).

Remember those comments, that I had to write for every abstract, including the great ones? I put a small critical review of the abstract in every one (OK, not the one that I gave 5s to). I said what was wrong with the abstract in my subjective opinion. And let’s be perfectly clear about that (channeling President Obama), they’re my opinions. If I thought you didn’t define the problem space your presentation was meant to address, I said that. You disagree? OK. If I thought your very clever and witty title seriously detracted from the clarity of what the session was about, and it wasn’t that witty, I said that too. You’re the wittiest person you know and everyone says so? OK. My opinion may not jibe with yours. But, it’s the one thing I’ve seen everyone who has ever been rejected by the committee ask for: “Tell me what I can do to improve.” OK. I did. At least in my opinion. On every single abstract (except that one).

PASS didn’t release them.

And then, PASS did.

The volunteers (unpaid, remember) did the work, and now it gets to see the light of day.

This brings up a number of points. First, when I got rejected by those other events, did I get a reason for my rejection? Nope. Other events just reject you, thanks for playing. I think PASS, which is all about community, should be different. We should tell people why, not just that there were higher-rated sessions, but what they can do to improve. I’ve talked to people in the know; not all the comments provide that kind of information. I think we’ll get better next time. Second, people’s feelings are going to be hurt by these comments. Yes. Yes they are. Suck it up, buttercup. You want to know what you can do to improve so you can get selected, but your abstract is absolute perfection (in your opinion), so how dare someone else suggest that it’s not worthy of inclusion, blah, blah, blah. We’re going to see lots of blog posts where people disagree with these comments and that could reflect back in some negative way on the organization. I suppose so, but if we’re going to be about community and we’re going to try to raise up new speakers, we’re going to have to be able to deal with some degree of friction. That may even come from experienced people irked that they didn’t get picked. Everyone has a bad day. Again, I think we can weather this. Finally, the different teams and individuals on the teams probably gave substantially different levels of comments with varying degrees of quality. Some of the comments are just going to be useless. Further, my opinion probably doesn’t jibe with my teammates’ in every regard. Maybe a team didn’t put critical comments in at all (although they had to put in comments). Yes, these things are going to be uneven, maybe even contradictory. OK. Again, cope.

This blog post once started off as a rebuke of the selection process around those comments. It’s not now. I want to repeat, one more time, I think the committee did great work and selected an awesome set of presentations that will make for a wonderful Summit. Thank you for all your hard work. And thank you, Amy, for doing a great job organizing what is a daunting task. And thanks for releasing the comments.

Jul 02 2014

Statistics and Natively Compiled Procedures

Statistics are one of the single most important driving factors for the behavior of the query optimizer. The cardinality estimates stored within the statistics drive costing and costing drives the decision making of the optimizer. So, how does this work with the new SQL Server 2014 natively compiled procedures? Differently.

In-memory tables do not maintain their statistics automatically. Further, you can’t run DBCC SHOW_STATISTICS to get information about those statistics, so you can’t tell whether they’re out of date or what the distribution of the data within them looks like. So, if I create some memory-optimized tables, skip loading any data into them and then run this standard query:

SELECT  a.AddressLine1,
        a.City,
        a.PostalCode,
        sp.Name AS StateProvinceName,
        cr.Name AS CountryName
FROM    dbo.Address AS a
        JOIN dbo.StateProvince AS sp
        ON sp.StateProvinceID = a.StateProvinceID
        JOIN dbo.CountryRegion AS cr
        ON cr.CountryRegionCode = sp.CountryRegionCode
WHERE   a.City = 'London';
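
For reference, the dbo.Address table here is a memory-optimized copy of the AdventureWorks table, created along these general lines (a sketch with an abbreviated column list; the index names and bucket count are placeholders):

CREATE TABLE dbo.Address
(
    AddressID INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 50000),
    AddressLine1 NVARCHAR(60) NOT NULL,
    -- SQL Server 2014 requires a BIN2 collation on character columns
    -- used as index keys in memory-optimized tables.
    City NVARCHAR(30) COLLATE Latin1_General_100_BIN2 NOT NULL,
    StateProvinceID INT NOT NULL,
    PostalCode NVARCHAR(15) NOT NULL,
    -- The "recreate it with an index" step below adds something like this:
    INDEX ix_Address_City NONCLUSTERED (City)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);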

The estimates in the execution plan for the query show some pretty wild values:

Estimates

That’s an estimated number of rows of 65,536 and an actual of zero in a table scan because I created my table without an index. If I recreate it with an index, but still no data, the estimates change and we have an index seek operation now:

EstimatesIndex

That’s suggesting that the optimizer thinks there are 256 rows. But there’s no data here. So, let’s load some data into the table. Here are the new estimates from the index seek operator:

estimateswdata

I haven’t yet updated the statistics, so the optimizer still thinks there are zero rows in the table, or at least, it has no statistics. Well, not true:

SELECT  s.name,
        s.auto_created,
        s.user_created,
        s.filter_definition,
        sc.column_id,
        c.name AS ColumnName
FROM    sys.stats AS s
        JOIN sys.stats_columns AS sc ON sc.stats_id = s.stats_id
AND sc.object_id = s.object_id
        JOIN sys.columns AS c ON c.column_id = sc.column_id
AND c.object_id = s.object_id
WHERE   s.object_id = OBJECT_ID('dbo.Address');

If I run this, I’ll see statistics on the table, including system generated statistics:

stats

One point, while statistics don’t update automatically, they clearly can still be created automatically. But I can’t run DBCC SHOW_STATISTICS to see what’s in there. So, let’s see what estimates look like in the natively compiled procedure. I’ll take the same query code above and convert it to a proc. Then, when I capture the estimated plan from the procedure (no actual plans allowed), the index seek operator shows these estimates:

estimatescompiled
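
For reference, converting that query to a natively compiled procedure looks roughly like this (a sketch; the procedure name is my own and the London filter stays hard-coded to match the query above):

CREATE PROCEDURE dbo.AddressByCity
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,
                   LANGUAGE = N'us_english')
    SELECT  a.AddressLine1,
            a.City,
            a.PostalCode,
            sp.Name AS StateProvinceName,
            cr.Name AS CountryName
    FROM    dbo.Address AS a
            JOIN dbo.StateProvince AS sp
            ON sp.StateProvinceID = a.StateProvinceID
            JOIN dbo.CountryRegion AS cr
            ON cr.CountryRegionCode = sp.CountryRegionCode
    WHERE   a.City = N'London';
END;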

Now, we have an accurate estimate on the number of rows this query is likely to return. Interesting. So, let’s drop the procedure, update the statistics and then rerun the query and the procedure. First the query. The estimates don’t change. I’m still seeing estimated values as 256 while the actual is 434. So, let’s free the procedure cache and try again:
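
The statistics update and the cache flush are nothing fancy, roughly this (a sketch; as I understand the 2014 restrictions, manual statistics updates on memory-optimized tables need FULLSCAN and NORECOMPUTE):

-- Memory-optimized tables don't update statistics automatically in 2014.
UPDATE STATISTICS dbo.Address WITH FULLSCAN, NORECOMPUTE;
UPDATE STATISTICS dbo.StateProvince WITH FULLSCAN, NORECOMPUTE;
UPDATE STATISTICS dbo.CountryRegion WITH FULLSCAN, NORECOMPUTE;

-- Clear the plan cache so the ad hoc query gets recompiled with the new statistics.
DBCC FREEPROCCACHE;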

EstimatesUpdated

Ah, there we go. The plan itself came out the same way, but we clearly have more accurate estimates now. On to the procedure. I’ll recreate it and then get the estimated plan. Here are the estimate values from the same index seek operation:

Estimatescompiledupdated

Oops. Still estimated 0 rows.

What’s all this mean? I’m not sure. The documentation from Microsoft in this area is sketchy. During the most recent 24 Hours of PASS, I was able to ask Microsoft about the impact of statistics on natively compiled plans. They suggested that it was not necessarily going to be the same as we see in standard queries. These tests make that fairly evident. Also, it looks like the default values of estimated rows for in-memory tables are different. If I create standard tables, empty, and run the same query against them, the estimated number of rows is what I expect, 1. But in the case of in-memory tables it’s 256 with an index and 65,536 without one (or at least that’s what I’m seeing). However, the estimates for the natively compiled procedure never changed in this test case, always at 0. This is hardly shocking, but it seems that different rules apply for in-memory tables and their statistics as well as natively compiled procedures and their consumption of those statistics. And, as Microsoft has changed the default estimated number of rows for table variables from 1 to 100 in SQL Server 2014, it seems we have another instance where they’re defaulting to an even higher value and one where the values seem to just disappear.

The behavior of statistics within in-memory tables is extremely interesting because you may see changing behavior with those tables as your queries get more complex and your data changes. It makes a very strong case for making sure that you update your statistics on a regular basis on these tables.

I’m taking this show on the road. If you want to get an all day class on query tuning, I’ve got a lot of opportunities coming up. I believe that Albany, on July 25th, is not yet sold out. You can register here. I’m teaching an all day session at SQL Connections in September in Las Vegas. Go here to register for this great event. In Belgium in October, I’ll be doing an all day session on execution plans at SQL Server Days. Go here to register for this event. I’m excited, and more than a little humbled, to get the opportunity to present an all day pre-conference seminar at the PASS Summit in Seattle in November. Go here to register.