Jul 20 2015

SQL Server Management Studio – Footloose and Fancy Free

That’s right. There’s been a divorce. SQL Server Management Studio (SSMS) has been divorced from the server product. In fact, Microsoft is inviting you to the new SSMS coming-out party.

I’m pretty excited about this. While I’m very comfortable in SSMS, to a large degree, it’s like that old pair of jeans that you’ve worn for the last 10 years. They’re comfortable too. Well, maybe a little tight when you pull them on out of the wash. One of the knees is gone. The legs are frayed so much it almost looks intentional. You just noticed a hole in the bottom. The zipper is acting up… Yeah, OK. These jeans have had it. So has SSMS.

The plan from Microsoft is to upgrade SSMS independently from the boxed product. In fact, since one of the goals is to coordinate functionality within SSMS with releases of Azure SQL Database, I think it’s pretty safe to say that we’ll be seeing regular and frequent updates. While that’s a good thing for SSMS, and a good thing for Azure SQL Database, this level of frequent change might not be a good thing for individual data pros who aren’t used to actively pursuing new knowledge and new tooling all the time. Or, it can act as an incentive to keep your knowledge up to date, because you’re more employable if you’re maintaining your skill set.

Let this act as a prod to go out and learn some new functionality. Oh, and buy a new pair of jeans.

 

Jul 13 2015

How Do You Know You’re Good At Your Job?

You’ve been working as a DBA for X number of years. How do you know you’re good at it? Heck, you’ve been doing any sort of job for a while. How can you measure whether or not you’re competent?

The single best measure isn’t how much work you do, your accomplishments, the number of databases designed, whatever measure you have. That’s not it. The real measure, the one that counts, is how you perform when everything goes south. When that server goes offline, or that database develops corruption, or that SSIS package fails, or, heck, you get a request to fix something that’s broken, even non-technical stuff like an incorrect W-2 form (a battle I’m fighting currently)? Did you run around like your hair was on fire? Did you sit there stunned into immobility? Did you misread the request and send out a new copy of the same, uncorrected, W-2 form (again)?

Or, did you sigh, roll up your sleeves, spit on your hands, and get the problem fixed?

If you did that last thing, you’re good at your job.

Let’s face it, just about anyone can set up a server, install SQL Server, set up backups and then sit there monitoring them for the next five years. The test is not the day-to-day functions. The test is the abnormal, the broken, the wrong. I know people who’ve never had to fix a W-2 form, uh, I mean, a corrupt database. They’ll say, “Yeah, never happened, so I haven’t really looked at what to do about it.”

Oh boy.

Jul 06 2015

But I Don’t Have a Test Server…

I frequently see statements on forums along the lines of “I don’t have a test server, so I’m going to do something I’ve never done before directly on my production servers…” This is usually followed by questions along the lines of:

  • But that’s OK and I won’t get fired, right?
  • So how do I know if it worked?
  • Is that dangerous?

The replies are:

  • No. You should be fired.
  • You won’t
  • By All the Gods! Yes! It’s stupidly, insanely, I don’t really like my production server, my data, my boss, or my job, dangerous. Yes.

I understand. You’re working for a not-for-profit, so you can’t afford tons of new servers. You’re looking at a 50TB production server that you can’t possibly copy. Your bosses just don’t want to spend the money on servers for testing. So these are valid reasons to just skip any semblance of testing, or any assurance that what you’re doing is good, bad, or indifferent for your servers, right? No. No, no, no. 10,000 times, no.

If you’re doing something for the very first time in production, you’re going about your job wrong.

You need to set up a test machine of some type, even if it’s just a tiny VM on your laptop. At a bare minimum, this test machine should have a copy of the structure of your production database. That way you can at least validate that any script that’s going to modify structures will compile and run without error. After that, you should strive to have a representation of production data. This doesn’t have to be an exact copy. In fact, it shouldn’t be an exact copy; you should clean any sensitive data prior to using it for testing. Barring that, you should at least have, and again, this is part of the bare minimum, a copy of your production statistics applied to this test database.

To get a copy of statistics from within SQL Server Management Studio (SSMS), right-click the database in question and select Tasks. Within the Tasks sub-menu, select Generate Scripts. You’ll get a simple wizard. Script the entire database. On the Set Scripting Options screen, choose Advanced. There, scroll down until you see Script Statistics. By default, this is disabled. Select Script Statistics and Histogram. From there, generate your database script. You can now run this script to create a copy of your production database.
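
Once the test database is built from that script, it’s worth spot-checking that the statistics, histogram and all, really came across. A minimal check, against a hypothetical table and statistics name (swap in your own), might look like this:

    -- List the statistics objects on a hypothetical table in the test database.
    SELECT s.name AS statistics_name,
           s.auto_created,
           s.user_created
    FROM sys.stats AS s
    WHERE s.object_id = OBJECT_ID(N'dbo.SalesOrderDetail');

    -- Show the header, density vector, and histogram for one statistics object.
    -- The histogram should match production even though the test table holds no data.
    DBCC SHOW_STATISTICS ('dbo.SalesOrderDetail', 'PK_SalesOrderDetail');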

After you create your brand-new test database, I’d suggest turning off Auto Update Statistics. If a statistics update fires through any of the automatic means, it’s going to see that you have no data and will replace your production statistics with empty ones. Done.
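
It’s a one-line change, sketched here against a hypothetical test database name:

    -- Keep SQL Server from rebuilding statistics off the empty test tables,
    -- which would wipe out the production histograms you just scripted in.
    ALTER DATABASE [ProductionCopyTest] SET AUTO_UPDATE_STATISTICS OFF;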

You now have the bare minimum of a test database and you no longer have any excuses for not validating your scripts prior to running them on production.

Want to talk in person about query tuning? I’m doing two different all-day pre-conference seminars. The first is in Las Vegas at Connections on September 14th. Click here now to register. The second is at SQL Saturday San Diego on September 18th. Let’s get together and talk about query tuning and execution plans.

Jul 03 2015

Speaker of the Month: June 2015

With my travel schedule, I don’t always get out to events where I can see a community speaker. Sometimes, I just see pros. That’s the case this month. Instead of skipping the month entirely, which feels wrong, I’ve decided to do a special award.

Speaker of the Month for June 2015 goes to all the speakers (except me) on SQL Cruise Mediterranean 2015.

I’ve said it before and I don’t mind repeating myself: SQL Cruise changes people’s careers. There are a lot of reasons for this. The number one reason is all the work that Tim & Amy Ford do to put together the cruise, create the itinerary, ensure plenty of structured and unstructured time on the boat for intense interaction, and all the rest of the stuff that they do. The second reason is the people that they get to staff the event as Technical Leads. These really are some of the smartest, most knowledgeable, and, most importantly, friendliest and most approachable people in the industry. Look at the list: Stacia Misner, Andrew Kelly, Argenis Fernandez, Aaron Bertrand. Want to learn about BI? Done. Interested in hard-core system tuning? Done. Want to know how to properly write your T-SQL? Done & done. Most importantly, want to sit down on a couch, hand Aaron/Argenis/Andrew/Stacia a drink, and have a nice long chat about ways to improve processes at your work, how to get started blogging, how to make the next jump in your career path, and get answers to these questions from people who know what they’re talking about? Seriously and thoroughly done. Let’s not dismiss the Technical Leads from other cruises, which include: David Klee (flat out one of the single smartest AND nicest people you’ll meet), Jes Borland (probably the single fiercest community advocate going, plus a bundle of unbridled energy and intense knowledge), Kevin Kline (I’m not worthy), and Jeff Lehmann (let’s talk cross-platform knowledge as well as process and experience).

All these people working within the structure that Tim & Amy have created have more than earned Speaker of the Month. These are the people you want to learn from and they teach the knowledge you need.

Bad news is, next month won’t have any community people either. After that, back on the regular community speaker track.

Jun 30 2015

Leadership Training Possibility

I know a few people in the SQL Server community who have been involved in Scouting. A couple of them have taken or helped lead Scouting’s leadership training program, known as Wood Badge. It’s an amazing program. I won’t even attempt to tell you what Wood Badge has done for me and what it does for Scouting. Suffice to say, Wood Badge helped make me a better person and, in return, I helped make Scouting better.

What’s this have to do with SQL Server? Not much really. But, what does it have to do with you and your career? Tons.

It was pointed out to me that there’s an Oracle users group that actually runs a leadership program. You can read about it here.

It strikes me that maybe this is something we in the SQL Community ought to have. I’m posting this to see what kind of interest there is, if any. So if you regularly just read this blog and never post, sorry, today is the day it’ll really matter if you respond. So please, read through my general thoughts and let me know what you think.

First up, this needs to cover generic leadership principles such as vision, planning, communication, goals and measures, and all the rest. It’s not about management. It’s about leadership. These are different and will be differentiated. This also has to have some mentors available, people who volunteer to monitor & guide attendees through the process. There will probably also be volunteers leading the classes, possibly independent of the list of mentors, we’ll see. Additional training would be on, well, it’s easier to say what it won’t be on. It won’t be on speaking, technical stuff of any kind, or any company or technology. Instead, it would be about leading, maybe community building, running a user group, setting up and running an event, that sort of thing.

It’s not going to be a single class. It’s going to take time. Let’s say somewhere between 6 months and a year of commitment to the leadership courses (done remotely, anything else is too insane for me to consider managing) and then another 6 months committed to delivering on your project(s). And yes, this means that there will be an expected amount of work to be delivered as part of the whole thing. I’m not suggesting holding a class and then printing out certificates. I’m suggesting something similar to Wood Badge, something that you would be proud to put on your resume, something you’d be clamoring to help out with after you complete it.

One more thought: it would be best if we could get an organization to back it. Barring that, we’d need to create an organization to back it. There would probably have to be a charge for this, and not a trivial one either (I haven’t thought it through, but let’s say more than $500, less than $5000, figure it out as needed). It won’t be open to just anyone. I suppose an open call for applications, but since it requires commitment, it might be invite only. Something to talk about. It needs a name.

This is not a course outline or an abstract. It’s really just a stream of thoughts to act as the starting point towards building a structure. My question to you is, what’s the interest? Please post. If I get my usual three responses, I’ll just write this off as an idea that won’t fly and move on.

Jun 09 2015

Speaker of the Month: May 2015

When I attend sessions at events, I try to mix it up. I’ll go to a session where, really, I’ll probably never use the information, just to expand my knowledge a little. I also go to sessions by the masters so that I can try to expand my skill set. I go to sessions on topics that I feel like I know well, just to reinforce my own learning and understanding. That’s where I got the Speaker of the Month for May 2015, Denis McDowell (t).

I went to his session titled DevOps for the DBA at SQL Saturday #380 in New York City. I present pretty regularly on this topic and I’ve helped build multi-day training courses on it. Seriously, I feel like I probably know it. However, as stated, I like to question my own knowledge of a topic and the best way to do that is by seeing what others are saying about it. Denis delivered. I really liked how he started out by defining exactly what he means by DevOps, since the term can be used poorly. He carefully examined the resistance that he sees occurring, fear and uncertainty being one of the leading causes. I also loved one of his phrases that he used, automate what you know. His delivery of the information was good. I really liked how he had a wealth of stories to emphasize the points he was making. It was a very good presentation with great information.

Denis and I talked about improving the presentation. The one thing I suggested was that he interact with the audience more: get questions out there like “How many people are using PowerShell now?” or “Who has automated a deployment process before?” or other things related to the presentation. I’ve found that the more you interact with the audience right from the start, the more they’ll interact with you, asking questions, etc. He could also pause a little longer when he asks a question, to give people a chance to respond.

I got good insight out of this presentation and I enjoyed watching Denis give it. I strongly recommend you track him down next time he’s in your area.

Jun 01 2015

Parallelism and Columnstore Indexes

Columnstore indexes are fascinating and really cool. Unfortunately, they’re adding an interesting new wrinkle to an old problem.

What’s the Cost Threshold for Parallelism set to on your server? If you just said, “The whatsis of whositz?” then the value is 5. The cost threshold is the point at which the estimated cost of an execution plan goes from definitely serial to possibly parallel. This default was set for SQL Server 2000 and hasn’t been changed since. I’ve long argued, loudly, that it’s too low. I’ve suggested changing it to a much higher value. My advice has ranged from 35 to 50 and several places in between. You could also look at the median or the mode of plan costs on your system and use the higher of those values as a starting point. Getting a cost higher than 5 in a query is insanely simple, which means that queries that are absolutely not going to benefit from parallelism go parallel. This causes unnecessary CPU load on your system and slows those queries down. So, listen to my advice and run out there and change this value immediately.
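
For reference, it’s a server-level setting changed through sp_configure. A minimal sketch, using 50 purely as an example starting value:

    -- Cost Threshold for Parallelism is an advanced server-wide option.
    EXEC sys.sp_configure N'show advanced options', 1;
    RECONFIGURE;

    -- 50 is only an example starting point; pick a value based on your own workload.
    EXEC sys.sp_configure N'cost threshold for parallelism', 50;
    RECONFIGURE;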

Or…

Columnstore indexes have two methods of processing. The first is row mode. That’s the same method used by all your standard indexes and tables. You can write a query that does aggregate calculations against a table that has a Columnstore index and you’ll see a performance benefit. However, check the properties on the Columnstore operator in the execution plan. Does it say “Actual/Estimated Execution Mode = Row”? Then you’re getting row-level processing. That can be fine, and you can see a performance improvement using row mode. There is another level of processing, though.

Batch mode processing is where the engine takes 1000 rows at a time and processes the aggregations simultaneously across multiple threads. Wait, multiple threads? Yes. You have to have parallelism in order to get batch mode. Joe Sack talks about it here. You can read more about it in the SQL Server documentation. Batch mode processing is faster (depending, etc.). Especially if you’re dealing with larger data sets (more than tens of thousands of rows), you’re going to want to see Batch as the Execution Mode in the Columnstore operators.
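
As a rough sketch of the kind of setup and query involved (the table and columns here are hypothetical, not from any particular sample database):

    -- Hypothetical fact table with a nonclustered columnstore index covering
    -- the columns the aggregate query touches.
    CREATE NONCLUSTERED COLUMNSTORE INDEX csix_TransactionDetail
    ON dbo.TransactionDetail (TransactionDate, ProductID, Quantity, ActualCost);
    GO

    -- Run this with the actual execution plan enabled, then check the Columnstore
    -- Index Scan operator's properties: Actual Execution Mode shows Row or Batch,
    -- and Batch only shows up when the plan goes parallel.
    SELECT td.ProductID,
           SUM(td.Quantity) AS TotalQuantity,
           AVG(td.ActualCost) AS AvgCost
    FROM dbo.TransactionDetail AS td
    GROUP BY td.ProductID;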

There is an issue here though. That’s right, we just bumped our Cost Threshold for Parallelism up to 5000 (if 50 is good, 500 is better, and we’ll add a zero just to be sure). OK, 50. How high does the cost have to be in order to see benefits from parallelism and batch mode processing? Ah, well, there’s the rub. When I was testing this, I had about 100,000 rows in a table. Nothing large. Execution times went from 100ms to 50ms. Yep, parallel execution and batch mode processing of the nonclustered columnstore index cut my execution time in half.

Next question should be, oh, well, what was the estimated cost of that query? 30? 20? 15?

Try 3.7.

Yeah, I got a 50% improvement in performance by changing the Cost Threshold from 50 to 3.

In other tests I had a plan with an estimated cost of .77. Execution time went from 75ms to 57ms and reads went from 115 to 62 because of batch mode processing.

Now, I am not advocating dropping everyone’s Cost Threshold to 3 (or less). What I am absolutely saying is that you have a new wrinkle that you must add to your calculations. Are you using Columnstore indexes anywhere? Then you need to examine those execution plans. What are their costs? If you adjust your Cost Threshold below them, do you see a benefit? If so, how low did you just go? And is that negatively impacting your other queries? You could leave your Cost Threshold as is and use a query hint with trace flag 8649 (documented by the Sensei Paul White) on your queries against Columnstore indexes, but that comes with a list of caveats as long as your arm. The other alternative is to, again, leave the Cost Threshold alone and use Adam Machanic’s method of forcing parallelism just for the specified queries. These options can be unattractive, but you have to deal with getting parallelism for Columnstore indexes somehow.
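
For what it’s worth, the trace flag approach looks roughly like this (undocumented and unsupported, so treat it as a sketch and test carefully; the table is the same hypothetical one as above):

    -- QUERYTRACEON 8649 tells the optimizer to ignore the cost bias against
    -- parallel plans, so this query can go parallel (and get batch mode) without
    -- lowering the server-wide Cost Threshold for Parallelism.
    -- Undocumented, and QUERYTRACEON needs sysadmin; use with care.
    SELECT td.ProductID,
           SUM(td.Quantity) AS TotalQuantity
    FROM dbo.TransactionDetail AS td
    GROUP BY td.ProductID
    OPTION (QUERYTRACEON 8649);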

Welcome to the quandary. Tuning in and around parallelism just got harder because of Columnstore Indexes.

Want to talk query tuning some more? I’m doing a live, all-day session at Connections this September on query tuning using execution plans. We’ll go over examples with columnstore and lots more. Click here now to sign up.

 

May 27 2015

PASS Board Update: May 2015

I’ve had a busy month on the Board.

You may have seen some tweets going by about our compliance with international laws regarding SPAM. This has been something of an education, so if I say anything wrong below, it’s because I’m wrong, not malevolent (at least in regards to this topic), so please be kind. It appears that the mechanisms we had on our emails for showing where the email came from and how to unsubscribe from it weren’t completely in compliance… IF… we were sending emails that involved advertising. By we, I mean Chapters, not HQ. Now, that IF could mean that we could skip out of meeting this requirement when our emails didn’t involve advertisements, but then we’d have to be in compliance when they did, and we’d have to adjust our footers depending on the type of email… blah, blah, blah. I decided that we can get in compliance, now, and stay that way. Sure, many, maybe most, of our Chapter emails don’t have to meet these regulations, but if we set it up so that we do, then we never have to worry. I’ve worked with HQ. We’re in compliance. We’re getting the word out to the Chapters so if they use email other than ours, they too can be in compliance. We’re also making it part of the documentation so that future people will also be in compliance. We have some other work to do in IT to make an adjustment to the unsubscribe process, but that’s going to happen too. In short, this is almost completely fixed. Many thanks to Karen Lopez for all her help through this process. We couldn’t have done it without you.

I’ll bet that’s a lot more sausage making on display than most of you bargained for. This is what being on the Board looks like.

There was also a Board meeting this month. I was traveling so I only got in the last half of the meeting (on the phone in a shared van sitting in traffic at Logan Airport I might add, oh the glamorous life of a Board member). Good information was exchanged, largely setting us up for the in-person Board meeting next month.

Other than that, standard stuff, meeting with the HQ people regularly so that we keep the Chapter side of things running. A few minor decisions to move things forward. Still pushing on the goals and I hope to figure out how to get IT support for what we need to meet a couple of them.

I’ll report back after the in-person meeting next month. Please, please, please, any feedback on me, the Board, PASS, Chapters, my updates and other Board-related blog posts, the whole magilla, I want to hear it.

May 26 2015

Database Engine Tuning Advisor

I would love to see the Database Engine Tuning Advisor (DTA) pulled from the product. Completely. Heck, I feel bad that I included a chapter on it in my query tuning book (all updated for SQL Server 2014 by the way). Let me tell you why we need to pull this tool.

First, I understand its purpose. It’s supposed to be a fast and easy way to get some performance tuning done for people who just don’t have the time or knowledge to go through the full process of gathering metrics, evaluating poor performers, understanding root causes and applying indexes to fix those causes. I also readily acknowledge that it actually is an amazing piece of software. If you don’t agree with that, go read this white paper. With those acknowledgements in place, I still think there are two distinct problems with the approach and a serious fundamental issue with the premise. Let’s talk about the problems of the approach first.

Many, probably even most, of your query performance problems are in your T-SQL. Some of your query performance problems are in your table structure. Some of your query performance problems are in your indexes and statistics, but not a majority. The DTA can suggest indexes, statistics, materialized views, and partitioning (partitioning for performance I might add, not data management, and that opens up a completely different can of worms we can’t address here, I don’t have time). In short, it can’t address your number one problem, your code. Since it can’t address the number one problem, all it can do is work around the edges. You’ll run this thing, think you’ve addressed your issues, but your issues are still there and now you’re just as stuck as before you found the DTA and ran it. It just doesn’t fix the core issue, your code.

The second problem I see with it is that it doesn’t have enough intelligence built into its functionality. I say that with some reluctance because it really is a pretty amazing piece of functional code. But it suffers from a couple of shortcomings there. It’s completely dependent on the load provided to it. If that load is off, its recommendations are off, because it just can’t have the intelligence to recognize that a poorly representative load has been provided. This lack of intelligence is supposed to be offset by the person running the DTA, who is supposed to ensure that they are gathering the right information and that they can interpret and test the resultant recommendations. Which brings us to the fundamental issue with the premise.

This is supposed to be run by people with no internals knowledge. Right? BUT! These people are also supposed to make a judgment about whether or not the recommendations should be accepted. Further, they should test all the recommendations prior to applying them to their production server. They also must gather a well-structured and meaningful representative load from their production system in order to supply the DTA with good information. Further, the very sophisticated set of tests around the DTA actually makes a fundamental assumption that could be radically off: that the person designing the database has done a good and thorough job of correctly normalizing the structures. You agree with all these assumptions on the part of the DTA? Am I alone in thinking that we have a problem here? If people lack any understanding of the internals, they won’t be able to judge those recommendations. If people don’t have the ability to gather and interpret performance metrics, they won’t be able to test the suggestions of the DTA or provide it with the necessary test load. In short, the DTA can’t be relied on to solve the problem it’s supposed to solve because of the root cause of that problem: people’s lack of knowledge and understanding.

I won’t even get into finding indexes with names like this:

[_dta_index_TD_13_15_1334295813__K44_K1_K8_K57_K17_K60_K16_2_3_4_5_6_7_9_10_11_12_13_14_15_18_19_20_21_22_23_24_25_26_27_28_29_]

Instead of evaluating the suggestions made by the DTA and applying just those that make sense and will have a positive impact, people assume that every single suggestion from the tool is Gospel. They apply them all, without thinking, without knowledge, without appreciation of the possibility, sometimes even the likelihood, of serious negative impact.

I recognize that many people are stuck. They have a SQL Server instance that’s causing them pain and they don’t have the knowledge necessary to fix it. Further, I know a few of you have used this tool successfully in some situations. I agree that there ought to be some way to mechanically and automatically tune the server. However, the DTA is not that tool, despite its clear and obvious sophistication. Let’s get rid of it.

Want to learn how to tune queries? I’m putting on an all day seminar at Connections in September. Click here right now to register. We won’t use the DTA.

May 19 2015

Book Review: Connected

I heard about Connected from a show on NPR (Yes, I listen to NPR, why do people keep saying that?). It was right after another segment talking about how positivity affects your mood and your ability to think and act in a clear fashion. I’ve long been a believer in the ability of your network to impact you, but I really didn’t think about it beyond that. Hearing about the book Connected changed my thinking, so I ran out (meaning, connected to Amazon) and got a copy.

The premise of the book is pretty simple. You have close friends and acquaintances. Your close friends and acquaintances also have friends and acquaintances, who may or may not overlap with yours. Those people also have another set of friends and acquaintances. And here’s the kicker: that third layer, not your friend, or your friend’s friend, but your friend’s friend’s friend, can affect your daily mood, the amount of exercise you do, whether or not you smoke, your involvement in crime, all sorts of things. The book sets out to prove it. Along the way you also learn about things like why you probably only have somewhere between 3-8 close friends. Why you probably don’t have more than about 100 people that you communicate with regularly (uh, but what about my 7,000+ Twitter followers?). How these are, to a degree, biological factors hardwired into you. Most interesting of all is how the ripples just fade away at the third layer, over and over again throughout their studies and their testing.

The book was just filled with highly interesting facts about how your network influences you, and how you can influence your network. It also matters what type of network you have. Are you connected to lots of people who aren’t connected to each other (weak ties), or are you connected to lots of people who are all connected to one another (strong ties)? Each of these types of networks influences you differently. Your behavior within a network probably follows one of three paths: cooperator, you’re willing to help others; free rider, you’re letting others do the heavy lifting; or enforcer, you’re making sure everyone follows the rules. Your behavior is also likely to shift between those roles depending on who you’re interacting with and when.

In short, a fascinating book. I do have a nit to pick with it, though. At the end of it all, I have a great set of information about what a strong network would look like. I get a good sense of why I would want to have a strong network. But there’s nothing about how to actually build a strong network, other than making sure my friends are connected with each other and that my friends, and as much as possible their friends and their friends, are all on a positive path. Right. I’m sure that’s easy to work out. Guidance around this network thing would have been nice.

My own takeaway, be positive, act positive, strive, earn my Trident every day, and, at least according to Connected, that should go out into my network like ripples in a pond. Further, I should see those same ripples coming back, reinforcing my own approaches.

I have no idea how to measure this. Ha!