Understand the True Source of Problems

There’s an old joke that goes, “Doctor, doctor, it hurts when I do this.” While the person in question swings their arm over their head. The doctor’s response is, “Don’t do that.” Problem solved, right? Well, maybe not. Let’s take a quick example from life. I do crossfit (yeah, I’m one of those, pull up a chair I’ll tell you all about my clean & jerk progress… kidding). I’ve been experiencing pain in my shoulder. “It hurts when I do this.” But, I’m not going to stop. I’ve been working with my coach to identify where the pain is and what stretches and warm-ups I can do to get around it (assuming it’s not a real injury, and it isn’t). In short, we’re identifying the root cause and addressing the issue rather than just coming up with a “don’t do that” style solution.

As is usual with one of my rambling intros, I’m going to tie this back to our work as data professionals, hang on.

I recently ran into a set of requirements that a new DBA was told are “best practices” and that he had to comply with. I’m editing them in order to save anyone embarrassment (although, if you wrote these, you should be embarrassed). I think that each one represents a moment of “it hurts” followed by “don’t do that” which, as you’re going to see, is absolutely the wrong response. As crazy as they all are, it’s not the first time I’ve seen them. This information is clearly coming from some fever-filled internet-swamp where stupid mutates and grows. Let’s nip these in the bud.

1 – All queries against the production database must be executed by using RDP to connect to the production system. This is because connecting through SSMS will cause the server to crash

I couldn’t even begin to guess what happened that caused a system outage because of a query from SSMS, but I suspect it was some sort of crazy, extreme outlier of a query that probably didn’t run much better while RDPed into the server, but somehow avoided a network issue (or six). Who knows. Suffice to say, no. This is crazy. On multiple levels. The most important being, you’re giving people access to the operating system on production that really probably shouldn’t have it. And, you can only have one person connecting to production at any one time. Two teams ready for a deployment? Tough. Oh, and your production system, in addition to all the work it’s doing for SQL Server, it now has to handle all the work of SSMS and displaying the results of your queries. Was your server under stress when you started? It’s worse now. I really don’t know of anyone, including a very large number of remote DBAs, who don’t connect to SQL Server through SSMS running on their desktop.

2 – NOLOCK is required on every query. It makes things run faster.

The magic Turbo Button of SQL Server rears its ugly head, yet again. I suspect that if you went and looked at all the responses to questions on forums, blog posts, articles and presentations, there are more places defining exactly why this is bad than almost any topic except backups. One more time. Using NOLOCK on all your queries…  and if you’re putting it everywhere anyway, why not just use READ_UNCOMMITTED on your connections, it’s easier… Anyway, Using NOLOCK on your queries results in dirty reads. Yes, everyone says, “Oh, that means if someone is change ‘dog’ to ‘cat’ I could see either ‘dog’ or ‘cat’. I don’t care.” Right. That’s true. It also means when pages split and rearrange, you may miss rows or have rows duplicated. All the businesses I know love it when data is missing from the queries they’re running. I’m sure you wouldn’t mind if your bank used NOLOCK and told you that you didn’t have any money in it, right?

3 - Transactions take the server offline. Never use them in queries.

Again, I’m having a hard time imagining what exactly occurred to make this seem like a good idea. You do know that every query has an implicit transaction? And your server is still online. Can you over use transactions, your transactions can be overly large, your transactions can run too long, or you can have too many transactions. But no, just using a transaction will not bring the server down.

4 – Do not JOIN tables because it leads to performance problems.

I’m actually used to a slight variation on this, “Never join more than three tables.” This is just a holdover from the old days when several things were going on. One, the old query optimizers just sucked, so choices on loops or merge or hash joins were really bad. We also didn’t really know how to index our tables properly back in the day (well, I didn’t), so it hurt performance a lot to join between tables. Our code was horrendous back in the day too, so it didn’t help. But, since at least SQL Server 2000, the optimizer is good, performance is fine. I saw an 86 table join (not one I wrote) run in under 200ms on SQL Server 2000 on ancient processors and small memory. Talk about completely busting that myth. This rule is crazy. Seriously crazy. You’re using a RELATIONAL storage engine and then saying that you can’t actually use relationships. If you don’t have or need relational data, certainly a very valid option, use a non-relational data storage engine. But if you have relational data and you’re using a relational engine, I strongly, very strongly, recommend you use the all the relational tools. That includes JOINs but also includes, primary keys, foreign keys, enforced referential integrity, and all the rest.

5 – Don’t use SSMS’s ‘Edit Top 200 Rows’ because it leaves locks on the table.

One I actually agree with. But not because of what they’re saying. I suspect someone must have found a situation where the UPDATE process from this window held a lock. But I don’t think this is a good way to work because I don’t like the Top 200 rows approach because, well, here’s a query from this menu choice in SSMS:

SELECT TOP (200) BusinessEntityID, CreditCardID, ModifiedDate FROM Sales.PersonCreditCard

Spot any issues? Apart from the missing semi-colon and missing column owners? Yeah a top without an ORDER BY. Which 200 rows are we getting? Yes. If you want to edit data, use an UPDATE command or a proper interface for the data, not this.

6 - Stored procedures should never be used. Prepared statements should never be used.

Plan cache should never be used. Oh, I thought that was the next statement. Look, we don’t have to get into the stored procedures are a blessing from Odin vs. stored procedures are a frost giant trick argument. We can say that using parameterized queries (stored procedures or prepared statements) leads to the reuse of plans for cache. Whereas, writing nothing but ad hoc queries results in massive amounts of queries piled into the plan cache, running through the full optimization process, chewing up your memory, your CPU time, and then you never use them ever again. Can you think of a more efficient mechanism for hurting your performance than taking away these tools? Me neither. Parameterize your queries, somehow. Every ORM tool I’ve worked with or read about can do this (and they can all use stored procedures too, just saying). There is no good excuse for not using parameterized queries for most of your code. And yes, there are places where ad hoc or dynamic T-SQL makes sense. But I’d argue that they’re the exception, not the rule.

Please, take the time to understand what went wrong when you hit a problem. Don’t just shy away from the thing associated with it. It’s like the Dark Ages when they executed inanimate objects after an accident. Surely we’re a little more sophisticated than that. Identify the problem, understand it, then apply appropriate solutions.

If you want to talk more about performance tuning, and you’re attending SQL Rally Nordic 2015 in Copenhagen, I’m doing a full day pre-conference seminar. Click here now to register.

17 thoughts on “Understand the True Source of Problems

  • Jeff Moden

    I remember the original post where you got this from, Grant. I was absolutely amazed at the total ignorance displayed by the people that the OP said announced the ridiculous Draconian standards. Sounds like one of those shops where the only import people are front-enders and they all think that SQL Server is just a place to store data. Not the first time we’ll run into such people nor the last but stand in total disbelief that anyone would think that such standards will actually be effective. What were they thinking?

  • It was a stunner.

    I don’t know that they were thinking, or rather, they were, but coming from a position of complete ignorance. Once you start making decisions with no, or inadequate, knowledge your decisions are bound to be flawed.

    Heck, I keep having fights with Microsoft tech people about their recommendations for Sharepoint SQL Server databases regarding MAXDOP. It’s crazy what gets into peoples heads. Sometimes you can’t educate your way out of it.

    I really feel bad for the starting, but smart, DBAs, or the mid-range DBAs who won’t have the knowledge and the intestinal fortitude to override this kind of madness. That’s why I took that nutty stuff, edited it to hopefully mask the source, and posted it here. We have to fight this nonsense with every possible weapon (including pork chops).

  • herbert_tobisch

    halfway agree with #4 edit top 200 rows.
    it’s not a screamer, but if there are more than 200 rows in a table (would not use the feature on large tables anyway) you may tailor the underlying query to display any set of rows in any order you want. Always use it that way.

  • “Oh, and your production system, in addition to all the work it’s doing for SQL Server, it now has to handle all the work of SSMS and displaying the results of your queries.”

    BY SSMS I think you mean RDP right?

  • Sam Shollenberger

    Aaron M.,

    I think what he’s saying is that instead of all the overhead of SSMS running on DBAs desktops, it’s now sucking more resources from your production server. That’s assuming that once you remote in you fire up a session of SSMS on the Prod server to do your queries…

  • Kirk

    As for the stored procedure argument, most of what I have run into there are developers who either think stored procs are the only way the plan cache gets used, vs those who don’t really have no business querying data and are afraid of them. And you REALLY should see the tables these later people might create.

    But while on the subject of dated material perhaps one might address these:
    1) The plan cache can’t work with ad-hoc queries. Reality is that optimizer will attempt to change ad-hoc into parameterized query as of 2005.
    2) Always use joins over sub-queries. Which ignores the optimizer’s ability to re-write your query into a join many times.

  • Hello Kirk,

    Thanks for the feedback.

    1) It will through simple parameterization. But the queries it will work on are extremely simplistic, probably not the kind most people are generating. You can enhance this with forced parameterization, which is probably a good idea on most systems. But that also is fairly limited in what it can parameterize for you. That’s not to say you shouldn’t use it, but good coding practices are a better solution.

    2) Oh yeah, I wouldn’t support advocating for JOIN over a sub-query by default. Depending on the situation, each has its place. Further, with CTE, you can get, at least a little, code reuse within T-SQL in those situations where that makes sense.

  • Hi Grant;

    Nice hall-of-shame list. It seems there’s always some set of practices that aren’t “best” and general cargo-cultism that needs to be dealt with at each new engagement.

    Minor nit – in point #4, you seem to share the common misconception that the Relational Model of Data is so named because of the relationships between tables. It’s actually named for the mathematical concept of a relation, which is the basis for the structural aspect of the data model (i.e., “relation” maps to SQL’s “table”, not to joins/FKs).

  • Elizabeth

    Hey, Grant, thanks for the chuckle! Your article made me laugh out loud! I’m glad I didn’t think any of these were true before I read the article!

  • Keith Corser

    Grant you can add this quote to your list as well. “No triggers are allowed on any SQL server because they will crash the box”.

Please let me know what you think about this article or any questions:

This site uses Akismet to reduce spam. Learn how your comment data is processed.