Why Don’t People Use Columnstore Indexes?

Professional Development, SQL Server
I saw this question on SQL Server Central the other day and had an immediate, visceral reaction. I know why. Now, before I explain my answer, please, let me reassure you. I get it. You're busy. What I'm about to suggest is not meant as a direct critique of you. It's just an observation of the human condition. Heck, maybe I'm wrong. So, before you write the angry screed about how busy you are and why you can't possibly do what I'm about to suggest, believe me, I already understand. I'm still going to suggest something that's going to make some of you angry. Common Knowledge If you'll permit me, I want to talk about Extended Events before we talk about Columnstore. SIDE NOTE: Standing invitation, any time I'm at…
Read More

Adaptive Joins

T-SQL
I was surprised to find out that a lot people hadn't heard about the new join type, Adaptive join. So, I figured I could do a quick overview. Adaptive Join Behavior Currently the adaptive join only works with columnstore indexes, but according to Microsoft, at some point, they will also work with rowstore. The concept is simple. For larger data sets, frequently (but not always, let's not try to cover every possible caveat, it depends, right), a hash join is much faster than a loops join. For smaller data sets, frequently, a loops join is faster. Wouldn't it be nice if we could change the join type, on the fly, so that the most effective join was used depending on the data in the query. Ta-da, enter the adaptive join.…
Read More

Parallelism and Columnstore Indexes

T-SQL
Columnstore indexes are fascinating and really cool. Unfortunately, they're adding an interesting new wrinkle to an old problem. What's the Cost Threshold for Parallelism set to on your server? If you just said "The whatsis of whositz?" then the value is 5. The cost threshold is the point at which the estimated cost of an execution plan goes from definitely serial to possibly parallel. This default was set for SQL Server 2000 and hasn't been changed since. I've long argued, loudly, that it's too low. I've suggested changing it to a much higher value. My advice has gone from 35 to 50 and several places in between. You could just look at the median or the mode of costs on your system and use the higher of those values as…
Read More

Is Dynamic T-SQL a Good Design Pattern?

T-SQL
In a recent discussion it was suggested to me that not only is dynamic T-SQL useful for things like catch-all queries or some really hard to solve problems involving variable table lists, but is, in fact, a perfectly acceptable design pattern for all queries against a database. Note, in this case, we’re not talking about an ORM tool which takes control of the system through parameterized queries, but rather an intentional choice to build nothing but dynamic T-SQL directly on the system. To me, this was immediately problematic. I absolutely agree, you’re going to have dynamic T-SQL for some of those odd-ball catch-all search queries. But to simply expand that out to include all your queries is nuts. There really is a reason that stored procedures exist, and it’s not…
Read More

Which SELECT * Is Better?

SQL Server, T-SQL
The short answer is, of course, none of them, but testing is the only way to be sure. I was asked, what happens when you run ‘SELECT *’ against a clustered index, a non-clustered index, and a columnstore index. The answer is somewhat dependent on whether or not you have a WHERE clause and whether or not the indexes are selective (well, the clustered & non-clustered indexes, columnstore is a little different). Let’s start with the simplest: SELECT    * FROM    Production.ProductListPriceHistory AS plph; This query results in a clustered index scan and 5 logical reads. To do the same thing with a non-clustered index… well, we’ll have to cheat and it’ll look silly, but let’s be fair. Here’s my new index: CREATE NONCLUSTERED INDEX TestIndex ON Production.ProductListPriceHistory (ProductID,StartDate,EndDate,ListPrice,ModifiedDate); When I…
Read More