Which SELECT * Is Better?

The short answer is, of course, none of them, but testing is the only way to be sure.

I was asked, what happens when you run ‘SELECT *’ against a clustered index, a non-clustered index, and a columnstore index. The answer is somewhat dependent on whether or not you have a WHERE clause and whether or not the indexes are selective (well, the clustered & non-clustered indexes, columnstore is a little different).

Let’s start with the simplest:

SELECT    *
  FROM    Production.ProductListPriceHistory AS plph;

This query results in a clustered index scan and 5 logical reads. To do the same thing with a non-clustered index… well, we’ll have to cheat and it’ll look silly, but let’s be fair. Here’s my new index:

CREATE NONCLUSTERED INDEX TestIndex
  ON Production.ProductListPriceHistory
(ProductID,StartDate,EndDate,ListPrice,ModifiedDate);

When I rerun the query it results in an index scan, non-clustered, with 5 logical reads. Granted, this is stupid. Instead, let’s do this. We’ll create a meaningful non-clustered index and then force it’s use:

CREATE NONCLUSTERED INDEX TestIndex
  ON Production.ProductListPriceHistory
(ListPrice);

Then run this:

SELECT    *
  FROM    Production.ProductListPriceHistory AS plph
WITH (INDEX(TestIndex));

Now that’s using a non-clustered index in a ‘SELECT *’ situation. It results in 794 logical reads and this execution plan:

image

Clearly, this is not an improvement. Finally, let’s get rid of the non-clustered index and put this columnstore index in place:

CREATE NONCLUSTERED columnstore INDEX xtest
  ON production.ProductListPriceHistory
(productid,startdate,enddate,listprice,modifieddate);
GO

Now when I run the ‘SELECT *’ query I have 37 reads and this execution plan:

image

This means the columnstore index is being used, but, if you look at the properties, you’ll see that the execution mode on this one is Rows, which is not the preferred use you’ll want out of a columnstore index. You want to see the execution mode be Batch.

If we stopped here, the answer is simple, a clustered index scan is better. But what if we add filtering? Let’s modify the query to look like this:

SELECT    *
  FROM    Production.ProductListPriceHistory AS plph
WHERE    plph.ListPrice = 23.5481;

Now, I’ll go back and set up the table so that it has just the clustered index, a non-clustered index, or the columnstore index. The results for the clustered index are identical. Since the column, ListPrice, is not part of the clustered key, a scan is necessary and the results are 5 reads and an execution time of about 1ms (the data is cached). The non-clustered index resulted in the same execution plan as before, but only 8 reads. But, the execution time was 42ms, so the added processing of getting the data put together from the key lookup was a little costly. Finally, the columnstore index results in 42 reads and an execution time of 4ms. The execution mode of the columnstore index was still Row.

What’s all this mean? Not much since you shouldn’t be using ‘SELECT *’ anyway, but the main takeaway I’d suggest is that columnstore indexes are not magic. They don’t replace traditional indexing. Further, if you’re going to use them, be sure that you’re really using them correctly. Just because you see the columnstore operator in the execution plan doesn’t mean you’re taking advantage of all the fantastic benefits they offer. Dive down into the properties and check the execution mode to ensure you’re getting a Batch execution. Then you’ll know that you’re benefiting from the columnstore index.

One thought on “Which SELECT * Is Better?

Please let me know what you think about this article or any questions:

This site uses Akismet to reduce spam. Learn how your comment data is processed.