SQL University–Recommendations for a Clustered Index

04 Apr 2011 by Grant Fritchey

Welcome, SQL University students, to another extension class here at Miskatonic University, home to the Fighting Cephalopods (GO PODS!). Never mind the stains on the floor, or the walls… or those really nasty ones on the ceiling. There was a… oh what did the dean call it… an incident last week when one of the students had a little accident after reading De Vermis Mysteriis one too many times. But we’re not here to talk about arcane tomes and unspeakable horrors today. No, today we’re here to talk about clustered indexes.

SQL Server storage is really built around the idea of clustered indexes. Don’t believe me? Let’s list a few places that require a clustered index:

Partitioning
A table in SQL Azure
Creating XML indexes

What about the fact that the default primary key is clustered? Think that was by accident? How about the fact that when you create a clustered index, it becomes the data? Isn’t it interesting that you create a materialized view, an indexed view, by creating a clustered index? Do you think that the fact that all non-clustered index key values point back to the clustered index is significant?

In short, picking a clustered index is an extremely important undertaking. But, most of the time, people leave the cluster on the default primary key or, worst of all, they remove it entirely to “help” performance. I’m going to quickly address each of these choices.

Primary Key

I’m mostly a “defaults” kind of guy and don’t generally mess with the systems when I set them up. So, since the default for clustered indexes is the primary key, that should be where the overwhelming majority of clustered indexes are left in your database design, right? Well, let’s consider this query:

SELECT  sod.UnitPrice,
        sod.OrderQty
FROM    Sales.SalesOrderDetail AS sod
WHERE   sod.ProductID = 927;

And let’s assume, just for this discussion, that this query is called constantly and that there are other queries, similar to that one, also called all day long. In short, the most common access path into your data is through the ProductID column. Let’s take a quick look at the execution plan for that query:

Ah, a key lookup. This is the most common query on the system, and we’re paying the cost of a key lookup operation each and every time. Ask yourself, could this column support an index? Since it already has one, yes, it’s selective enough to support an index. Could we change the non-clustered index to make it covering? Maybe, but why do that when we could just move the cluster and achieve good results, since this is the most frequently used access path to the data?
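If you aren’t sure which access path really is the busiest, the index usage DMV can give a rough picture. The following is only a sketch, assuming you run it in the AdventureWorks database and that the instance has been up long enough to have gathered representative numbers (the counters reset when SQL Server restarts). A high user_lookups count on the clustered index row, next to a high user_seeks count on a non-clustered index, is the pattern described above.

```sql
-- Sketch: how often each index on Sales.SalesOrderDetail has been used
-- since the instance last started. Run in the database that owns the table.
SELECT  i.name AS IndexName,
        i.type_desc AS IndexType,
        ius.user_seeks,
        ius.user_scans,
        ius.user_lookups,   -- lookups into the clustered index (or heap)
        ius.user_updates
FROM    sys.indexes AS i
        LEFT JOIN sys.dm_db_index_usage_stats AS ius
            ON ius.object_id = i.object_id
               AND ius.index_id = i.index_id
               AND ius.database_id = DB_ID()
WHERE   i.object_id = OBJECT_ID(N'Sales.SalesOrderDetail')
ORDER BY i.index_id;
```

Indexes that have not been touched since the last restart show NULL in the usage columns because of the LEFT JOIN.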

Just remember that if a clustered index is not unique, SQL Server will add a value, called a uniquifier, to make it so. This could be another consideration when determining where and what to cluster.
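One way to sidestep the uniquifier is to append a column that makes the clustering key unique on its own. The sketch below is an illustration, not a recommendation: the table name dbo.SalesOrderDetailCopy and the index name are hypothetical, and it assumes SalesOrderDetailID is unique across the table, as it is in the standard AdventureWorks sample.

```sql
-- Hypothetical copy of the table, used only for this illustration.
SELECT  *
INTO    dbo.SalesOrderDetailCopy
FROM    Sales.SalesOrderDetail AS sod;

-- ProductID repeats, so SalesOrderDetailID is appended to make the key unique.
-- Declaring the index UNIQUE means SQL Server has no need to add a uniquifier.
CREATE UNIQUE CLUSTERED INDEX ixClusterUnique
ON dbo.SalesOrderDetailCopy (ProductID, SalesOrderDetailID);
```

The trade-off is a wider clustering key, and since every non-clustered index carries the clustering key, that extra width is paid in each of them too.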

Heaps

Storage is storage, right? Wrong! How things are stored matters. When a table doesn’t have a clustered index, it’s called a heap. Not a pretty name, is it? Darn right it’s not. That’s because what you’re doing is effectively piling your data into a, well, a heap. It’s not stored in any particular manner, so retrieval is certainly less than optimal. How much less? Ah, there’s the question. Let’s create two copies of a table:

SELECT *
INTO HeapTable
FROM Sales.SalesOrderDetail AS sod;

SELECT *
INTO ClusterTable
FROM Sales.SalesOrderDetail AS sod;

We could just compare the heap against the clustered index, but you know that won’t work well. Instead, let’s compare an index on the heap against the cluster. We’ll reuse the query from above as our test. On each table I’m going to create an index:

CREATE INDEX ixHeap ON HeapTable (ProductID) ;

CREATE CLUSTERED INDEX ixCluster ON ClusterTable (ProductID) ;

You should already have some idea of how this will work out, but, just in case, here are the execution plans:

The queries both had a single scan, but the first query had 11 reads to get the data and the second had 3 reads to get the data. Again, you could make the other index covering:

CREATE INDEX ixHeap2 ON HeapTable (ProductID)
INCLUDE (UnitPrice,OrderQty);

Then when the query is run, you get 2 reads instead of 3. But here’s a question. What happens when it comes time to defrag storage? Oh yeah, you can’t do that with a heap. So while you might be able to put a few indexes on to get good performance, what happens when the next query looks like this:

SELECT  sod.UnitPrice,
        sod.OrderQty,
        sod.LineTotal
FROM    dbo.ClusterTable AS sod
WHERE   sod.ProductID = 927;

Right, the cluster still only has 3 reads, but now you’re back to 11, or more, on the heap.
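If you want to see read counts like these for yourself, one option is SET STATISTICS IO. This is a minimal sketch, assuming HeapTable and ClusterTable were created in the dbo schema with the indexes shown above. The numbers you get may not match the ones quoted here, since they depend on your copy of AdventureWorks and your version of SQL Server.

```sql
-- Report logical reads per table for each statement in this session.
SET STATISTICS IO ON;

-- Non-clustered index on the heap.
SELECT  sod.UnitPrice,
        sod.OrderQty
FROM    dbo.HeapTable AS sod
WHERE   sod.ProductID = 927;

-- Clustered index on the same column.
SELECT  sod.UnitPrice,
        sod.OrderQty
FROM    dbo.ClusterTable AS sod
WHERE   sod.ProductID = 927;

SET STATISTICS IO OFF;
```

In Management Studio the output shows up on the Messages tab. The “logical reads” value for each table is the figure to compare.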

Conclusion

The idea behind this post is simply to get you to think about your clustered index. There are endless debates about the exact, most perfect, method of using your cluster. My take is rather simple. You’re better off having one than not. Since you have to have one, make sure the system is using it, so place it on the most frequently used access path that is also selective enough to support an index. I’ll leave it to the experts to debate the finer points.

See you for the next class in a couple of days, assuming no disruption in the space/time continuum or elder gods ripping apart the planet.

6 thoughts on “SQL University–Recommendations for a Clustered Index”

K. Brian Kelley

Love the Lovecraft reference. 🙂

April 5, 2011 at 10:19 am
Grant Fritchey

Thanks! Had to come up with a “scary” hook, hence Miskatonic University.

April 5, 2011 at 11:08 am
stephanie b

Thank you for the very informative, clear and nice to read SQL University series of blogs.

March 29, 2012 at 4:55 am
