The company I work for has a very well defined need for versioned data. In a lot of instances, we don’t do updates, we do inserts. That means that you have to have mechanisms for storing the data that enables you to pull out the latest version of all the data or a particular version of all the data, or the data at a particular moment in time, regardless of version.
That means maintaining a version table and a series of inserts into various tables. Some tables will have pretty much a new row for each version, some tables may only have one or two versions out of a chain. With the help of a very smart Microsoft consultant, Bill Sulcius, we have a mechanism that works very well. However, questions about the ultimate tuning of the procedures remain. So we may have a query that looks like this:
FROM dbo.Document d
INNER JOIN dbo.Version v
ON d.DocumentId = v.DocumentId
AND v.VersionId = (SELECT TOP(1) v2.VersionId
FROM dbo.Version v2
WHERE v2.DocumentId = v.DocumentId
ORDER BY v2.DocumentId DESC, v2.VersionId DESC)
There’s a clustered index on the Version table that has DocumentId & VersionId in it. This query works great.
But you can also write the same query to get the same results using MAX or ROW_NUMBER(). What’s more, those work well too, all nice clean clustered index seeks. You can also use CROSS APPLY rather than JOINS. All these appear to work well, but in different circumstances, some work better than others. That makes establishing a nice clean “if it looks like this, do this” pattern for developers to emulate difficult. I’m creating a series a tests to outline as much of the differences as I can. I’ll write it up and submit it all to Chuck over at SQL Server Standard first. If he doesn’t like it, it’s Steve’s. I’ll also post a few tid bits here.