Measuring Query Execution Time: What Is Most Accurate

Probably the single most important factor when deciding which query to tune, or actively tuning a query, is how you go about measuring query execution time. SQL Server provides a number of different mechanisms (really, maybe too many) to get this done. However, all measures are not created equally. In fact, they frequently disagree with one another. Let’s take a look at this odd phenomenon.

Measuring Query Execution Time

Before we get into all the choices and compare them, let’s baseline on methodology and a query to use.

Not sure why, but many people give me blow back when I say “on average, this query runs in X amount of time.” The feedback goes “You can’t say that. What if it was just blocking or resources or…” I get it. Run a query one time, change something, run that query again, declare the problem solved, is not what I’m suggesting. Notice the key word and trick phrase “on average.” I don’t run the query once. I run it several times, capture them all, then get the average of the durations.

TLDR:

  1. Execute the query a bunch of times (we’ll go with 50)
  2. Average the execution time measured

Now, some of the ways you can measure a query are NOT conducive to this concept. For example, one way to capture execution time is within an actual execution plan. However, capturing 50 plans and then querying them to get a perfect measure… royal pain. I will have to make a few compromises along the way. However, the overall plan remains the same for all measures.

For the query, I don’t want something that runs either too long or too quickly, but is just right.

Using AdventureWorks I have a simple stored procedure and parameter value that runs in approximately 70ms or so, returning 194 rows. Very simple and quick, but not too simple or quick, just right:

Finally, I’m focused here on only measuring query execution time. Don’t fall into the trap that this is the only measure that counts. You should also be measuring wait statistics, I/O, compiles, recompiles, network time, etc.. I just need to focus on a single metric to keep this blog post size to something manageable.

Then, the question is, how are we going to measure. Here’s my list. I hope I didn’t leave anything off:

  • Include Client Statistics (only allows for 10 measures)
  • Connection Properties
  • SET STATISTICS TIME
  • QueryTimeStats in an actual execution plan
  • QueryStore
  • Dynamic Management Views
  • Trace Events (batch completed)
  • Extended Events (batch completed)

I’m not going to clean the buffers or force a recompile, or any of the other things that you might do as part of taking these measures. I will reset the system between tests so that I can be sure of a clean measure. For example, all methods for measuring query execution time will be turned off and I’ll clear out the query cache before I measure the DMVs, just so it’s purely the DMVs being measured. I won’t capture execution plans while doing other measures, etc. You get the idea.

Also, for both Trace Events and Extended Events, there are various metrics I could go for to get more granular. However, to get to a like/like overall, I decided to stick with straight forward measures that can be easily compared.

The Results

The first two results are from the two measures that involve the client measuring, not the server. You can see that they are radically different from all the other measures. I’ve left them out of the average of the averages at the bottom:

Method Average (ms) StDev Min Max
Include Client Statistics 6000 7745.96 0 15000
Connection Properties 211.96 98.17 62 390
Actual Execution Plans 71.9 62.9 20 214
SET STATISTICS TIME 75.14 72.58 31 249
DMV 72.24 * 27 *
Query Store 69.58 67.79 26 255
Trace Events 74.32 74.08 27 236
Extended Events 71.94 83.92 27 294
Average 72.52 72.254 26.33333333 249.6
StDev 1.977230386 7.857491966 3.559026084 29.38196726
Min 69.58 62.9 20 214
Max 75.14 83.92 31 294

I’ll admit to being surprised. If you asked me, I’d say that there would be a lot of skew to these results. I would have expected to see real differences between the measures. However, except for the pure client-side measures of Include Client Statistics and the Connection Properties, everything else is roughly in line.

The small variation we see between all the other measures is just noise. None of them are wildly off from the others. What I didn’t add to this equation is any indication of the load incurred by using one of these methods of measuring query execution time. The DMVs are effectively free. The others all incur some degree of cost. Obviously, capturing an Actual Execution Plan is easily the most costly measure up there. However, if you’ve already paid the cost, clearly, you can just use the data within the plan to get an idea of the query execution time. Although, as you see, the min & max are widely variant here, and any given plan could have the min value, the max value, or anything in beween.

I suspect, but don’t know for a fact, that this means that there is a consistent process internally that several/all these methods use when measuring query execution time.

Conclusion

The single biggest question now becomes, which one do you choose. My immediate response is that this is situational.

First, using the Include Client Statistics is an utter waste of time. The measurements are just radically off. If you do want to measure how long the round trip takes and have a degree of certainty, you’ll use the Connection Properties.

Second, what version of SQL Server are you running? Below 2016 or not in Azure? Then Query Store is off the table. Below 2012? Then I wouldn’t recommend using Extended Events.

Third, do you need granular measures, or are aggregates OK? If you can use aggregates, then Query Store and the DMVs are good, but STATISTICS TIME and Actual Execution Plans will be an utter pain in the bottom. If you’re looking at a one off, immediate feedback, then, if you’ve captured an Actual Execution Plan, you can use it. Otherwise, STATISTICS TIME, Trace or Extended Events are the way to go. If you need longer term granular measures, then I’d go with Extended Events (and, if forced, Trace Events).

Determine your needs for measuring query execution time, and, largely, any of the measures will accurately support you.


Want to learn how to use all these different measures (except Include Client Statistics of course)? Then come to my all day seminar on SQL Server Query Performance Tools. I’m hosting sessions in the US and Europe over the next several months:

I’m at SQLSaturday Indianapolis (and my first time speaking in Indiana) on August 10, 2018. Please go here to sign up.

I’ll be at SQLSaturday Sioux Falls (and my first time speaking in South Dakota) on August 18, 2018. Please sign up here.

For SQLSaturday Oslo on August 31, 2018. Click here right now to register.

I’ll be at my local event too, SQLSaturday Boston, on September 21st. You can go here to register.

I’m going to be presenting at SQLSaturday Munich on October 26, 2018. Go here now to join the class.

 

2 thoughts on “Measuring Query Execution Time: What Is Most Accurate

OK, fine, but what do you think?

This site uses Akismet to reduce spam. Learn how your comment data is processed.