The question came up that someone wanted to monitor the queries impacting tempdb using Extended Events. Initially, I was pretty sure that there was no real way to do this. The standard query events like rpc_completed includes spill information, if any occurred, but that’s it. There are also no actions that you can add to an event that would capture tempdb behaviors. So, I want to my senior DBA, YaDuBiGle.
I found this ancient (2009) article on SQLTips, that surprised me with a pretty slick solution using, are you ready, waits.
How well does it work? Let’s try it out.
Testing Capturing Queries Impacting TempDB
I’m not stealing their code, so you can follow the link to see the setup. I’m much more interested in the results. How well does it do on showing queries that are impacting tempdb, regardless of how they’re hitting tempdb (sorts, hashes, spills, temp tables, table variables, etc.)? So, I will share the queries I ran to see how it works.
I started with a simple query that won’t go to tempdb:
SELECT AddressTypeID, Name, rowguid, ModifiedDate FROM Person.AddressType;
Sure enough. No activity. So, let’s go straight to creating a temporary table & see what happens:
CREATE TABLE #Sales ( SalesOrderNumber NVARCHAR(25), RevisionNumber TINYINT, OrderDate DATETIME, OrderQty SMALLINT, LineTotal NUMERIC(38, 6) ); INSERT INTO #Sales ( SalesOrderNumber, RevisionNumber, OrderDate, OrderQty, LineTotal ) SELECT soh.SalesOrderNumber, soh.RevisionNumber, soh.OrderDate, sod.OrderQty, sod.LineTotal FROM Sales.SalesOrderHeader AS soh JOIN Sales.SalesOrderDetail AS sod ON sod.SalesOrderID = soh.SalesOrderID; SELECT SalesOrderNumber, RevisionNumber, OrderDate, OrderQty, LineTotal FROM #Sales; DROP TABLE #Sales;
Running this, I got nothing. Zip. Zilch. Nada. Null. And I’m not a fan of null.
The key to understanding this is that not every query has to wait. If the query is processed without ever going into the SUSPENDED state during it’s operation, you just won’t see waits. Therefore, you’re not going to see every query that passes through tempdb.
What you are going to see are the queries that are either putting a real load on tempdb, or, are waiting on those queries that are putting a load on tempdb (let’s be fair, probably both). Executing the above query just 100 times results in instances where it’s waiting (frequently on SOS_SCHEDULER_YIELD). Let’s look at the output:
A few things to point out here. Every wait_info event has a start and an end, which is why there are two on display. You can see the “opcode” value in the one I have highlighted is “End”. Other than that, it’s the stuff we told it to capture.
Overall, I’d say this is a success, with caveats. How could we improve it?
Upgrading the Session
Now that we know what we have and how it works, there are a few things we can do to help improve functionality. First up, I’d suggest we add and additional filter on the opcode. I lean towards only capturing the ones that are “End” and not the ones that are “Begin”. The reason I say this is because I’m interested most in the duration value of the waits. It’s going to give me valuable information. However, you could argue that it means you lose activity where the connection was dropped or timed out. In that case, capturing “Begin” would be better. You could argue for both I suppose, but this quickly becomes quite a lot of data. So, I’m going to add that filter.
The next thing I’d do is add another filter. This time on duration (which means I need to capture both or only the “End” values). The reason I say that is, sure, waits are waits, but really, something that waited a millisecond, I’m not that concerned with. Granted they can add up over time and become interesting. For our purposes though, we care about the queries impacting tempdb that are really hurting it, ones that run longer, do more, have more data. So yeah, I’d make the session look more like this:
CREATE EVENT SESSION [QueriesImpactingTempdb] ON SERVER ADD EVENT sqlos.wait_info (ACTION ( package0.callstack, sqlserver.database_id, sqlserver.sql_text ) WHERE ( [sqlserver].[database_id] = (2) AND [duration] > (500) AND [opcode] = 'End' ) ) ADD TARGET package0.event_file (SET filename = N'QueriesImpactingTempdb'); GO
I picked 500ms as the threshold, but obviously, there are clear options here.
In conclusion, while using the wait_info event along with the sql_text action gets us where we want, it’s not a perfect solution. I did some testing with other events as well. The wait_completed event works much the same, but doesn’t have a Begin & End, only the completion. Still, it’s another option. Overall, that article is a great addition to monitoring SQL Server. Kudos to the author. Now you have another tool in your toolbox that will let you see queries impacting tempdb.