system_health Extended Events in Azure SQL Database

The system_health Extended Events session is incredibly useful. Further, it’s running, by default, in every server you have under management that is 2008 or greater. Things are not the same in Azure though.

system_health in Azure SQL Database

If you look at the documentation for system_health, it shows that it’s applicable to Azure SQL Database. However, if you try to run the example query, it won’t work. This is because the implementation of Extended Events inside Azure SQL Database is a little different. Instead, you need to use the Azure SQL Database equivalent system views to create the same query like this:

SELECT CAST(dxdst.target_data AS XML)
FROM sys.dm_xe_database_session_targets AS dxdst
    JOIN sys.dm_xe_database_sessions AS dxds
        ON dxds.address = dxdst.event_session_address
WHERE dxds.name = 'system_health';

Now, running this in Azure, prepare to be disappointed. While the system_health documentation says it applies to Azure SQL Database, there is not a system_health session there.

Can we make one?

Recreating system_health in Azure SQL Database

Referring back to the documentation, the u_tables.sql file in the Install directory of your on-premises SQL Server instance has the script needed for recreating the system_health session.

Of course, we can’t simply run this as is on Azure SQL Database because one of the two targets is to your LOG folder. No such thing in Azure. First thing I did was remove that target. Then, you can’t use the ‘SERVER’ key word in the commands, so I replaced that with ‘DATABASE’. Running the script resulted in an error:

Msg 25744, Level 16, State 1, Line 20
The action ‘package0.callstack’ is not available for Azure SQL Database.

OK. Time for some quick research. So, the call stack is of a lot more interest to Microsoft when debugging than it is to me, so I’m going to strip this action and see what we get next. After a bunch of trial and error, I ended up with this script:

IF EXISTS (   SELECT *
              FROM sys.dm_xe_database_sessions AS dxds
              WHERE dxds.name = 'system_health')
    DROP EVENT SESSION system_health ON DATABASE;
GO

CREATE EVENT SESSION system_health
ON DATABASE
    ADD EVENT sqlserver.error_reported
    (ACTION (sqlserver.session_id,
             sqlserver.database_id,
             sqlserver.sql_text,
             sqlserver.tsql_stack)
     -- Get callstack, SPID, and query for all high severity errors ( above sev 20 )
     WHERE severity >= 20
           -- Get callstack, SPID, and query for OOM errors ( 17803 , 701 , 802 , 8645 , 8651 , 8657 , 8902 ), Hekaton checkpoint/merge errors (41354, 41355, 41367, 41384), Hekaton compilation related errors (41336, 41309, 41312, 41313)
           OR (   ERROR_NUMBER = 17803
                  OR ERROR_NUMBER = 701
                  OR ERROR_NUMBER = 802
                  OR ERROR_NUMBER = 8645
                  OR ERROR_NUMBER = 8651
                  OR ERROR_NUMBER = 8657
                  OR ERROR_NUMBER = 8902
                  OR ERROR_NUMBER = 41354
                  OR ERROR_NUMBER = 41355
                  OR ERROR_NUMBER = 41367
                  OR ERROR_NUMBER = 41384
                  OR ERROR_NUMBER = 41336
                  OR ERROR_NUMBER = 41309
                  OR ERROR_NUMBER = 41312
                  OR ERROR_NUMBER = 41313)),
    --    ADD EVENT sqlserver.xml_deadlock_report,
    ADD EVENT sqlos.wait_info
    (ACTION (sqlserver.session_id,
             sqlserver.sql_text)
     WHERE (   duration > 15000
               AND (   (   wait_type >= N'LATCH_NL' -- Waits for latches and important wait resources (not locks ) that have exceeded 15 seconds. 
                           AND (   (   wait_type >= N'PAGELATCH_NL'
                                       AND wait_type <= N'PAGELATCH_DT') --PAGELATCH_NL;PAGELATCH_KP;PAGELATCH_SH;PAGELATCH_UP;PAGELATCH_EX;PAGELATCH_DT
                                   OR (wait_type <= N'LATCH_DT') --LATCH_NL;LATCH_KP;LATCH_SH;LATCH_UP;LATCH_EX;LATCH_DT
                                   OR (   wait_type >= N'PAGEIOLATCH_NL'
                                          AND wait_type <= N'PAGEIOLATCH_DT') --PAGEIOLATCH_NL;PAGEIOLATCH_KP;PAGEIOLATCH_SH;PAGEIOLATCH_UP;PAGEIOLATCH_EX;PAGEIOLATCH_DT
                                   OR (   wait_type >= N'IO_COMPLETION'
                                          AND wait_type <= N'NETWORK_IO') --IO_COMPLETION;ASYNC_IO_COMPLETION;NETWORK_IO
                                   OR (wait_type = N'RESOURCE_SEMAPHORE')
                                   OR (wait_type = N'SOS_WORKER')
                                   OR (   wait_type >= N'FCB_REPLICA_WRITE'
                                          AND wait_type <= N'WRITELOG') --FCB_REPLICA_WRITE;FCB_REPLICA_READ;WRITELOG
                                   OR (wait_type = N'CMEMTHREAD')
                                   OR (wait_type = N'TRACEWRITE')
                                   OR (wait_type = N'RESOURCE_SEMAPHORE_MUTEX')))
                       OR (   duration > 30000 -- Waits for locks that have exceeded 30 secs.
                              AND wait_type <= N'LCK_M_RX_X' -- all lock waits
                       )))),
    ADD EVENT sqlos.wait_info_external
    (ACTION (sqlserver.session_id,
             sqlserver.sql_text)
     WHERE (   duration > 5000
               AND (   ( -- Login related preemptive waits that have exceeded 5 seconds.
                           (   wait_type >= N'PREEMPTIVE_OS_GENERICOPS'
                               AND wait_type <= N'PREEMPTIVE_OS_ENCRYPTMESSAGE') --PREEMPTIVE_OS_GENERICOPS;PREEMPTIVE_OS_AUTHENTICATIONOPS;PREEMPTIVE_OS_ACCEPTSECURITYCONTEXT;PREEMPTIVE_OS_ACQUIRECREDENTIALSHANDLE;PREEMPTIVE_OS_COMPLETEAUTHTOKEN;PREEMPTIVE_OS_DECRYPTMESSAGE;PREEMPTIVE_OS_DELETESECURITYCONTEXT;PREEMPTIVE_OS_ENCRYPTMESSAGE
                           OR (   wait_type >= N'PREEMPTIVE_OS_INITIALIZESECURITYCONTEXT'
                                  AND wait_type <= N'PREEMPTIVE_OS_QUERYSECURITYCONTEXTTOKEN') --PREEMPTIVE_OS_INITIALIZESECURITYCONTEXT;PREEMPTIVE_OS_LOGONUSER;PREEMPTIVE_OS_QUERYSECURITYCONTEXTTOKEN
                           OR (   wait_type >= N'PREEMPTIVE_OS_AUTHZGETINFORMATIONFROMCONTEXT'
                                  AND wait_type <= N'PREEMPTIVE_OS_REVERTTOSELF') --PREEMPTIVE_OS_AUTHZGETINFORMATIONFROMCONTEXT;PREEMPTIVE_OS_AUTHZINITIALIZECONTEXTFROMSID;PREEMPTIVE_OS_AUTHZINITIALIZERESOURCEMANAGER;PREEMPTIVE_OS_LOOKUPACCOUNTSID;PREEMPTIVE_OS_REVERTTOSELF
                           OR (   wait_type >= N'PREEMPTIVE_OS_CRYPTACQUIRECONTEXT'
                                  AND wait_type <= N'PREEMPTIVE_OS_DEVICEOPS') --PREEMPTIVE_OS_CRYPTACQUIRECONTEXT;PREEMPTIVE_OS_CRYPTIMPORTKEY;PREEMPTIVE_OS_DEVICEOPS
                           OR (   wait_type >= N'PREEMPTIVE_OS_NETGROUPGETUSERS'
                                  AND wait_type <= N'PREEMPTIVE_OS_NETUSERMODALSGET') --PREEMPTIVE_OS_NETGROUPGETUSERS;PREEMPTIVE_OS_NETLOCALGROUPGETMEMBERS;PREEMPTIVE_OS_NETUSERGETGROUPS;PREEMPTIVE_OS_NETUSERGETLOCALGROUPS;PREEMPTIVE_OS_NETUSERMODALSGET
                           OR (   wait_type >= N'PREEMPTIVE_OS_NETVALIDATEPASSWORDPOLICYFREE'
                                  AND wait_type <= N'PREEMPTIVE_OS_DOMAINSERVICESOPS') --PREEMPTIVE_OS_NETVALIDATEPASSWORDPOLICYFREE;PREEMPTIVE_OS_DOMAINSERVICESOPS
                           OR (wait_type = N'PREEMPTIVE_OS_VERIFYSIGNATURE'))
                       OR (   duration > 45000 -- Preemptive OS waits that have exceeded 45 seconds. 
                              AND (   (   wait_type >= N'PREEMPTIVE_OS_SETNAMEDSECURITYINFO'
                                          AND wait_type <= N'PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL') --PREEMPTIVE_OS_SETNAMEDSECURITYINFO;PREEMPTIVE_OS_CLUSTEROPS;PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL
                                      OR (   wait_type >= N'PREEMPTIVE_OS_RSFXDEVICEOPS'
                                             AND wait_type <= N'PREEMPTIVE_OS_DSGETDCNAME') --PREEMPTIVE_OS_RSFXDEVICEOPS;PREEMPTIVE_OS_DIRSVC_NETWORKOPS;PREEMPTIVE_OS_DSGETDCNAME
                                      OR (   wait_type >= N'PREEMPTIVE_OS_DTCOPS'
                                             AND wait_type <= N'PREEMPTIVE_DTC_ABORT') --PREEMPTIVE_OS_DTCOPS;PREEMPTIVE_DTC_ABORT
                                      OR (   wait_type >= N'PREEMPTIVE_OS_CLOSEHANDLE'
                                             AND wait_type <= N'PREEMPTIVE_OS_FINDFILE') --PREEMPTIVE_OS_CLOSEHANDLE;PREEMPTIVE_OS_COPYFILE;PREEMPTIVE_OS_CREATEDIRECTORY;PREEMPTIVE_OS_CREATEFILE;PREEMPTIVE_OS_DELETEFILE;PREEMPTIVE_OS_DEVICEIOCONTROL;PREEMPTIVE_OS_FINDFILE
                                      OR (   wait_type >= N'PREEMPTIVE_OS_GETCOMPRESSEDFILESIZE'
                                             AND wait_type <= N'PREEMPTIVE_ODBCOPS') --PREEMPTIVE_OS_GETCOMPRESSEDFILESIZE;PREEMPTIVE_OS_GETDISKFREESPACE;PREEMPTIVE_OS_GETFILEATTRIBUTES;PREEMPTIVE_OS_GETFILESIZE;PREEMPTIVE_OS_GETLONGPATHNAME;PREEMPTIVE_OS_GETVOLUMEPATHNAME;PREEMPTIVE_OS_GETVOLUMENAMEFORVOLUMEMOUNTPOINT;PREEMPTIVE_OS_MOVEFILE;PREEMPTIVE_OS_OPENDIRECTORY;PREEMPTIVE_OS_REMOVEDIRECTORY;PREEMPTIVE_OS_SETENDOFFILE;PREEMPTIVE_OS_SETFILEPOINTER;PREEMPTIVE_OS_SETFILEVALIDDATA;PREEMPTIVE_OS_WRITEFILE;PREEMPTIVE_OS_WRITEFILEGATHER;PREEMPTIVE_OS_LIBRARYOPS;PREEMPTIVE_OS_FREELIBRARY;PREEMPTIVE_OS_GETPROCADDRESS;PREEMPTIVE_OS_LOADLIBRARY;PREEMPTIVE_OS_MESSAGEQUEUEOPS;PREEMPTIVE_ODBCOPS
                                      OR (   wait_type >= N'PREEMPTIVE_OS_DISCONNECTNAMEDPIPE'
                                             AND wait_type <= N'PREEMPTIVE_CLOSEBACKUPMEDIA') --PREEMPTIVE_OS_DISCONNECTNAMEDPIPE;PREEMPTIVE_OS_PROCESSOPS;PREEMPTIVE_OS_SECURITYOPS;PREEMPTIVE_OS_SERVICEOPS;PREEMPTIVE_OS_SQLCLROPS;PREEMPTIVE_OS_WINSOCKOPS;PREEMPTIVE_OS_GETADDRINFO;PREEMPTIVE_OS_WSASETLASTERROR;PREEMPTIVE_OS_FORMATMESSAGE;PREEMPTIVE_OS_REPORTEVENT;PREEMPTIVE_OS_BACKUPREAD;PREEMPTIVE_OS_WAITFORSINGLEOBJECT;PREEMPTIVE_OS_QUERYREGISTRY;PREEMPTIVE_CLOSEBACKUPMEDIA
                                      OR wait_type = N'PREEMPTIVE_OS_AUTHENTICATIONOPS'
                                      OR wait_type = N'PREEMPTIVE_OS_FREECREDENTIALSHANDLE'
                                      OR wait_type = N'PREEMPTIVE_OS_AUTHORIZATIONOPS'
                                      OR wait_type = N'PREEMPTIVE_COM_COCREATEINSTANCE'
                                      OR wait_type = N'PREEMPTIVE_OS_NETVALIDATEPASSWORDPOLICY'
                                      OR wait_type = N'PREEMPTIVE_VSS_CREATESNAPSHOT')))))
    /*    add target package0.event_file      -- Store events on disk (in the LOG folder of the instance)
    (
        set filename           = N'system_health.xel',
            max_file_size      = 5, /* MB */
            max_rollover_files = 4
    ),*/
    ADD TARGET package0.ring_buffer -- Store events in the ring buffer target
    (SET MAX_MEMORY = 4096, max_events_limit = 5000)
WITH (MAX_DISPATCH_LATENCY = 120 SECONDS,
      STARTUP_STATE = ON);
GO

IF NOT EXISTS (   SELECT *
                  FROM sys.dm_xe_database_sessions AS dxds
                  WHERE dxds.name = 'system_health')
    ALTER EVENT SESSION system_health ON DATABASE STATE = START;
GO

That gets me a bunch of the interesting stuff from system_health, but, it is missing one thing that I left in the script, just commented out; xml_deadlock_report. That sucks just a little. Although, there is already a way to capture deadlocks in Azure SQL Database, so it's not necessarily needed in your new, fake, system_health session.

Conclusion

The short answer to, is there system_health in Azure SQL Database question, is no. At this time, that does not exist. However, can we at least simulate it with a little work? Yeah, but, I did not spend a lot of time investigating what all the events that I eliminated (almost all related to buffers of one sort or another) are causing me to lose. What I did get though are the long running queries, waits, errors, and other stuff on display within the system_health extended events session. It's not everything, but it's a start.


If you want to learn more about how to use Extended Events and other tools built into SQL Server that will help you identify and tune queries easier and faster, I have some all-day seminars coming up on the topic:

SQL Day, May 13-15, Wroclaw, Poland
SQLSaturday Columbus Precon, June 7, Columbus OH

4 thoughts on “system_health Extended Events in Azure SQL Database

Please let me know what you think about this article or any questions:

This site uses Akismet to reduce spam. Learn how your comment data is processed.