Sep 26 2016

Query Store and What Happened Earlier On Your Server

Here’s a great question I received: We had a problem at 9:02 AM this morning, but we’re not sure what happened. Can Query Store tell us?

My first-blush response is no. Not really. Query Store keeps aggregate performance metrics about the queries on the database where Query Store is enabled. Aggregation means that we can’t tell you what happened with an individual call at 9:02 AM…

Well, not entirely true.

The aggregations that Query Store keeps are actually broken up into intervals that you can control. The default interval is 60 minutes, which means the information stored in the DMVs covers sets of intervals. So if, at 9:02 AM, you had a query, or queries, that ran considerably longer than normal, you may be able to take a look at them. I’m not guaranteeing this will always find what you want, but at least it’s a starting point. Here’s how we could compare a known, well-behaved period of time to a known, ill-behaved period of time:

DECLARE @CompareTime DATETIME,
   @BaseTime DATETIME;

-- The known, well-behaved interval we'll treat as the baseline:
SET @BaseTime = '2016-09-22 13:33';
-- The known, ill-behaved interval we want to investigate:
SET @CompareTime = '2016-09-22 12:15';

WITH CoreQuery
AS (SELECT qsp.query_id,
       qsqt.query_sql_text,
       qsp.query_plan,
       qsrs.execution_type_desc,
       qsrs.count_executions,
       qsrs.avg_duration,
       qsrs.max_duration,
       qsrs.stdev_duration,
       qsrsi.start_time,
       qsrsi.end_time
    FROM sys.query_store_runtime_stats AS qsrs
    JOIN sys.query_store_runtime_stats_interval AS qsrsi
       ON qsrsi.runtime_stats_interval_id = qsrs.runtime_stats_interval_id
    JOIN sys.query_store_plan AS qsp
       ON qsp.plan_id = qsrs.plan_id
    JOIN sys.query_store_query AS qsq
       ON qsq.query_id = qsp.query_id
    JOIN sys.query_store_query_text AS qsqt
       ON qsqt.query_text_id = qsq.query_text_id
   ),
BaseData
AS (SELECT *
    FROM CoreQuery AS cq
    WHERE cq.start_time < @BaseTime
          AND cq.end_time > @BaseTime
   ),
CompareData
AS (SELECT *
    FROM CoreQuery AS cq
    WHERE cq.start_time < @CompareTime
          AND cq.end_time > @CompareTime
   )
SELECT bd.query_sql_text,
   bd.query_plan,
   bd.avg_duration AS BaseAverage,
   bd.stdev_duration AS BaseStDev,
   cd.avg_duration AS CompareAvg,
   cd.stdev_duration AS CompareStDev,
   cd.count_executions AS CompareExecCount
FROM BaseData AS bd
JOIN CompareData AS cd
   ON bd.query_id = cd.query_id
WHERE cd.max_duration > bd.max_duration;

In this instance, I’m finding any query that appears in both intervals where the maximum duration in the compare interval ran longer than in the base interval. We could look just at the average. We could even simply check whether the aggregated standard deviation on duration is higher between intervals (or CPU, or I/O, the list goes on).

As I said, there’s no guarantee that we’re going to capture anything interesting, but if you don’t have any other monitoring in place that will capture query metrics at a more granular level, Query Store does offer a partial solution.
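One more option: if the default 60-minute window proves too coarse for this kind of investigation, you can shrink the aggregation interval going forward. A minimal sketch, assuming a database named YourDatabase (the valid values are 1, 5, 10, 15, 30, 60, and 1440 minutes):

ALTER DATABASE YourDatabase
SET QUERY_STORE (INTERVAL_LENGTH_MINUTES = 15);

Smaller intervals narrow the window you have to dig through, at the cost of more Query Store storage.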

Sep 19 2016

Query Store and Plan Forcing: What Do You Use It For

If you’re working with Azure SQL Database or you’ve moved into SQL Server 2016, one of the biggest new tools is the Query Store. It provides a mechanism for capturing query performance over time and persisting it with the database. You also get the execution plans for those queries. Finally, you can choose to have the Query Store override execution plan selection by use of Plan Forcing. I’ve written about Query Store a few times:

Query Store, Force Plan and “Better” Plans
Query Store, Force Plan and Dropped Objects
Precedence Goes to Query Store or Plan Guide
Query Store, Forced Plans and New Plans
Query Store and Optimize For Ad Hoc
Query Store and Recompile
Finding Your Query in Query Store
Removing All Query Store Data
Monitor Query Performance

OK, maybe I’m a little obsessed with Query Store. I can’t help it. It’s one of the most exciting developments in query tuning within SQL Server in quite some time. If you’re working with Azure SQL Database or SQL Server 2016, you should be using it. If you’re not, you’re missing a trick. If you’re preparing to move into Azure SQL Database or SQL Server 2016, you should learn this now.

Why?

You’ll note that I’ve written about Force Plan, or Plan Forcing, multiple times. The single biggest reason is that one of the most common query performance problems in SQL Server is parameter sniffing gone bad, and Force Plan lets you fix bad parameter sniffing without changing your code. That’s a huge win.

But…

And you knew there was a but coming.

But, what else does Force Plan do for you? What if you never experience bad parameter sniffing (you do, but I’m not going to argue the point)? Is there something else that Force Plan can do for you? Heck yes! The whole point of creating the Query Store was to address plan regression. What the heck is plan regression? When Microsoft makes any change to the Query Optimizer, and those changes come all the time, it’s possible that you might see a change in your execution plans. Most of the time, it’s going to be a positive change; that’s why they’re changing the Optimizer, after all, to improve it. However, sometimes you’re benefiting from the old behavior of the Optimizer, and the new plan doesn’t work as well as the old plan did. This is plan regression. When Microsoft changed the Cardinality Estimation engine in SQL Server 2014, more than a few people saw the new estimator give row estimates that resulted in a different execution plan that didn’t perform as well as the old plan. This is plan regression. What to do?

QUERY STORE! Ta-da!

Testing For Regression

Here’s what you’re supposed to do. You’ve decided to finally retire that SQL Server 2000/2005/2008R2/2012 database and join the future in SQL Server 2016 (what I’m about to describe doesn’t help for migrations to Azure SQL Database). Good. Back up your database (always, back up your database) and restore it to that shiny new 2016 instance, but leave the compatibility level at whatever version you’re coming from. Don’t change it to the new compatibility level yet. This means it will use the old Cardinality Estimator and won’t implement new Query Optimizer choices yet. Enable Query Store, because it’s not on by default. Run a bunch of tests against your database. Run all the queries you can, with as many different parameter values as you can. Now change the compatibility level to 2016 (130). Run the tests again. Use the nifty GUI (actually, I dislike the GUI, but it’s kind of cool to see this) to run the Regressed Queries report. That will show you queries that have run slower over time, as you can see below (click to embiggen):

[Screenshot: the Regressed Queries report in SSMS]

I have highlighted the fourth query in the report because the first three all have identical plans, even though the execution times varied for those queries (blocking, resource contention, who knows). That one has two different plans, which I can see in the plan listing:

[Screenshot: the plan listing showing two different plan_id values for the query]

Now I know that my plan changed over time. The trick is to investigate the plans and determine which ones changed because of the change in the compatibility level. You can tell by clicking on the properties of the first operator (the plans shown in the report can be explored just like any other plan within SSMS) and looking for the CardinalityEstimationModelVersion property:

[Screenshot: the CardinalityEstimationModelVersion property in the first operator’s properties]
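If you have a lot of plans to wade through, you can also pull that property straight out of the Query Store rather than clicking through plans one at a time. A rough sketch, assuming the property sits on the StmtSimple element of the plan XML and that 70 indicates the legacy estimation model:

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
PlanXML
AS (SELECT qsp.plan_id,
       qsp.query_id,
       CAST(qsp.query_plan AS XML) AS query_plan_xml
    FROM sys.query_store_plan AS qsp
   )
SELECT px.plan_id,
   px.query_id
FROM PlanXML AS px
-- Plans still compiled under the old cardinality estimation model
WHERE px.query_plan_xml.exist('//StmtSimple[@CardinalityEstimationModelVersion = "70"]') = 1;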

For any plan that you like, that runs better based on the old model, you can choose, after testing and due consideration (please be careful with this), to force the plan simply by clicking a button:

[Screenshot: the Force Plan button in the Query Store report]

I can’t emphasize enough: be sure that you have tested the plan choice adequately. You can easily unforce a plan, but that may only occur to you after you have performance issues in production. This is a great new toy, uh, I mean tool, but you have to be careful when using it. You can force a plan, but it can be the wrong plan. Here are the steps in simple order (a sketch of the key commands follows the list):

  1. Backup the database on the old system
  2. Restore it to the new system
  3. Enable Query Store
  4. Run lots of tests
  5. Update the compatibility level
  6. Run all the tests again
  7. Check the Regression report
  8. Where applicable, use plan forcing
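
Steps 3 and 5 each come down to a single command. A minimal sketch, assuming a database named YourDatabase (130 is the SQL Server 2016 compatibility level):

ALTER DATABASE YourDatabase
SET QUERY_STORE = ON;
GO
-- ...run your tests, then switch to the new optimizer behavior:
ALTER DATABASE YourDatabase
SET COMPATIBILITY_LEVEL = 130;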

Conclusion

Query Store and the ability to force a plan is going to change the way we do query tuning. In some instances, but not all or even a majority, we won’t have to modify our code or add indexes in order to arrive at superior performance. We’ll be able to pick an appropriate plan, force it, and get better performance.

If you want to get started on Query Store and you’re attending PASS Summit 2016, I’ll be doing a session there on this topic on Wednesday, Oct 26 at 3:15 PM.

Sep 12 2016

The Clustered Index Is Vital To Your Database Design

This is post #9 supporting Tim Ford’s (b|t) initiative on #iwanttohelp, #entrylevel. Read about it here.

You get one clustered index per table.

That bears repeating, you get one clustered index per table.

Choosing a clustered index is an extremely important and fundamental aspect of all your SQL Server work. The one clustered index that you get determines how the data in your table is stored, and because it determines how your data is stored, it also determines how your data is retrieved. Much of SQL Server is engineered around the clustered index because it is such a foundational object for so much of the engine’s behavior. Without a clustered index, the data in your table is stored in what is called a heap. It is essentially a pile, a heap, of data, stored without internal structure. The data in a heap can be found fairly quickly when only a single value is needed, but as soon as you start to retrieve ranges of data, more than one row, heaps become extremely problematic. The clustered index, meanwhile, stores all your data at the bottom of a B+Tree, structured storage that is used to retrieve the data through the key, or keys, that you give your clustered index. Further, each non-clustered index also uses the key value from your clustered index as its means of retrieving data from the clustered index.

All this means that the vast majority of your tables should have a clustered index. Yes, there are exceptions, but, exceptions should be exceptional.

The standard advice, and it’s not wrong, is to make the clustered key as narrow as possible. Next, clustered indexes are designed to support monotonically increasing values extremely well. Clustered indexes must also be unique: you either create them on a unique value, or, internally, a UNIQUIFIER gets applied to the key to ensure that each value is unique. Taken together, all this means they work nicely on IDENTITY columns. Because of this advice, you’ll frequently see all the clustered indexes in a database on the IDENTITY columns. However, this is a bit of a trap. You need to remember that the clustered index is not just about storage, but about retrieval. If the most common access path to your data is through, for example, a foreign key, that value might be a better choice for the clustered key. If you always retrieve the data through a date range, that might be a better choice. Keep in mind the standard advice for clustered keys: narrow, monotonically increasing, unique. However, also remember, the clustered index is where your data is stored. It is frequently better to sacrifice one or more of the standard practices in order to be able to use your clustered index to retrieve the data in most of your queries.
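
To make that concrete, here’s a sketch using a hypothetical Orders table. If the overwhelming access path is a date range, you might cluster on the date, adding the IDENTITY column to keep the key unique, and enforce the primary key through a nonclustered index:

CREATE TABLE dbo.Orders
(OrderID INT IDENTITY(1, 1) NOT NULL,
 OrderDate DATE NOT NULL,
 CustomerID INT NOT NULL,
 Amount MONEY NOT NULL
);

-- The clustered index supports the most common access path: date ranges.
CREATE UNIQUE CLUSTERED INDEX OrdersByDate
ON dbo.Orders
(OrderDate, OrderID);

-- The primary key is still enforced, just through a nonclustered index.
ALTER TABLE dbo.Orders
ADD CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED (OrderID);

Whether that trade is worth it depends entirely on how the table is actually queried, which is the point.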

Think through where and how to apply your clustered indexes, and don’t get stuck using them in a single manner, everywhere, within your system. You get one clustered index on a table. Be sure you put it somewhere that it will work well and where it will be well used.

Sep 07 2016

Kilt Day! PASS Summit 2016, Thursday

Thursday at the PASS Summit is Kilt Day.

This means you should wear a kilt. Whether you do it in support of Women in Technology, or you just like to swan about in a kilt, this is the day to sport your kilt. Whether you have a fully traditional tartan, you believe in utility, comfort, or you’re prepared for the zombiepocalypse, there’s a kilt for you. All are welcome. All are encouraged.

Sep 06 2016

Query Store, Force Plan and “Better” Plans

I am endlessly fascinated by how the Query Store works. I love teaching it at every opportunity too. Plus, almost every time I teach it, I get a new question about the behavior that makes me delve into the Query Store just a little bit more, enabling me to better understand how it works. I received just such a question at SQLSaturday Norway:

If you are forcing a plan, and the physical structure changes such that a “better” plan is possible, what happens with plan forcing?

Let’s answer a different question first. What happens when the plan gets invalidated, when the index being used gets dropped or some other structural change occurs so that the plan is no longer valid? I answered that question in this blog post. The plan being forced, after the object is dropped, becomes invalid, so that plan can no longer be used. The Query Store still attempts to apply the plan during any recompile or compile event of the query in question, but it fails and a proper plan is used. All this means, I think, that the Query Store is going to ignore the new index, since a new index doesn’t invalidate an existing plan. A new index just makes new plans possible. However, when I was asked this question, this wasn’t something I had tested, so I gave a speculative, best-guess answer with plenty of caveats and the promise to provide a tested answer ASAP. Here we go.

I’ll start with the same sample query:

SELECT  sit.Quantity,
        sit.TransactionOccurredWhen,
        i.InvoiceDate,
        si.StockItemName
FROM    Warehouse.StockItemTransactions AS sit
JOIN    Sales.Invoices AS i
        ON i.InvoiceID = sit.InvoiceID
JOIN    Warehouse.StockItems AS si
        ON si.StockItemID = sit.StockItemID
WHERE   sit.TransactionOccurredWhen BETWEEN '3/1/2015'
                                    AND     '3/5/2015';
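
A quick note on measurement: the durations and read counts quoted below are the kind you can capture yourself with session statistics turned on, something like this (not shown in the original setup):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;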

The results are returned in about 53ms with 4850 reads and this execution plan:

[Screenshot: the execution plan without the index, including a missing index suggestion]

As you can see, there’s the suggestion of a possible missing index. We’ll apply that in a moment. First though, I’m going to get the plan and query identifiers from the Query Store:

SELECT  qsq.query_id,
        qsp.plan_id,
        CAST(qsp.query_plan AS XML)
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_query_text AS qsqt
        ON qsqt.query_text_id = qsq.query_text_id
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsqt.query_sql_text LIKE 'SELECT  sit.Quantity,
        sit.TransactionOccurredWhen,%';

With this information, I’ll use Force Plan to ensure this is the plan used going forward.

EXEC sys.sp_query_store_force_plan 42995,487;

With that done, I’ll create the index:

CREATE INDEX TransactionOccurredWhen
ON Warehouse.StockItemTransactions
(TransactionOccurredWhen)
INCLUDE (StockItemID,InvoiceID,Quantity);

When I go through this same set of operations without plan forcing in place (run the query, identify a missing index, create the new index, rerun the query), the execution plan above is replaced with one that uses the index and results in about 32ms execution time with 2942 reads, a significant improvement. You get a recompile event because the schema involved with the query has changed. With the change, a new index is available, so the recompile event uses that new index. What happens when you force the plan?

The recompile event after running CREATE INDEX still occurs. However, because we have elected to force a plan, that plan is what is used. In this instance, a recompile to a new plan would result in a faster query using fewer resources. However, as long as we’re forcing a plan and that plan stays valid, the plan will be forced.
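
You can also watch this from the Query Store side. sys.query_store_plan tracks whether a plan is forced and whether forcing has been failing; a quick check, a sketch using the query_id from earlier:

SELECT  qsp.plan_id,
        qsp.is_forced_plan,
        qsp.force_failure_count,
        qsp.last_force_failure_reason_desc
FROM    sys.query_store_plan AS qsp
WHERE   qsp.query_id = 42995;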

In short, the behavior is exactly as I expected. Choosing to force a plan in the Query Store results in that plan being forced. While I think that the Query Store and plan forcing are wonderful new tools in our tool box, I am concerned that plan forcing will become far too easy a thing to implement. I worry that people will implement it without thinking through the implications and potential impacts.

It gets worse. If I change the query, let’s say I make it into a stored procedure and parameterize the query, and, instead of a very limited date range, I send in a whole month, the execution plan is quite different (with or without the index). Forcing the plan that is expecting less than 1,000 rows onto a query that is retrieving 10,000 rows results in pretty horrific performance. We really are going to have to be careful about using plan forcing appropriately because, as in so much of the rest of SQL Server, and in all of programming for that matter, the code is going to do exactly what we tell it to do, whether that’s what we really want it to do or not.
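
For illustration, the parameterized version I’m describing would look something like this (a hypothetical procedure name; the body is the query from above):

CREATE PROCEDURE dbo.StockItemTransactionsByDate
    @StartDate DATETIME,
    @EndDate DATETIME
AS
SELECT  sit.Quantity,
        sit.TransactionOccurredWhen,
        i.InvoiceDate,
        si.StockItemName
FROM    Warehouse.StockItemTransactions AS sit
JOIN    Sales.Invoices AS i
        ON i.InvoiceID = sit.InvoiceID
JOIN    Warehouse.StockItems AS si
        ON si.StockItemID = sit.StockItemID
WHERE   sit.TransactionOccurredWhen BETWEEN @StartDate
                                    AND     @EndDate;

A plan forced based on the narrow three-day range gets applied to the month-long call as well, which is exactly the trap.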

Aug 29 2016

Azure SQL Data Warehouse Execution Plans

Azure SQL Data Warehouse can sometimes feel like it’s completely different from SQL Server, but under the covers, it is still (mostly) SQL Server and it does have execution plans. Let’s take a look at one.

I’ve created a new SQL Data Warehouse using the sample database available on the portal, AdventureWorksDW. Here’s a query against that database:

SELECT  dd.FullDateAlternateKey AS OrderDate,
        dc.LastName,
        SUM(fis.SalesAmount) AS SumSalesAmount
FROM    dbo.FactInternetSales AS fis
JOIN    dbo.DimDate AS dd
        ON fis.OrderDateKey = dd.DateKey
JOIN    dbo.DimCustomer AS dc
        ON dc.CustomerKey = fis.CustomerKey
GROUP BY dd.FullDateAlternateKey,
        dc.LastName
HAVING  SUM(fis.SalesAmount) > 5000.0
ORDER BY OrderDate DESC;

If I attempt to capture an execution plan using the SQL Server Management Studio GUI, nothing happens. If I try to use T-SQL commands, I get an error that those commands are not supported with this version of SQL Server. Same thing if I try to capture a plan using Visual Studio. So… now what? Enter our new command:

EXPLAIN
SELECT  dd.FullDateAlternateKey AS OrderDate,
        dc.LastName,
        SUM(fis.SalesAmount) AS SumSalesAmount
FROM    dbo.FactInternetSales AS fis
JOIN    dbo.DimDate AS dd
        ON fis.OrderDateKey = dd.DateKey
JOIN    dbo.DimCustomer AS dc
        ON dc.CustomerKey = fis.CustomerKey
GROUP BY dd.FullDateAlternateKey,
        dc.LastName
HAVING  SUM(fis.SalesAmount) > 5000.0
ORDER BY OrderDate DESC;

If I run this through the SSMS query window, I get a syntax error, so it’s back to Visual Studio. This is how we generate an execution plan from within Azure SQL Data Warehouse. What you get is XML output in the results, like this:

[Screenshot: the EXPLAIN output returned as XML in the results grid]

According to the documentation on EXPLAIN, I should be able to click on the XML and it will open up to explore. In my version of Visual Studio (2015), I didn’t find that to be the case. Instead I had to copy and paste the XML into an XML file window that I created within Visual Studio. This is what I finished with:

<?xml version="1.0" encoding="utf-8"?>
<dsql_query number_nodes="1" number_distributions="60" number_distributions_per_node="60">
  <sql>SELECT  dd.FullDateAlternateKey AS OrderDate,          dc.LastName,          SUM(fis.SalesAmount) AS SumSalesAmount  FROM    dbo.FactInternetSales AS fis  JOIN    dbo.DimDate AS dd          ON fis.OrderDateKey = dd.DateKey  JOIN    dbo.DimCustomer AS dc          ON dc.CustomerKey = fis.CustomerKey  GROUP BY dd.FullDateAlternateKey,          dc.LastName  HAVING  SUM(fis.SalesAmount) &gt; 5000.0  ORDER BY OrderDate DESC</sql>
  <dsql_operations total_cost="5.98868068474576" total_number_operations="13">
    <dsql_operation operation_type="RND_ID">
      <identifier>TEMP_ID_14</identifier>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="AllComputeNodes" />
      <sql_operations>
        <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_14] ([DateKey] INT NOT NULL, [FullDateAlternateKey] DATE NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation>
      </sql_operations>
    </dsql_operation>
    <dsql_operation operation_type="BROADCAST_MOVE">
      <operation_cost cost="1.99584" accumulative_cost="1.99584" average_rowsize="7" output_rows="1188" GroupNumber="12" />
      <source_statement>SELECT [T1_1].[DateKey] AS [DateKey],         [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey]  FROM   [DWTest].[dbo].[DimDate] AS T1_1</source_statement>
      <destination_table>[TEMP_ID_14]</destination_table>
    </dsql_operation>
    <dsql_operation operation_type="RND_ID">
      <identifier>TEMP_ID_15</identifier>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="AllDistributions" />
      <sql_operations>
        <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_15] ([OrderDateKey] INT NOT NULL, [CustomerKey] INT NOT NULL, [SalesAmount] MONEY NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation>
      </sql_operations>
    </dsql_operation>
    <dsql_operation operation_type="SHUFFLE_MOVE">
      <operation_cost cost="3.93098847457627" accumulative_cost="5.92682847457627" average_rowsize="16" output_rows="60398" GroupNumber="10" />
      <source_statement>SELECT [T1_1].[OrderDateKey] AS [OrderDateKey],         [T1_1].[CustomerKey] AS [CustomerKey],         [T1_1].[SalesAmount] AS [SalesAmount]  FROM   [DWTest].[dbo].[FactInternetSales] AS T1_1</source_statement>
      <destination_table>[TEMP_ID_15]</destination_table>
      <shuffle_columns>CustomerKey;</shuffle_columns>
    </dsql_operation>
    <dsql_operation operation_type="RND_ID">
      <identifier>TEMP_ID_16</identifier>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="AllDistributions" />
      <sql_operations>
        <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_16] ([FullDateAlternateKey] DATE NOT NULL, [LastName] NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AS, [col] MONEY NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation>
      </sql_operations>
    </dsql_operation>
    <dsql_operation operation_type="SHUFFLE_MOVE">
      <operation_cost cost="0.0618522101694915" accumulative_cost="5.98868068474576" average_rowsize="111" output_rows="136.985" GroupNumber="25" />
      <source_statement>SELECT [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey],         [T1_1].[LastName] AS [LastName],         [T1_1].[col] AS [col]  FROM   (SELECT   SUM([T2_2].[SalesAmount]) AS [col],                   [T2_1].[FullDateAlternateKey] AS [FullDateAlternateKey],                   [T2_2].[LastName] AS [LastName]          FROM     [tempdb].[dbo].[TEMP_ID_14] AS T2_1                   INNER JOIN                   (SELECT [T3_2].[OrderDateKey] AS [OrderDateKey],                           [T3_2].[SalesAmount] AS [SalesAmount],                           [T3_1].[LastName] AS [LastName]                    FROM   [DWTest].[dbo].[DimCustomer] AS T3_1                           INNER JOIN                           [tempdb].[dbo].[TEMP_ID_15] AS T3_2                           ON ([T3_2].[CustomerKey] = [T3_1].[CustomerKey])) AS T2_2                   ON ([T2_1].[DateKey] = [T2_2].[OrderDateKey])          GROUP BY [T2_1].[FullDateAlternateKey], [T2_2].[LastName]) AS T1_1</source_statement>
      <destination_table>[TEMP_ID_16]</destination_table>
      <shuffle_columns>FullDateAlternateKey;</shuffle_columns>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="AllDistributions" />
      <sql_operations>
        <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_15]</sql_operation>
      </sql_operations>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="AllComputeNodes" />
      <sql_operations>
        <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_14]</sql_operation>
      </sql_operations>
    </dsql_operation>
    <dsql_operation operation_type="RETURN">
      <location distribution="AllDistributions" />
      <select>SELECT   [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey],           [T1_1].[LastName] AS [LastName],           [T1_1].[col] AS [col]  FROM     (SELECT [T2_1].[FullDateAlternateKey] AS [FullDateAlternateKey],                   [T2_1].[LastName] AS [LastName],                   [T2_1].[col] AS [col]            FROM   (SELECT   SUM([T3_1].[col]) AS [col],                             [T3_1].[FullDateAlternateKey] AS [FullDateAlternateKey],                             [T3_1].[LastName] AS [LastName]                    FROM     [tempdb].[dbo].[TEMP_ID_16] AS T3_1                    GROUP BY [T3_1].[FullDateAlternateKey], [T3_1].[LastName]) AS T2_1            WHERE  ([T2_1].[col] &gt; CAST ((5000.0) AS DECIMAL (5, 1)))) AS T1_1  ORDER BY [T1_1].[FullDateAlternateKey] DESC</select>
    </dsql_operation>
    <dsql_operation operation_type="ON">
      <location permanent="false" distribution="AllDistributions" />
      <sql_operations>
        <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_16]</sql_operation>
      </sql_operations>
    </dsql_operation>
  </dsql_operations>
</dsql_query>

So now we just save this as a .sqlplan file and open it in SSMS, right?

Nope!

See, that’s not a regular execution plan, at all. Instead, it’s a D-SQL plan. It’s not the same as our old execution plans. You can’t open it as a graphical plan (and no, not even in that very popular 3rd party tool, I tried). You will have to learn how to read these plans differently because, well, they are different.

Aug 25 2016

PASS Board Update

Time for another update.

The minutes for the June in-person meeting have been approved and are now ready for your information. We accomplished a lot in the two days and you can read what we did there. I’m pleased about two things. First, while it took until the July meeting to get the vote in, we have defined how the PASS board will deal with PASS-branded and PASS-sponsored events when it comes to payment to members of the Board. Thank you so very much for your feedback to my blog post on this topic. That feedback helped us to come to the conclusion that, for PASS-branded events (these are events that PASS doesn’t run, like a SQLSaturday, but that are operating under the PASS brand) a board member can receive payment, say for a pre-con, from the event, but that these payments must be announced (the fact of the payment, not the amount) to the board and read into the minutes. That’s it. Nice and simple. We’ll provide nice, open, clarity on when this is occurring, while making sure that becoming a board member doesn’t mean you give up the ability to run a pre-con at a SQLSaturday or similar event. It seems like a small thing, but it’s important that we do things right for the community and for the members of the board. I feel like this was a model of how to deal with issues before the board. We were open and up front about what we were discussing. We got input from you. We made a clear decision that will support the community well in the future.  Win.

Next, we voted to extend the Global Growth Committee for another year and I got my wish to lead the Committee. PASS is a global organization, and while a large majority of our current membership is from the US, we have had tremendous growth globally over the past several years and anticipate significant growth to continue from outside the US. For example, we are adding a LATAM seat to this year’s elections, given the incredible growth that we’ve seen in the region over the past few years. PASS’ plan is to expand on our success there and continue to grow in Latin America.

While we have already established ourselves in Europe, the Middle East and Africa (EMEA), we know that our community continues to need our support here as well.  We are looking at ways to bring more people into the PASS community and we’re also going to start thinking about how best to expand into other regions as well. We’re working on figuring out ways to get more feedback and ideas on how we can best serve each of the unique regions, territories and countries and their unique needs. I think you can see, the key word there is unique. I completely recognize that what works for one country/region/language will not work for the next. We’re going to be flexible about how we do things and what we do so we can reach you better, regardless of where you live. Lots more to come on this one, but, please, never hesitate to give me feedback. I’m personally invested in this topic and need your help.

That’s it for the details of this update. I’ve helped finish my first $10 million budget and am now working with the auditors. Finally, I’m starting to get nervous/pumped for my part in the keynote this year. You’ll have to stay tuned for more on that as well.

That’s what I have for the moment. Let me know how we’re doing.

Aug 22 2016

Run For the PASS Board

When the word comes around for elections to the PASS Board, are you one of those people who, even in passing, considers running? If you are, my advice is to, well, do it.

By the way, here’s that word.

In order to make this a stronger, even more vibrant organization that continues to provide support to its members, we need more good leadership. A healthy set of choices, meaning lots of people, is necessary. That means we need you to run. There is no commitment required immediately. We’ve just started the process. What I’m asking you to do now, is start thinking about running, and thinking about it in a positive fashion. Track down a board member, or former board member, and find out what they think of the experience. You can ask them what they didn’t like, sure. More importantly, ask them what they got out of the experience. I’ve talked to a lot of former board members (and did so before I ran). Each can tell you a horror story. However, every single one of them also talked about the positive aspects of their time on the board.

Personally, my first year, while a learning experience (and we all know what they’re like), was unremittingly positive. I’m excited about what I got done while I was on the board. I’m excited about what I may get done. I’m VERY excited about my new role. My positive experience comes from two places. First, I’ve been able to make a useful impact on the organization that has done so much for me personally, paying forward some of what I’ve received. Next, I’ve been working on the skills needed to take part in running a very large organization, which is hugely personally beneficial. I won’t tell you that everything has been easy. It hasn’t. This is work, and extra work, on top of your job, family, career, etc. However, it has been, and continues to be, a rewarding experience.

I’m asking you, as Grant, please, run for the PASS Board.

Aug 17 2016

Updates to Azure Data Platform Learning Resources

I’m acting as the curator to a list of Azure Data Platform Learning Resources. It’s speakers and teachers, blogs, articles and more, all focused around the Azure Data Platform. I have it hosted on GitHub so that anyone can contribute. There have been several updates recently that you may want to take a look at.

If you’re looking for where and how to get started in the Azure Data Platform, this is a great resource to get you going.

If you’re teaching the Azure Data Platform, anything from PowerBI to DocumentDB to Azure SQL Data Warehouse and all the stuff in between, please add your information so that this list is up to date.

Aug 15 2016

Query Store, Force Plan and Dropped Objects

I love the Query Store. Seriously. It’s a huge leap forward in the capabilities of Azure SQL Database and SQL Server in support of performance monitoring and query optimization. One of my favorite aspects of the Query Store is the ability to force plans. Frankly though, it’s also the scariest part of the Query Store. I do believe that plan forcing will be one of the most ill-used functions in SQL Server since the multi-statement table-valued user-defined function (don’t get me started). However, unlike the UDF, this ill-use will be because of poor understanding on the part of the user, not a fundamental design issue. No, plan forcing and the Query Store are very well constructed. Let me give you an example of just how well constructed they are.

Let’s imagine that you have a situation, such as bad parameter sniffing, where you’ve determined that, of more than one possible execution plan for a query against a table, there is a preferred plan. Enabling plan forcing to ensure that plan gets used is a no-brainer. Let’s further imagine that you have a junior DBA who is… let’s just say overly aggressive in their duties, such that they do silly things occasionally. What happens when your pretty plan, which uses a particular index, meets your junior DBA who just dropped that index?

Here’s the setup. We’re using the WideWorldImporters database and we have this query:

SELECT  *
FROM    Warehouse.StockItemTransactions AS sit
WHERE   sit.TransactionOccurredWhen BETWEEN '9/9/2015'
                                    AND     '9/11/2015';

This query, with the default configuration, will scan the existing table, so I’ll add an index:

CREATE INDEX TransactionOccurredWhenNCI
ON Warehouse.StockItemTransactions
(TransactionOccurredWhen);

For a limited range such as the one I’m passing above, I’ll get a plan with a key lookup operation which runs faster than the scan, so I’m happy. For a broader range, I’m likely to see a scan again, but since most of my queries have a very narrow range, I’d sure like to be able to force the plan to always compile to the seek and key lookup. To do this I need to find the query_id and plan_id from the Query Store (assuming I’m not using the GUI):

SELECT  qsp.plan_id,
        qsp.query_id,
        qsqt.query_sql_text,
        qsp.count_compiles
FROM    sys.query_store_plan AS qsp
JOIN    sys.query_store_query AS qsq
        ON qsq.query_id = qsp.query_id
JOIN    sys.query_store_query_text AS qsqt
        ON qsqt.query_text_id = qsq.query_text_id
WHERE   qsqt.query_sql_text LIKE 'SELECT  *
FROM    Warehouse.StockItemTransactions AS sit%';

With those values, I can force the execution plan so that it will always use the plan I want:

EXEC sys.sp_query_store_force_plan 42460,463;

That’s it. I’m happy because I’m going to see the execution plan used over and over, despite any values passed during a recompile.

Then…

Along comes our aggressive junior DBA who decides that there are “too many” indexes on the server. No, I don’t know what that means either, but they evidently read it on the internet or something, so they drop the index we created before:

DROP INDEX TransactionOccurredWhenNCI ON Warehouse.StockItemTransactions;
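
To see what happens next, I’m watching two Extended Events. Here’s a sketch of the kind of session I used; I’m assuming the package names here (sql_statement_recompile ships in the sqlserver package, query_store_plan_forcing_failed in the qds package):

CREATE EVENT SESSION QSForcingFailures
ON SERVER
ADD EVENT sqlserver.sql_statement_recompile
(WHERE (sqlserver.database_name = N'WideWorldImporters')),
ADD EVENT qds.query_store_plan_forcing_failed
ADD TARGET package0.event_file
(SET filename = N'QSForcingFailures')
WITH (TRACK_CAUSALITY = ON);

ALTER EVENT SESSION QSForcingFailures ON SERVER STATE = START;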

What now happens to our lovely execution plan and the plan forcing? We’ll take a look at the two events from that session, sql_statement_recompile and query_store_plan_forcing_failed. Nothing happens immediately on dropping the index. The plans associated with that object, if any, are marked as invalid in the cache. The next time we call the query, it’s going to recompile, and we can see the event:

[Screenshot: the sql_statement_recompile event]

The most important part of the event is the recompile_cause, which is “Schema changed”. However, I would also note the attach_activity_id.guid. I’ve chosen to enable causality tracking in this Extended Events session. This will cause all events associated with a common activity to get a GUID and a sequence number. This is interesting because, after the recompile event, we get the query_store_plan_forcing_failed event:

[Screenshot: the query_store_plan_forcing_failed event]

The guid value is the same as in the event above, and the *.seq number is now 2, showing that, for these events, the recompile event occurred first and then this event occurred. That makes perfect sense. The plan is marked for recompile, so it’s going to be recompiled. I have enabled plan forcing, though, so I have a particular plan that I want the optimizer to use. However, thanks to my “helpful” junior DBA, the plan is now invalid. You even get the description of what happened in the message field for the event:

Index ‘WideWorldImporters.Warehouse.StockItemTransactions.TransactionOccurredWhenNCI’, specified in the USE PLAN hint, does not exist. Specify an existing index, or create an index with the specified name.

The first question now is, what happens with this query and the execution plan? Does the new plan, generated now that the index is missing, get stored in cache? Yes, it does. We can validate that by querying the cache, or, when capturing the actual execution plan, checking the “Retrieved from cache” property.
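
A quick way to check the cache, a sketch using the standard DMVs:

SELECT  decp.usecounts,
        decp.objtype,
        dest.text
FROM    sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
-- Crude filter; note that this diagnostic query will match itself too.
WHERE   dest.text LIKE '%StockItemTransactions%';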

Because plan forcing is enabled, do we see a recompile every time this query is called? The answer to that question is slightly complex. Under normal circumstances, no. As long as that plan remains in cache, it’s simply reused. No other recompiles occur. A normal recompile event will cause another attempt at applying the invalid execution plan and we would see yet another query_store_plan_forcing_failed event for each recompile on the query. However, during testing, Joey D’Antoni (who was helping me play with this when we discussed what would happen when a plan was made invalid) had severe memory pressure on his server. He saw intermittent recompiles with a cause message that said plan forcing had failed. So if your server is under extreme stress and you cause this issue, you might see different messages. Just remember, the cause of the recompiles was not the plan forcing, but the memory pressure.

The fun thing is, as long as I don’t remove the plan forcing or take the query and plan out of the Query Store manually, if I recreate the index on my table with the same name and definition as that expected by the plan, the Query Store will simply reapply the plan and then successfully force it during any subsequent recompile situation. This is because Query Store is persisted with the database and barring outside activity, the information there will remain, just like the rest of the data in the database.

All of this means that Query Store works exactly the way we would expect, not forcing additional recompiles when you, or your junior DBA, inadvertently invalidate a plan. It also works as expected in that forcing a plan is stored with your database so that, assuming you don’t remove that plan from the Query Store, it will simply be reapplied after you fix the problem. It’s fun to see the thought that went into the design of the Query Store’s behavior. However, please, use plan forcing judiciously.