What? Execution plans and the GDPR? Is this it? Have I completely lost it? Well, no, not on this topic, keep reading so I can defend myself.
GDPR and Protected Data
The core of the GDPR is to ensure the privacy and protection of a “natural person’s” information. As such, the GDPR defines what personal data is and what processing means (along with a bunch of additional information). It all comes down to personally identifying (PI) data, how you store it, and how you process it. More importantly, it’s about the right for the individual, the natural person, to control their information, up to and including the right to be forgotten by your system.
OK. Fine. And execution plans?
Execution Plans and PI Data
If you look at an execution plan, in theory, you won’t see any data at all, let alone PI data, right? Well, no. There are a bunch of places where you could be seeing PI data within an execution plan. First, the query itself is stored with the plan, so if you’ve used hard-coded values in your T-SQL, you will see those values stored in the plan. This is one of many reasons why parameterizing queries is so important. However, it doesn’t end there.
If you’ve parameterized your queries, well done. Now, you have something else to worry about with the GDPR. What? Yeah, well, I’ve pointed it out a lot, but the parameter compile time values are stored in the first operator of an execution plan:
Now, when you save an execution plan out to a file, you’re potentially transmitting PI data. It goes further. When you hard code values, PI is not just in the query. Those PI values can also be stored throughout the plan in various properties.
So now you see what I mean when I say that the GDPR affects how we deal with execution plans. I’m not done yet.
Where Execution Plans Live
In my example above, I mention storing execution plans out to a file and how, that could be moving PI data in violation of the GDPR (potentially). That’s not the only thing you should be thinking about.
Execution plans are stored in cache. What happens when someone deletes a row of data (logically or physically) to the plan in cache that used a value from a column in that row to help define the execution plan? Nothing. The plan stays in cache even though the data that was the basis for the plan was removed. Yeah, next time it recompiles for whatever reason, it obviously can’t reuse that value… or can it?
Another place besides cache that query plans are stored is in the Query Store. When plans are stored there, they are stored with the query and compile time values and all the associated possible persistence of PI data. Further, when you choose to force a plan, you’re making any PI data in that plan survive beyond the deleting of that data from the database. We’re not done.
If you use OPTIMIZE FOR hints and supply a value, you might be potentially including PI data. When you do this using a Plan Guide as opposed to modifications to the code, you’re potentially keeping PI data.
As I’ve said before and will likely say again, there is zero cause for panic when it comes to the GDPR. There is cause for concern. There is cause for thoughtfulness. We do not yet know exactly how the GDPR is going to be enforced since that doesn’t start until May 2018. We do know that, per the definitions within the GDPR, that it is entirely likely that values stored in execution plans might be something that you need to concern yourself with. You might not be concerned for those values in cache or the Query Store. You certainly should for those values that you export out to a file. The possibility of causing a GDPR breach through an execution plan is something to be thoughtful and concerned about.
Still think I’m crazy on this topic?
Want to discuss more about the GDPR? Join me for the PASS Marathon on the GDPR sponsored by Redgate on 13, March, 2018. Go here to register for this event.