Figuring Out How To Hide Production Data

There’s a really simple conundrum that we go through all the time.

  1. The best data for development is production
  2. You can’t have production data for development

You have to split the difference here in a pretty fine way: get the developers the best tools possible while protecting the production information. It’s not easy, but I think I can help.

Chris Unwin

Of my many amazing co-workers, I sometimes think I’m the most jealous of Chris Unwin. It’s not because he’s smart and capable; most of them are. It’s because he’s young, filled with ideas, AND smart and capable. He’ll be running things long after we’re all gone. If you haven’t heard Chris speak, I’d recommend you start with DBAle, the beer-flavored podcast that he runs covering all things data.

However, you also have the opportunity to get a little help while you listen to Chris. Next week, Chris will be presenting a session called “Protect Your Data By Design” at the all-new community event, Redgate Streamed. Go and watch the session and learn how to deal with the problem of giving your devs data while protecting production at the same time.

Also, just so you know, Redgate is donating money to the WHO Covid-19 fund. We’ll donate additional funds based on registration (not even attendance) for this event. So even if you’re not interested in a three-day virtual conference, you can help the WHO by just registering.


Speaking of Chris and his partner Chris on DBAle, they’re going to be hosting a virtual Pub Quiz. So, follow this link and get registered. It’ll be as close to the SQLBits pub quiz as they can make it. Time to have a little fun.

2 thoughts on “Figuring Out How To Hide Production Data”

  • This entire field is known as “data anonymization” or “data masking”. Simple hashing or “scrambling” of data values tends to break applications, either directly (e.g. formatted data such as credit card numbers) or indirectly (e.g. making non-unique values unique). So most anonymization or masking techniques use list replacement: the masking process holds a list of “valid” data items for each “domain” and uses them to replace the actual data items in that domain. For example, a domain might be first names, last names, street names, or cities. Replacement can work for text or numeric data, or even images and unstructured data. Anonymization can occur either upon data retrieval (a.k.a. dynamic) or on data at rest (a.k.a. static). There are numerous vendors in this field, including Delphix, Informatica, IBM, Redgate, and DataVeil, to name a few.
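
To make the list replacement idea in the comment above a bit more concrete, here is a minimal Python sketch of static masking. It is an illustration only, not how any of the named vendors do it. The `DOMAINS` lists, the `SECRET_KEY` value, and the `mask_value` and `mask_row` helpers are all hypothetical names made up for this example. The sketch assumes you want the same input to always map to the same replacement, which it gets by keying the choice off an HMAC of the original value.

```python
import hashlib
import hmac

# Hypothetical replacement lists, one per "domain".
# A real masking tool would ship far larger lists.
DOMAINS = {
    "first_name": ["Alex", "Blake", "Casey", "Devon", "Emery"],
    "city": ["Springfield", "Riverton", "Lakeside", "Fairview"],
}

# Assumption: in practice this key lives outside source control.
SECRET_KEY = b"example-key-change-me"


def mask_value(domain: str, value: str) -> str:
    """Pick a replacement from the domain's list, keyed off the original value."""
    choices = DOMAINS[domain]
    message = f"{domain}:{value}".encode("utf-8")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    # Use the first 8 bytes of the digest as an index into the list.
    index = int.from_bytes(digest[:8], "big") % len(choices)
    return choices[index]


def mask_row(row: dict, column_domains: dict) -> dict:
    """Mask only the columns mapped to a domain; leave the rest untouched."""
    return {
        column: mask_value(column_domains[column], value)
        if column in column_domains
        else value
        for column, value in row.items()
    }


if __name__ == "__main__":
    row = {"id": 42, "first_name": "Grant", "city": "Boston"}
    print(mask_row(row, {"first_name": "first_name", "city": "city"}))
```

Because the replacement is deterministic, equal source values stay equal after masking, so joins and repeated loads line up. The flip side is that different source values can land on the same list item, so a column with a uniqueness constraint would need a different approach than this one.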
