The Curse of Relational Databases

Let’s face it, none of Information Technology is easy. Oh yeah, there are those few geniuses that have an absolute grasp over some small aspect of the stack, or those other geniuses that have a very shallow knowledge level, but understand the entire stack. But the stack itself, it’s vast, deep, wide, utterly unfathomable. So what do you do? You cheat. You take shortcuts. You ignore things you don’t like/understand/appreciate. And then there’s all the things you just don’t know. Or, you cheat another way, you get experts that have drilled down on a particular technology so that they’ll provide you with the knowledge you need. Ah, but then you have to listen to them and what happens when your local genius (deep or wide) doesn’t agree with your hired gun? Do you override your local person for the hired gun (I’ve seen this happen a ton where consultants were favored over in-house), or do you go with your local person (I’ve also seen this where the local person who has solved all the problems before may be over their heads now, but they’ve always been right and are therefore trusted)?

I just read (and I mean I finished about 90 seconds ago) this really interesting article on The Curse of the Excluded Middle. I won’t even pretend to you that I understood all of it. But I did get a pretty fundamental concept out of it: this programming stuff is very hard, we’re going to take shortcuts to get through it, and those shortcuts come with a cost. The argument being put forward isn’t to somehow find a magic solution. It’s simply to acknowledge that there really is a cost, maybe even a cost you don’t completely understand. Further, that cost, and especially your lack of understanding of it, will come up and bite you on the behind.

Which brings me around finally to developers and databases. Relational databases are a pain in the bottom. They really are. Speaking just of SQL Server (where I spend most of my time) you have to work with a ridiculous, archaic language, T-SQL, in order to manipulate the data. And the rules of normalization, yeah, we can all learn them, but applying them makes every single aspect of coding harder. Plus the language lets us do things that it then interprets in horrendous fashion. Oh, and don’t forget all the obscure and weird maintenance and configurations that you have to go through to keep the silly servers online and functioning correctly. Then there’s the whole object/relational impedance mismatch thing to chew on our behinds even further. In short, I completely understand why developers would like to burn the entire edifice to the ground (come see one of my presentations when I talk about the “data persistence layer” that a particular dev team wanted to build). And all that is just the technical side of this mess. I’m not even going to address the personnel issues that come with the different focuses of responsibility between a developer and a DBA.

So when the developers bring in an Object Relational Mapping (ORM) tool or they explicitly attempt to slap out at DBAs by going after a NoSQL database (and no, despite the new twist, it means NO F’ING ESSQUEELL, instead of Not Only SQL as many are saying now), I understand why they would do this. It short-circuits all the issues. We get around the problem. We speed development by eliminating that thing that we didn’t completely understand and certainly didn’t like and…. Hang on… Isn’t there a darn good chance we’re digging a hole here?

Yes.

Don’t get me wrong. I see the need for unstructured data stores, ID/Value pairs, speed over consistency, speed over durability, the need to move fast because your competition is sure as heck trying to move fast. So NoSQL databases serve an absolutely valuable purpose and, used correctly, fix unique and difficult problems. A well-structured ORM properly applied absolutely saves development time. But there’s this nasty little surprise hidden behind the need, the sometimes seemingly desperate need, to completely get rid of relational storage. That surprise? Relational storage actually works and works well when applied to the appropriate problems in the appropriate ways. It provides a means of collecting information fairly quickly (although not as fast as many NoSQL databases), storing it efficiently (although maybe not as efficiently as some object databases), and returning it to the users on demand (and here relational does stick out again). And it does it all in one place, not one for collection, another for reporting, or some of the other strange perambulations I’ve seen people going through with some NoSQL implementations (again, not all, some are awesome, but many are horrific).

About twice a year I get to read a “death of the DBA” article that points to a technology or process or tool that’s going to eliminate the need for those nasty, ugly, difficult, relational databases and those freaks who try to keep them online and available. And about twice a year I see lists of the most needed workers in IT and guess what’s almost always there? Yep, DBAs. The fact is, relational storage does work. And instead of trying to eliminate it, or the DBA, or the code necessary to interface with it, embrace the stuff and learn to use it, or hire someone who actually knows how to use it and then listen to them. I’ve just seen too many places where the need to eliminate relational storage and DBAs is driven by one of two things: “I have a shiny new hammer and everything is a nail,” or “databases and DBAs are a pain because they make us do stuff we don’t want to, so let’s bypass them.” Those are almost precisely the wrong reasons to go about moving to a NoSQL implementation, because you’re going to be ignoring stuff, as the Curse of the Excluded Middle talks about (and I know, it didn’t talk about databases, I’m extrapolating, hang with me here), and the things you ignore, or worse yet, don’t know about, are going to hurt and may hurt badly.

12 thoughts on “The Curse of Relational Databases”

  • Dave Wentzel

    We are retrogressing as a profession. Everything is in a constant cycle. We are relearning what we thought we unlearned in the Seventies. I remember working on network and hierarchical dbms’s as late as 1995 and hated it. SQL and relational was so much better for business people who ONLY cared about what was functional and didn’t need to know intricate ISAM/VSAM b-tree traversal crap. Now we are back to writing mappers and reducers for NoSQL that require an IT guy to write, again. Although Hive and Pig are getting much better.

    But, there’s a bigger problem no one wants to talk about.

    In every case where I’ve seen a NoSQL proposal the overarching reason (once it was broken down far enough) was always trying to get around “DBA processes” and getting change in the database faster. The business needs new tables and cols and the DBAs are viewed as impediments. After a while people will eventually invoke Godwin’s Law when talking about their data people. Not just the coders saying this, the business units too. So, if we can get NoSQL in the door we don’t need all of this process around schema changes.

    That’s a shame because there are good uses for NoSQL (in memory key stores and graph dbs) that aren’t political. Every NoSQL vendor I’ve seen starts their Prezi with “you don’t need a DBA anymore.”

    Data professionals might be their own worst enemies. I just wish we would all realize this.

  • I don’t disagree with you at all. I’ve got several presentations I’m doing now on implementing database deployments through continuous deployment and other DevOps and ALM mechanisms, all to bring database development and deployment into the 21st century. We can be fast, agile, and protect the business at the same time. It just takes work and discipline. In short, it’s hard. Which means, it might not happen.

  • Dave Wentzel

    I’ve got what I think is a really cool and totally different approach to evolutionary databases. I’d love to share it with you and you can use whatever pieces you find valuable. As an example of its power, it can upgrade almost any schema with just a few seconds of downtime and no goofy DDL or dacpac weirdness. CI loops work great. If interested, shoot me an email.

  • Lee Markum

    I’ve been doing an increasing amount of DBA work for three years now and was just moved into my first role as a DBA in March. I pursued this role precisely because I saw the need my company had for it and because the work seemed to make a lot of sense to me. I saw gaping holes in efficiency of management of the few SQL instances we do have and I knew it would only get worse.

    As I have applied my knowledge, I have noticed that questions I raise sometimes slow things down. But wouldn’t you want a process to slow down if someone saw danger ahead and could explain the looming danger?

    I have also noticed the opposite effect of my work. The things I do often make processes better and development’s work easier and faster. For instance, copying and moving SSRS databases from a SQL 2005 instance to a 2012 instance I configured. That process made it possible to transfer hundreds of reports and subscriptions in a few minutes. Or what about backing up and restoring a database on a schedule so that it can be reported from and take load off a busy OLTP system?

    Everyone wants the fastest development possible so their company can beat the competition to the punch, but getting to the finish line first and winning the customer does not mean much to the customer in the long term if what you offer them is a pieced together app that hobbles along because the company ignored the people trying to make things better for the long term.

  • Dave Wentzel

    Hey Lee, I definitely agree with what you are saying, but most business folks don’t, IMHO, at least at the places I’ve consulted. We are always told to cobble things together and then add the “technical debt” to the backlog to fix it the right way later. Problem is, when the backlog gets prioritized later the technical debt is given short shrift. It’s only when the technical debt results in downtime and unplanned work that the technical debt will be paid down.

    I guess my point is, when viewed by the decision makers, your “good way” doesn’t fall anywhere on their value stream.

    I’m not trying to be insulting, just wondering, like you I think, how we can change this as an industry? I think one way is to not slow things down, even if they ought to be slowed so things can be thought out and planned better. We (relational guys) need to be thought of as being as flexible as the NoSQL guys.

    In other words, when “DBA” is uttered in the hallways by the executives, a happy face is seen instead of some muttering about slowing down the Java guys implementing that new iPhone mashup.

    More importantly, the issues you raise like moving databases, somehow need to be presented as being directly beneficial to the bottom line. I wish I knew how to do this. Unfortunately the execs don’t want to hear whining about new tables not being in 3rd NF.

    • Neither, in my opinion. I think relational databases work great, when used appropriately. I think ORM tools work great, when used appropriately. I just think a combination of grumpy, sleep-deprived DBAs who tried to stand in the way of progress instead of helping it along, and developers who were ignorant of best practices and methods, has made relational databases seem like a bad thing that slows down development and hurts projects. It’s just not true.

  • Yaroslav

    Quoting last comment by Grant “I think relational databases work great, when used appropriately. I think ORM tools work great, when used appropriately…”. I couldn’t agree more with you. Many of the problems I see come down to the one you mentioned: having a shiny hammer and seeing everything as a nail. And ofc, DBAs are scary as hell and better not talk with us 😉
