May 07 2013

How to Tell Your Windows Azure SQL Database Moved

The very concept of the Windows Azure SQL Database (WASD) is predicated on the up-time created by having three active copies of your database. Should there be a hardware or software failure that would cause the primary replica to go down, your database gets moved to one of the secondary replicas which has been maintaining a copy of the data from the primary replica. In theory, the worst thing that happens is that you have to retry a query. In fact, building that sort of resilience into your code is a pretty fundamental aspect of working with WASD.

I was asked the question, how do you tell if your database has been moved between replicas. I didn’t have a complete answer, so I set out to find one. The first, more obvious part of the answer, is that you’ll receive a connection error to your app while this event is occurring. Microsoft has a very solid white paper on how to deal with your connections, Windows Azure SQL Database Connection Management. Scrolling down you can see that you’ll receive a very specific error number depending on why you lost your connection. These errors include 40197, which is the number you’ll receive during any of the events that trigger a failover to the secondary replicas. Answer done, right? Well, not exactly.

I knew about the error code, but the real question is, how do you tell if the move occurred while you weren’t connected? Is there a dynamic management object or system view that shows this information? The first thing I thought of is sys.event_log. That’s because it stores aggregated information about connectivity and errors such as deadlocks and throttling. Surely, if you were moved it would be there. But, reading through the documentation there’s no indication of it. And, it looks like that information is stored within the database and is therefore going to be moved (unlike all the information in cache which just goes away).

Speaking of cache, you could try running queries against sys.dm_exec_query_stats to see when the oldest compile time of any given query might be. But, remember, we’re on a shared server with a hundred other databases. Cache is highly volatile. I don’t think I’ve ever seen more than about one day’s worth of information within the cache dependent set of DMOs. In fact, I seldom see more than about an hours worth there and sometimes quite a bit less. So that’s not a possibility unless you set up a mechanism for monitoring your servers and can see a massive delta in what’s in cache between one look at the query stats to the next. But that could still be cache volatility.

Looking through the rest of the DMOs and system views I couldn’t identify anything that definitively answers this question. So, at this point, I’d say, if you don’t get the error, there is no real indication that you’ve moved from a primary to a replica. On the one hand, it concerns me that I don’t know. On the other, who cares. The whole idea behind this approach to development is that the guts of the things are handled by others allowing you to concentrate on building the database and the app code appropriately.

For lots more about Windows Azure SQL Database, sign up for the pre-conference seminar, How to be a Successful DBA in the Changing World of Cloud and On-Premise Data, at Tech Ed 2013 in New Orleans or Tech Ed Europe in Madrid.

 

8 Comments

  • By Feodor Georgiev, May 7, 2013 @ 2:32 pm

    Grant,
    a while back I wrote a blog post called “Fiddling with Azure”, which was a result of a total boredom during the winter months in the North.
    In that post I was trying to figure out some internals of Azure by poking around.
    It actually shows a way to find out the names of the servers on which the database is hosted.
    I do, however agree completely with you that it does not matter too much where the database is, as long as it is accessible.
    Here is the post: http://sqlconcept.com/2012/11/23/fiddling-with-sql-azure/

    Feodor

  • By Grant Fritchey, May 7, 2013 @ 2:59 pm

    Excellent feedback. Thanks Feodor.

  • By Herve Roggero, May 11, 2013 @ 2:51 pm

    Hi Grant – Good article. To find out if my database has moved since the last time I checked I fetch the database ID of my database: SELECT db_id(). When you fail over, chances are your database ID will be different because you are on a different server. It’s not a perfect solution because there is a chance that you end up on the same Database ID, but considering there are typically hundreds of databases this approach works most of the time.

  • By Grant Fritchey, May 12, 2013 @ 7:17 am

    Thanks Herve. That’s a good idea. Wish I’d thought of it.

  • By Mike Lewis, May 13, 2013 @ 4:14 am

    Hi Grant,
    Another indicator of a move, is to monitor the percent full a database is, and the even more granular index fragmentation. If you see a jump up or down in these numbers the chances are you’ve failed over to a secondary. The reason being that a secondary is likely to have different fragmentation levels – secondaries are trashed and rebuilt relatively regularly behind the scenes, which defrags the indexes.

    If you couple this info with the suggetions above, you can be pretty sure a failover has occurred.

  • By Grant Fritchey, May 13, 2013 @ 6:34 am

    Thanks Mike. More good info.

Other Links to this Post

  1. Windows Azure Community News Roundup (Edition #65) - Windows Azure Blog — May 10, 2013 @ 6:00 pm

  2. Windows Azure Community News Roundup (Edition #65) | McCloud — May 13, 2013 @ 12:06 pm

RSS feed for comments on this post. TrackBack URI

Leave a comment