Life/Work Balance

Technology, especially information technology, is the greatest thing to ever happen to mankind, freeing us from toil and drudgery. Technology, especially information technology, is a pernicious evil taking over our lives forcing us to work harder and longer. Depending on the time of day, the day of the week, my mood, my wife’s mood, or the direction the wind is blowing, either of these statements could be true.

The fact is, I love technology and I do have to wrestle with keeping it from taking over my life, but only because I have so much fun with the toys that technology brings. You want to know how much I love toys? Ask me about my Droid sometime. Pull up a chair. We’re going to be here a while. The trick is finding that sweet spot, where you use the tools presented to you in order to enhance your life while enhancing your work. Just enough of each and you can be a hero at home and on the job and have a blast doing both.

The one thing I really hate about being a DBA is being on call. I’m not sure why, but most systems fail at one of three times: right when I’m going to sleep, so I get to stay up another 1-3 hours fixing the issue; around 3AM, so I can spend about 1/2 an hour figuring out how to log into the network before I spend 1-3 hours fixing the issue; or when I’m halfway up a mountain with the Scouts, in which case I just have to call the boss and get someone else engaged (and yes, I do prefer these last failures to the others). The real trick here is to get your systems set up so that you don’t have constant emergencies, regardless of the time of day. How do you do this? Proactive monitoring.

Red Gate handed me 10 iPads along with 10 licenses for SQL Monitor, their new monitoring tool. I’m to give these 10 devices away to the best responses in the comment section of this post to the question I’m going to put to you shortly. That’s right, you can get out in front of the issues you’re running into, avoid those calls from work whenever it is that you get them, and get an awesome toy at the same time.

The goal is life/work balance. Notice which one I put first. That’s the priority. Here’s your question:

What do you think the most common cause of server outages is, why, and how would being able to monitor your systems remotely help solve this issue, thereby improving the quality of your life?

The contest runs from now until 11:59 PM, December 17th, 2010. Please reply below, but keep it pithy. Don’t publish your version of War & Peace in the comments (I might delete it). I’m the sole judge and arbiter (which means, I probably will delete anything resembling War & Peace). One entry only. Make sure there’s a means of contacting you in the post, or I’ll give your iPad to someone else. Remember, pithy is our watchword. You can answer this question in three well-constructed sentences. If you win, I’ll want to get a picture of you using the iPad to monitor your systems remotely. Plan on sending me that picture by January 31st. An interesting picture. Something with you sitting in your cube at work just won’t fly.

That’s it. I’ll announce the winners in a new post on the blog at the end of the week. Here are the official rules:

  1. The contest is open to professionals with SQL Server monitoring responsibility. Entrants must be 18 years old or over.
  2. Entries must be received by Friday, December 17, 2010. The contest organizers accept no responsibility for corrupted or delayed entries.
  3. Employees of Red Gate, the contest organizers and their family members are not eligible to participate in the contest.
  4. Entries are limited to one per person across the three simultaneous contests hosted on SQLServerCentral.Com, BrentOzar.Com, and ScaryDba.Com.
  5. The organizers reserve the right, within their sole discretion, to disqualify nominations.
  6. The organizers’ decisions are final.
  7. Red Gate Software and those involved in the organization, promotion, and operation of the contest and in the awarding of prizes explicitly make no representations or warranties whatsoever as to the quality, suitability, merchantability, or fitness for a particular purpose of the prizes awarded and they hereby disclaim all liability for any loss or damage of any kind, including personal injury, suffered while participating in the contest or utilizing any prizes awarded. 

133 thoughts on “Life/Work Balance”

  • To me, the most common cause of server outages are the issues you don’t see and/or can’t react to. They may be very easily resolved if caught quickly, but if allowed to fester for a few hours they can turn into a much larger issue. All the examples you list about when systems fail seem to occur during times when you’re not at your desk, which makes perfect sense. When you’re at your desk, monitoring for problems can be as simple as glancing at a monitor. When at home or on the go, it’s not quite as easy – you have to get to a computer, log on to a VPN and open a remote desktop session to monitor things. Not only does this involve a few steps, but if you’re not at home or near a computer, you’re hosed.

    Having a device like an iPad would make remote monitoring easier, which means it can likely be done more often, therefore catching more minor problems before they blow up into emergencies. The convenience of the iPad would mean that finding a computer would never be an issue – you wouldn’t even have to leave the room (or perhaps the mountain!). For these reasons, a cell phone could also make a great monitoring tool, but the screen size is the deal breaker. The iPad’s amazing portability exceeds that of a laptop by far, and its larger screen size means it would be incredibly convenient as a monitoring tool.

  • WIDBA

    The supreme cause of outages in my experience is poor application code. The reasoning is a blend of developers not testing, not understanding T-SQL and a lack of discipline when putting together deployment packages. A portable solution for monitoring would enable me to support and respond to server issues all the while teaching my kids the art of fishing or foraging for mushrooms in the woods.

  • Many a time I’ve seen outages occur simply because there’s BEEN an existing problem that you simply weren’t aware of because you didn’t have the insight (read also: lack of proper monitoring). Having a tool that can give you constant feedback in an easy and quick manner can save you TONS of headaches in the long run since you can then be more proactive than reactive. Just my $0.02 USD 😉

  • Ward

    I am responsible for monitoring many small databases for many different clients. With small clients, the servers must multi-task (example: a server that is the SQL machine, file server and primary DNS). We use virtualization at many clients, but this has caused different issues. We have a system that monitors event logs and emails flagged events, but with the number of clients we have and the number of emails that get sent because of the non-SQL events, the information gets lost. Since I don’t have control of what is on the servers, I need a way to see what is important to SQL, without seeing all the other things I don’t care about. It is email overload to look through all the messages the system sends.

    The most common cause of outages in my environment is disk space. With the servers doing so many different things, I don’t always have control of what is stored on the server.

  • Erik

    On the development side, the most common cause is under-trained developers hacking together something that “works” without considering the outcome. This will either fill the log files and crunch disk space, or kill the processor/memory because someone read something once that a CURSOR could do this task that’s called 20 times a minute all day.

    Had I been able to remotely watch when a developer who decided to pull a late night committed a stored procedure change, which in turn crashed the box, I might have been able to stop it much sooner, and not have had to spend a few hours going through the logs.

  • I, The DBA, have been a cause of server outages. Yes, that’s right – I said it.

    How’s that? I just didn’t realize my servers were sick. In the early days I didn’t know what symptoms to look for. When I finally started to know what to look for, I knew enough to be busy with other tasks and didn’t always look for what I should have been looking for (It’s kind of tough to do fire prevention when fighting a four alarm structure fire. It’s also kind of late…).

    So yeah, I think the majority of database server crashes have a common pattern – folks just don’t know they have an issue brewing, they only know when the issue is ready for consumption.

    Being able to monitor an environment (with packaged software that someone else maintains, not roll your own scripts that take time away from) means we are told when things start heading south. It means we are able to see a baseline of that path coming together. Quality of life? Reactive time goes down which means stress goes down and ruined family time and holiday occurrences go down. That means I don’t get -that look- so much at home. 🙂

  • We all know machines break and parts fail, yet many IT people forget to do regular health checks on their database, their servers, and the systems those servers are plugged into. We allow ourselves to get busy, be distracted, and worry about the “more important fires,” then we wonder why something breaks out of the blue. Remote monitoring of servers would allow me to know of potential issues before they become issues. During a break at SQL Saturday (or Rally), I’d be able to step aside to resolve the issue before the outage occurs. SQL Monitor would take the surprise out of the equation, making my job, and my life, much more pleasant.

    Not to mention I’d have time to write more SQL articles for SSC. @=)

    You can contact me at the email attached to this post or via PM on SQLServerCentral.com for user “Brandie Tarvin.”

  • Karen Lopez

    I would use SQL Monitor on an iPad to identify potential server issues without having to drag myself out of bed in my Hello Kitty / Computer Engineer Barbie themed boudoir.

    See, I support systems that have no monitors in place. Systems gauge the time to fail spectacularly while I’m on a plane, on vacation, sleeping or attending conferences. I think the systems themselves check my calendar to find the worst possible time to fail.

    So if I had an iPad with monitoring tools, I could keep the servers in line way before they were able to stage their next coup. Leaving me more time with Hello Kitty and Barbie, reading blogs, learning more about how to keep those servers under control.

  • EvilPostIt

    In my humble opinion…. SQL Server failures are caused by fools…. Now this is not a bad thing as all of us are fools at some point or another. These failures could be caused by developers / users not knowing what SQL actually does under the hood, improper monitoring in place, or any number of other factors.

    Every single DBA has had at least one time when your heart drops into your stomach and you think “oh no, undo, undo!!!!!!”.

    Being able to remotely monitor this with an iPad would mean that while watching the SQL Infrastructure I could watch catch-up tv and play angry birds at the same time, thus restoring my life/work balance!

  • Simone Burcombe

    Our department has recently implemented a new SLA to cover after-hours support for critical systems. We’ve done this unofficially in the past, but now it’s in writing, and we’re compensated for being on call. The hardware and network guys are able to rotate night duties, but in our midsize company, I’m the only one looking after data and databases. At the same time, I have a 6-year-old son who is very active in extra-curricular activities. I wouldn’t for the world miss being there with him for all of the “Watch me Mom!”s. Not to mention that the little people Ukrainian dancing is especially entertaining! While I can take trouble calls via cell, there’s really very little I can do for at least a couple of hours. It would be great to be able to not only monitor the SQL servers, but more importantly to resolve any issues before they escalate. That kind of life/work balance is precious indeed.

  • Mike Robinson

    Most of the outages I see are due to improper planning across the IT department. Every app seems to require a database, many want their own dedicated server, different versions, etc. Over time this grows to where you can’t easily stay on top of all the servers any more.

    Looking at SQL Monitor, it would be simple to set up alert triggers to go to your e-mail, then be able to look at the problem remotely and decide what the best course of action would be.

    The server is going to have a problem anyway. I’d rather know about it before it happens, or at worst before I get in at 8am and everyone’s breathing down my neck to get things fixed.

  • Outages? From experience: (1) The hard drive fills up from sudden database growth, or transaction logs keep growing due to failing log backups. (2) Important service account credentials change or the account disappears. (3) Deadlock in competing bulk transfers overnight or on the weekend. (4) Overrunning connection limits. (5) Power outages and hardware failures. (6) Trends that build up over time and have abundant warning signs but go unnoticed in the event log. With monitoring, all of these issues can be detected, and it is really too labor intensive and sometimes too diverse to effectively babysit without automated monitoring. The iPad then makes monitoring not only automated, but my ability to view it portable. I can step away from my desk and into a conference room for a meeting and still be “connected.” I think such portability will be phenomenal! My response-center simply goes with me.

  • More times than not, we get outages simply from a lack of communication. Right hand doesn’t know what the left hand is doing… things get missed this way in planning, changes don’t get communicated, and ‘fixes’ don’t get applied.

    A device like this and SQL Monitor would help start to centralize our process and communication. If I am out of the office, I can commit to 5 minutes or so with the device to check in and communicate because I would likely be playing on the device anyways. Sometimes taking a couple minutes to communicate with the team and get everyone to slow down and think can have a mountain of payoff.

  • Jeff Tetzloff

    I live in Wisconsin. When we get buried under 2 feet of snow, I shouldn’t have to shovel my way to a cubicle to find out we ran out of drive space because maintenance jobs blew up. I should be able to do this in the comfort of my home, with my 9 month old crawling over me while I do it.

  • Black Magic.

    Seriously.

    Most of the outages I have seen (and participated in, been the cause of, etc) have not been the first time that the outage has occurred (or that the symptom has been displayed for a deeper problem). In most of these cases we treated a symptom or set up a new problem simply because there was no visibility into the server to see signs of ill health, poor performance, or any of a dozen other signs. Had we seen the signs we could have gotten ahead of the issue or even solved it before it became one.

    So I am going with Black Magic. It’s when we apply a fix (like the daily reboot or the 3x reboot) but don’t know why it works. When we use a consistent incantation (nolock hints) only because we know that deviating causes bad things to happen. When we do what we can because when the choice is between bringing that server back up or leaving it down while we try and root cause it, the business is always going with bring it back up.

  • Being able to get this set up and running for my DBA would cause me to not have to hear the stream of obscenities that typically flow out of his mouth when he’s bothered to monitor a server and he’s not near a wi-fi location. While he does pull into local cafes I’m pretty sure that all he’s really doing is drinking some high end coffee and not actually doing anything useful. So this would at least make it appear like he’s more productive. It might not stop him from getting hammered and dropping tables that have names that annoy him, but it might help.

  • I’m a lazy DBA. I want my servers to tell me when something’s wrong. (I started writing PowerShell scripts for this reason.) SQL Monitor is doing this at my client sites where it’s installed. I like it.

  • The number one reason I was called to work on a down server was, seriously, DBAs that didn’t care. I know, not you guys! I worked for a MAJOR company that just loved Oracle. This one DBA, on the side, ‘managed’ all the SQL Server instances appearing. But she didn’t have the time to learn what SQL Server liked and they didn’t give her the time to learn. So logs overflowed, resources were limited and indexes completely fragmented. The final straw was she had to be the Windows Administrator on boxes, installing and configuring it. Can you say spread too thin without resources to handle new tools?

  • One of the most common causes of problems in my environment has been new code being released in production that I didn’t know about. I know software can’t solve communication issues, but with a complete monitoring solution, I could see side-effects of these pushes hopefully before the end users see it. That way, once I go home for the evening, I don’t have to worry about that midnight process bringing the server to its knees… and then having to drag my sleepy butt out of bed!

  • The majority of outages that I’ve dealt with were because of a simple oversight, on someone else’s part or on my own.
    Here are a couple examples.
    Write a script to delete C2 Audit logs from filling up the default location on a new server, but the path was not fully correct. A database in a has unexpected growth, and the online re-index, sort in tempdb job blows out the tempdb.
    Code that captures vastly more data than the Database was sized for, or the code was not properly tested and failed to close its database connections after it made its call.
    The 3rd party Vendor app that gets new features enabled without proper vetting.
    A Server Engineer decided to ignore the repeated emails about the errors you were seeing in the error logs, until the hard drive goes belly up, and when mirrored Raid HD goes down, and the Engineer pulls the wrong drive, leaving you to backups, prayers, and a nice 24 hour work day on Fathers Day.

  • Jonathan Beckham

    Most server outage issues are caused by unknown use cases that systems weren’t designed/developed to handle appropriately. Using software that allows you to monitor your SQL server would allow for you to analyze cases you never even thought of, proactively react to them and resolve them before they become an issue on Saturday night when you’re supposed to take your family to that thing you promised them you would take them to for 6 months and your kids are crying because they can’t go since daddy has to work now. Proactive monitoring and resolution saves marriages and families improving work/life balance. 😉

  • Emily Johnson

    Our usual suspects are application databases that we don’t have the time or personnel to worry about. Every once in a while a “canned” app sneaks by us, designed to run on its own dedicated server rather than playing nicely on a shared server. We have high-level monitoring on disk usage, but nothing more detailed than that. It’s never good when an application owner knows about database problems before a DBA does! Another pest is a hung backup due to files locked by other server processes. This can consume the whole server’s CPU until we find out and resolve it. This isn’t a hassle while you’re sitting at a desk at work, but how often do these problems occur from M-F 9-5? Never!

    I’m on the younger side of the DBA age range and tend to plan my free time activities on a whim. Being on call is a necessary evil in the life of a DBA, but needing to drag a backpack and laptop everywhere makes it all the worse. I’ve turned down many a football game, road trip and other fun things for fear of not having an outlet or flat surface in case I get paged. Using SQL Monitor on an iPad would allow me to check up on our SQL Servers, anticipate issues before they happen, and respond to alerts quickly from anywhere. This means actually being able to have a life! Being a DBA no longer means needing to be chained to a computer screen, and that is VERY good news.

  • In my experience and humble opinion server outages occur due to either not following best practices or by not being proactive and constantly monitoring server health and performance. Best practices from your product vendors and community experts are real-world proven implementations that give you a recipe for success.

    Proactive monitoring allows you to identify potential areas where there is a risk of failure or performance bottlenecks before they happen. In the case of Microsoft SQL Server, this is very important because there are so many variables that could contribute to a system outage, data loss or performance degradation. Being a DBA you may already know how cumbersome and time consuming identifying and keeping track of all these variables can be, especially at 3am in the morning!

    A good monitoring tool such as Red Gate’s SQL Monitor can facilitate proactive monitoring remotely, from home or on the road! Being able to get alerts when things go wrong or even before they happen will definitely make my life easier and let me enjoy family life, get a good night’s sleep, have a fulfilling career, pursue other interests and look good in front of my boss by having a smooth running server environment (or at least the perception of one).

    How many times have DBAs and IT professionals scurried in the middle of the night or during a family vacation when issues arise with your SQL Server environment back in the office?

    How much time of your life do you spend troubleshooting and putting out fires on your SQL server environment?

    Monitoring tools like these will definitely let me enjoy and balance life.

  • I’ve got a full-time job where the systems are fairly stable. We have coding and configuration issues come up from time to time, but our SQL servers are already fairly well maintained and monitored by Red Gate software. I am however writing a web-based game for myself. Because I already have a full-time job where I’m working plenty of hours and on-call 1 out of every 4 weeks, anything I can get to make the server for my website work more smoothly means more time I can spend either developing the site and adding new features or chilling at home with the wife and cats. As for the question of how I would use this remotely, well I don’t actually have a physical server for my website. I rent a virtual so it’s impossible for me to get physical access. I currently log in remotely via one of my desktop computers at home, and it would be incredibly handy to have an iPad to be able to handle server issues either on my lunch break at my real job or if I’m off in the wild. Heck, maybe I could keep the iPad on my nightstand and that 3am email to my cell from SQL Monitor can be handled without booting up the desktop. That’d be a treat!

  • What do you think the most common cause of server outages is?
    Hardware Failure 70%
    Software installations and updates 20%
    Humans doing the wrong thing 10%

    Why and how would being able to monitor your systems remotely help solve this issue thereby improving the quality of your life?

    This product allows you to respond to issues from anywhere, without having to skip out on lunch, dinner, etc. A quick walk into the hallway to handle an emergency and then you’re back to your event. Limiting the length of interruptions and seeing the issue with your ‘own eyes’ can help speed up the resolution and get you back to what’s important when you’re in or out of the office.

  • In places I have worked I believe it is lack of initiative by either the company or the employee. Most people I have been around come to work, do their time, and then leave. I understand work takes you away from what you may be wanting to do (sleep, spend time with family, watch re-runs of Andy Griffith), but you should take pride in everything you do at work and outside of work. When companies get past a certain number of servers they have to take that initiative and help themselves. Monitoring tools are available for almost every piece of network equipment and then some applications (SQL Server being one of them). If the data is important to the company and would break you if you lost it or it went down, then there is no reason not to implement some monitoring tool. Now if that company offered me the ability to fire up my iPad and open up Red Gate’s SQL Monitor without me having to leave my episode of Scooby-Doo…you just offered me more initiative to do my job better and more thoroughly.

    I can be reached at wshawnmelton AT gmail DOT com or @meltondba on Twitter

  • In my experience most outages are either caused by something I should have caught earlier or because of an external issue like power or cooling. 9 times out of 10 it takes me less time to fix the issue and get the system up and running than it does to power up a laptop, connect to the VPN, and get RDP’d into a machine with the tools I need. With an iPad loaded with SQL Monitor I suspect that the time required to get connected to tools that let me fix an issue will be dramatically decreased – letting me get back to my really important job – playing and interacting with my kiddo. I would much rather watch a ballet class than watch a SQL server on the weekend.

  • Adam

    This is a little off the center of the question, but worth considering.
    One great way to give employees a good life/work balance is to allow, where appropriate, tele-commuting (something I have been doing for over 4 years now). It doesn’t work for everyone, but those who it does work for get the benefit of an hour or more (drive time) of extra life (or work if needs be) every day they are on the job. The tele-commuter is in much more control of their environment (think feng shui) and potential distractions. Further, for the regular tele-commuter who is off the clock when a server goes down or some other emergency arises, they are quite used to working remotely, and there is a greater chance they will be able to hop on the problem in minutes from their normal office as opposed to having to drive in, potentially late at night. I imagine a greater life/work balance could be achieved when the simple outages only interrupt the admin for a few minutes as opposed to having to cancel dinner (or something similar) and spend an hour in the car to just kill a deadlocked process (or the like).

    I can’t comment on how to determine when and for who telecommuting is appropriate, but when it is, my experience shows that the around the clock work often expected of SQL Server admins is much more enjoyable and keeps your job from being something you detest.

  • Sam Greene

    Most database outages arise from a lack of information, which we have available, but may have a hard time sorting through. Information that a job failed, that a query from a vendor almost killed the server, that the transaction log drive is filling fast, or that the monthly patch will leave users unable to print their favorite report in IE.

    All this information can leave you feeling a bit overwhelmed. Not being able to handle it properly leaves us with frantic days and sleepless nights.

    Being able to KNOW my 20 servers are happy will make my work day easier and my time away much more relaxing and who knows – maybe even carefree.

  • Dave Mulanaphy

    I find the most common cause of server outages is not having a good insight into what is going on with the database server as a whole. Any tool that can give more insight into what is going on with the server is great! That being said I would use SQL Monitor in a different way, to help with the life/work balance. When my wife tries to get me to go to a birthday party for a one year old I don’t even know or something else like that, and let’s say the Patriots are playing the Bears in Chicago and they could clinch a playoff spot, then I can open up my iPad, connect to SQL Monitor and say “Sorry, but you see all these messages on SQL Monitor? It means that I have to stay home and work.” Even if there is no real issue I am sure I could find something in SQL Monitor that needs “immediate” addressing. Yay for technology.

  • Pamela Hurd

    The backups failed, disc is full….Oh, Noooo

    The AS400 app was updated last night and the SQL Server loads are late. Everyone wants their reports, like, right now….Oh, Noooo

    #&@*#@ Finance has the tables locked….Oh, Noooo

    I am the only DBA and I am going on vacation next week, the developers are my backup….Oh, Noooo

    Wait, if I had an iPad and SQL Monitor I could easily monitor my systems through a web app. I can be proactive. I can be a (GASP) hero.

  • Gregor Suttie

    SQL Server outages are normally due to backups running at silly o’clock in the morning (quiet period) and then running out of disk space causing the server to barf. Using SQL Monitor I would set up alerts to check the disk space so that when the backups run I can be notified sooner if the disks are starting to run out of space instead of having to check emails regularly (especially at weekends).

    An iPad would allow me to check the status of the servers quickly and regularly by checking the alerts I would set up within SQL Monitor.
    This would even allow me to check the health of the SQL servers whilst walking the fairways of the golf course at the weekend.

  • Our company offers SaaS for inventory and order management that needs to be available as close to 24x7x365 as can be achieved; we must therefore be on top of operations all the time. Besides the usual equipment failures and unanticipated consequences of code changes, our users are very good at finding new and exciting ways to disrupt our systems. That’s where monitoring and quick response to exceptions come in.

    All of our servers reside in a datacenter so it really doesn’t matter where we are — we are equally close to the office whether sitting in a cubicle or a coffeeshop, the only difference being the presence or absence of a desktop/laptop/netbook/smartphone/coworker/boss. I have had very positive experiences using remote connectivity applications (LogMeIn and Citrix) from an iPhone to manage exceptions regardless of where I might be and an iPad would offer the performance and screen size to make responding to problems quicker and easier. Also, having a single console from which to manage responses (and see what needs attention) could greatly simplify the process.

    Our corporate computing fabric lends itself to a tightly-knit weave of life and work; better monitoring tools (both software and hardware, aka SQL Monitor and iPad) would help to smooth out the inevitable tugs and tears that can cause that cloth to come unraveled. Anything that can meet a business need as well as satiate personal desire (entertainment and a coolness factor) would be a welcome addition.

  • David Stockton

    Human error – most often through poor code or mis-configuration.
    Visibility is king and I want simple, understandable metrics conveyed through an intuitive interface I can view on my terms – any device, any place, any time.

    If I get a phone call over the holiday season I don’t want to down snowballs, bring the kids inside and start a 30min+ terminal session just for a health report. I want my metrics and I want them now!

  • Bad code causes most problems: queries full of temp tables that take over tempdb and its disks; single column queries on tables with 2.5m rows; nifty dbcc checkdb jobs created using t-sql rather than SSMS; cursors…

    Getting an alert before any of these becomes a showstopper by taking over the memory and processor gives a DBA more chance to fix it and prevent it in future.

    And fixing it quickly leaves more time to drink tea: I’m English. A nice cup of tea and a biscuit is the zenith of my ambitions. Anything else is far too decadent to contemplate.

  • Chandra Khanal

    There can be several causes of server outages. Some outages are caused by poorly designed and badly coded applications. Many times, developers write the code right after getting the business requirements; they code the application badly, skip the series of tests, and lack the expertise to develop a well-tuned application. Another leading cause of server outages is not having enough information about the system and the application’s functionality, a lack of proper planning, no routine for regular health checks of the database servers, network performance, and application servers, and not getting up to speed with a monitoring tool that keeps watch over all the enterprise database servers. Another frequent cause of outages is poor planning of the server infrastructure, resulting in insufficient and inefficient server specifications, which I have seen very often.

    It is very hard for DBAs to run to the office at midnight when they get a page or a call about a server that is not doing well, for example because the transaction log is full and the server is out of disk space. I had this situation yesterday and needed to get up at 2am to fix it. There can be several other issues, so remote monitoring is essential. These days, with different technologies, remote database monitoring is becoming easier, and it will be super easy with a portable iPad, so that I do not have to go to my home office, open up the desktop or laptop, and start working from there. With this small device, I can log in to the server and start looking into it. The same is true while I am travelling or at a family get-together and not able to get to the home computer: I can log in using the iPad and start investigating the issue. I can easily monitor servers remotely from any part of the world, provided I have internet access. Being proactive and setting up automated jobs will ease our job, and these can easily be handled remotely. In the meantime, I can use a monitoring tool like SQL Doctor from Red Gate.

  • hitesh

    What do you think the most common cause of server outages is, why, and how would being able to monitor your systems remotely help solve this issue, thereby improving the quality of your life?

    The common factors that lead to server outages are:
    no disk space available, power outages, transaction log full, no management operation/archiving defined.

    Remote monitoring can bring peace to the lives of DBAs and stakeholders by alerting about an issue before it happens.
    I can still remember when I had to carry the pager and hated the dreaded beep sound.

    A monitoring tool with appropriate actions to take when an issue happens will immensely simplify life.

  • In smaller and mid-size companies, SQL Server outages usually happen because the company has employed a “hallway” DBA. These individuals spend most of their time doing one job and get called upon when an outage arises. Being able to remotely monitor their environments can provide more insight into why they are having outages. Knowing more about a system outage allows these individuals to address problems and prevent them from happening in the future, ultimately improving the quality of their life.

  • Be honest, the iPad is the glitzy part of the prize. It really is a nice piece of kit – just too expensive for me to buy.

    I want SQL Monitor – it’ll tell me when (not if!) there is a problem at work – that way I don’t have to proactively check my machines.

    This would leave me more time to help out on ask.sqlservercentral.com or twitter – using the iPad of course! The iPad being in accordance with spousal acceptance guidelines and accepted in other social surroundings, because an iPad is what the cool kids have and strictly not for geeking out.

  • Bhim

    Very common factors that cause server outages are servers running out of disk space, power outages, full transaction logs, poor database design, faulty RAM and other faulty hardware, and poor application coding practices, which can bring the server to its knees.
    It is always good practice to alert the DBA before the problem occurs. This is achieved by setting up alerts on the server so that messages are sent when a problem is about to occur. For example, we can set a threshold for disk space and alert the DBA when that threshold is met, so that DBAs can start looking into it. This is one of the ways that I monitor database servers remotely, and I have applied it to all of the servers prone to this kind of issue. With this kind of approach in place, DBAs can get peace of mind and do not have to panic when problems occur. Monitoring tools definitely help to make our lives easier.
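
A minimal sketch of the disk-space threshold alert described in the comment above, in Python. The function name `disk_alert`, the 80% threshold, and the `usage` parameter (there only so the check can be tried without a real disk) are illustrative assumptions, not part of SQL Monitor or any other tool. A real setup would email or page the DBA rather than return a string.

```python
import shutil


def disk_alert(path, threshold_pct, usage=None):
    """Return an alert string if used space on `path` meets the threshold, else None.

    `usage` is a hypothetical hook: a (total, used, free) tuple in bytes that
    replaces the real shutil.disk_usage() reading, e.g. for a dry run.
    """
    total, used, _free = usage if usage is not None else shutil.disk_usage(path)
    used_pct = 100.0 * used / total
    if used_pct >= threshold_pct:
        return f"ALERT: {path} is {used_pct:.1f}% full (threshold {threshold_pct}%)"
    return None


if __name__ == "__main__":
    # Check the current directory's volume against an assumed 80% threshold.
    print(disk_alert(".", 80) or "disk space OK")
```

Run on a schedule, a check like this raises the alarm while there is still room to act, which is the point of alerting before the disk is actually full.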

  • Our most common cause of server outages seems to be configuration type issues. What I mean by configuration is someone had their hand in deciding how something should be set up. The names have been removed to protect the guilty.

    Case in point. We have been dealing with heating and cooling issues in our data center for the past year. The existing cooling system was being taxed, so it was recently upgraded to a larger system. We had an outside contractor come in with a nice new shiny thing and it worked perfectly until… We had an issue on the manufacturing floor where someone turned off an emergency breaker to one of the labs and it caused the main breaker to trip. This unfortunately was on the same breaker as the nice new cooling system, which then caused our server room to overheat and several servers to crash. After this fiasco the configuration of the cooling system was changed, and email alerts were set up for when temperatures in the data center reach a certain level.

    This makes me think of our SQL Servers and what is going on with them and how I can be proactively alerted when issues or possible issues arise. Being able to monitor my SQL Servers remotely on an iPad or on an internet-enabled mobile device (or Droid2) is very appealing.

    It would be much more enjoyable sitting on my couch with my iPad watching my SQL Servers yawning than getting a call in the middle of the night letting me know that the Monday morning report the CFO is running is slow. The ability to rewind time to view historical data is also a huge draw. Red-Gate #FTW

  • Gaurav B.

    In my opinion the most common cause of server outages is not testing the backup strategy. More often than not, I have seen backup servers / systems which are supposed to work when disaster strikes. What is missing is periodic testing to make sure they’ll handle the load when needed. This can be caused by multiple factors (in the order of most common occurring) – Not testing fail over, out of sync application code, underpowered machine, unpatched OS / database software.

    To solve these issues, it is necessary for the operations team to test their disaster recovery strategy. To avoid impacting production systems, these tests should be done outside business hours, and that would require remote monitoring / access to the systems.

  • The most common cause of server alerts that could have led to outages has been in our warehouse: report writers producing queries with Cartesian products that fill up tempdb, the transaction log, and the disk. Proactive monitoring has helped catch these potential problems. SQL Monitor could make that faster and easier – by removing the sole dependency on email alerts. Seeing a trend ahead of time helps to nip problems in the bud. Then I can spend 15 minutes fixing it as opposed to doing damage control and recovery.

  • I believe that the answer is ‘unintended consequences’. For server crashes, it often relates to reactionary activities that are not properly thought out that solve the initial problem, but cause several other problems that might go unnoticed for several days / weeks / months etc. It is often due to these unintended consequences that life gets just a little more difficult with symptoms including: Higher stress, lack of sleep, missing family events, having to cut short social visits, not being able to take a real vacation, etc. With red gate monitor you could help to detect and triage problems by gathering base lines, setting alerts, and “checking in” from anywhere. With red gate monitor and the cool new iPad I would at the same time be more productive and have fun playing with apps all while going places I might not otherwise be able.

  • I come from two very different places.. BIG and those that want to be big.. The really big businesses have the 300 server aircon, ups, fire systems, etc.. the smaller, less so, usually enough servers to need a few small UPS’s a single aircon unit but that’s where it stops..
    The UPSs are monitored, the sites are monitored, the databases are monitored, even our bandwidth is monitored.. Disks are RAIDed, PSUs are dualled – you get my point..
    So one thing that’s taken me down in almost all the places I have had servers.. The air con..
    Unless you are in the room, you only know when the first few servers go down.. and you know it, they will all go.. I have seen the front of a sun disk array, dripping down the front of the rest of the rack..
    I got the texts – from 4 different systems.. wouldn’t that be more useful if it was all in one system, and I could plumb in a temp gauge..
    I’m talking small server rooms here that grow over a few years.. add the ups to the monitor, add the site monitoring, add the aircon.. as we grow, it grows..

  • accidentaldba

    The most common cause of server outage is me, the dba. We have about 8 datacenters around the country and around 10 databases going live every week.
    It’s a challenge to keep up with issues that cause outage:
    a. disk space issues.
    b. analyzing some query checked in by the developer that is causing grief for the users.
    c. planning for future growth.

    Ability to monitor the systems remotely should help in reducing my work load by 50%, help me to concentrate on tackling issues before they occur and plan for future growth.

  • Joe Overbey

    Often times I find that SQL outages are preventable. There is almost always either an error in a log, a perfmon counter that spiked, or a query that could have been run that would have notified me of any potential issue before it became a problem, or worse caused an outage.

    Most outages occur because something changed, whether it be configuration or the way that the system is used. Examples of changes would include an ad hoc query that was run and never committed, or a poorly written query that blew out tempdb. Another example is an index that was dropped or fragmented due to a data load and never rebuilt or re-added, so queries start timing out. As much as DBAs would love to lock down a system completely, we have to give people access to do their jobs. Finding a balance is always a tightrope walk.

    While I feel that I can multi-task with the best of them I can’t have eyes on all places at once. We have far too many SQL servers. This is where monitoring tools come into play, and I have yet to find a tool that will monitor my sql servers out of the box. Whether it be System Center, or Splunk these tools can take months to configure scripts, set baselines and thresholds that make sense before I can actually even trust that they are working. If I had a good tool to remotely monitor my environment it would have to do more than just tell me that there is a problem in order to give me a better quality of life. I already have email alerts that can tell me that. This tool would have to help me to actually triage the problem remotely. I’d want to be able to see long running queries or blocking transactions and be able to stop them if I want to. I’d want to be able to manage things like agent jobs remotely. A remote monitoring tool that I can access via a smart phone helps me not be tethered to my house. A remote monitoring tool could allow me to not feel bad if I left my laptop at home.

  • To me it is simple. Proactive monitoring is a must in my single DBA shop. There is no such thing as rotating out on call duties.

    Servers may start out life pristine and perfectly configured but as they grow and see more development and use they don’t stay in that “fresh out of the box” state for long. Having the right tools can be the difference between fixing a disaster after the fact or being ahead of the curve.

    Between SQL Monitor and the easy, always-on connection of the iPad, maybe, just maybe I can have a Christmas dinner this year with my family and not my servers.

  • As a consultant, performing SQL Server audits, I see a lot of badly configured systems. Poor DB design (or even none at all), exploding log files, and no indexing strategy are only a few. So I’d say the most common source for SQL Server outages is people, people, and people. Providing a tool to those people wouldn’t unfortunately help very much. But I, being the consultant, could make great use of a tool like this. I already see myself on vacation, sitting on the white sandy beach of Hawaii, holding my new iPad, and watching my customer’s SQL Servers, before going for my next swim. All in one central console. Easy. Earning money on vacation. Couldn’t be any better, could it?

  • Shiraz

    The most common causes of callouts in my company are backups failing, batch/job failures, disk/database space related issues and slow performance. All these problems take only a few minutes either to resolve or to patch with a temporary fix until they can be resolved properly during working hours. However, as is the norm, you get called at the most inconvenient of times, meaning you have to drop whatever you were doing, get back home or find somewhere with internet access, and get online. On average this would take about 45 minutes. Once you have logged in and fixed the issue, you then ponder whether there’s any point in getting back to what you were doing and sulk for a while.

    These problems exist and always will because of servers not being built and configured properly, disk capacity and server specs not being right, users running ad-hoc queries/reports, irregular large data imports or exports, and bad application code. Also, these days, with the global nature of companies, every database needs to be available 24 hours a day, so you need to balance user access with backups, integrity checks and rebuilding of indexes where required.

    Having SQLMonitor would mean I have quick and easy visibility of any issues that may be arising on my servers, so I can respond to them before they cause a problem.

    Having an iPad would mean that, in the time it takes to make a quick phone call or write a couple of lines of a text message, I could either look for any issues using SQLMonitor or fix the problem from wherever I may be at the time. This would significantly enhance my life/work balance as I would not need to continue with the disappearing act at every social function or leave the family stranded at the supermarket (has happened plenty of times), thereby giving me more time to enjoy my life and not worry about getting called.

    My employer would also benefit: not only would I be a much happier employee and problems get fixed more quickly but, more importantly, I would stop booking lots of overtime!

  • joe cepeda

    Great commentary on the on-call support. I think the greatest amount of time anyone on-call spends is on connectivity. Either security tokens are disabled, out-of-date, the permissions have been removed, or whatever. If network administrators could understand a little more what this frustration can do to someone’s health and well-being I believe they would be a lot more lenient about the process.
    I think the most frustrated I got with a support call was when I was in the middle of getting ready for the hottest night of my life with the nicest lady I had met and then be interrupted with a call for support from a user that couldn’t wait till the next morning. Much as I enjoyed supporting my customers, that was one time I could easily have traded jobs with a non-union low wage earning street sweeping worker. Worse yet was that the issue was with the user, dropping his dial-up connections (who still uses dial-up, anyway?) and nothing to do with the servers.
    As a support specialist, one of the most frustrating things for me is to get a support request for a database-connection-down and then have to drop everything and go to an internet access location and assist our customers. Often that involves all sorts of driving to get to a location and it ends up being more convenient just to drive to the office and provide the support there. Having an iPad would help a lot because the wifi connectivity is non-intrusive and easier to manage than on a bulkier laptop.
    However, those calls aren’t the primary reason I am looking forward to an iPad. As a volunteer in our Video Ministry at my church here in Chandler, AZ, we use Macs for pretty much any multimedia work we do there. Anywhere from watching a remote camera to keeping track of what camera is going live or getting a video snippet queued up, it all happens pretty quickly and requires someone like me to seemingly be in different locations at the same time.
    I’m a producer/director and that means that during a regular church service we keep track of a computer graphics operator, two camera techs, a changer tech, the sound board techs, lights techs, and so on. It also requires constant interaction with the stage crews, singers, and band players to make sure that the process is as smooth as can be.
    Often we end up running back and forth from the production room to the stage and then out to the sound booth and back to the production to make sure that the show is going as planned. It never happens, as thank God, we aren’t perfect, but we strive towards that goal.
    Having an iPad, for me, would kill two birds with the proverbial single stone.
    On the one hand, it would help me provide answers and bring our customers out of their frustration, a lot more smoothly and on the other, it would help me spend more time interacting with individuals a lot more instead of spending my time running around from point to point in order to accomplish my volunteer work.
    Is this something I need, or should pray for? No. I can live without it but oh boy, would it make my life easier? Without a doubt.
    Wishing us all the best.

  • Diana

    Over ~5 years I encountered crashes due to power outages, hardware failures, buggy code, bad database design, stupid user action, disk/database/log space filling up. Many times these crashes could have been prevented with a proactive attitude. So please let me bring to your attention what I consider a major root cause of the server crashes – the lack of monitoring. “Why should I pay anyone just for MONITORING my database server(s)? MS SQL Server takes care of itself”. Yeah, sure…until the next frantic call, occurring usually in the middle of the night (I live at GMT +2). I remember that last year, on December 31st, at 12:00 AM, I was trying to help one of my customers in L.A. as fast as I could…
    So monitor. Convince your customer that this is a must. Try to be prepared for the worst. You won’t succeed every time, but most of the time it will be better.
    Here being able to work “from distance” is crucial. For the big European cities, traffic is a big issue, even on weekends. And there is no way to go overseas / overocean in a “reasonable” time…
    Grant, indeed the balance should be “life/work”. If the quality of your life is not good, your work won’t be good either.

  • Tony Henley

    I feel the most common cause for failures are small stupid things. Kind of like the pebble that falls down the hillside and starts an avalanche. Most “problems” are planned for (space problem, have a backup kick off) and can even be set up to be automatically taken care of. But you can’t, wisely, put in conditions to evaluate every single condition that could happen. That’s where monitoring becomes the hero. When the server goblins cause that millionth of a second network break that induces SQL to projectile vomiting, monitoring lets you know of the problem and allows you to get things back on their feet before a major bottleneck occurs and the off-hours support time starts adding up. The iPad portion of this allows you to remain a part of the world and not be anchored to a desktop at home. Plus you have a usable screen size that won’t cause major eye strain, unlike other mobile devices. Besides, it shows the world that being a geek can be pretty cool.

  • Gary Mazzone

    If we want to talk about Life/Work balance, we all change that opinion as we get older and work longer in the industry.

    When I started I would work 60-80 hours a week and not complain. If the boss wanted more hours sure not a problem. I worked for one company where I was putting in 13-14 hours per day and one day was told it was not enough. We are now expected to work 7 days a week. That is not Life/Work balance that is Work balance. I did it but did cut back from 12-14 hours per day to 8 and started looking for something else.

    I don’t mind putting in the extra hours if needed but not every day. I have gotten older (over 50 now) and have a wife and kid. We should be more Life than Work balance at this point.

    What would an iPad do for me? I would be able to use it to proactively monitor systems and not need to spend that extra time resolving issues at 3 in the morning (and my wife will not be screaming at me the next day for the phone call waking her up also).

  • Ronnie Walker

    Cause – Missing the small tell-tale signs that something is up. These small indications eventually come back to bite us.

    Why – Pressures of day to day work, business as usual means there is less and less time to actively manage the SQL environments.

    How – Enable me as a DBA to be more Proactive than Reactive. That should be the DBA motto 😉

  • Michael

    Select Time24, Reason from ServerOutageLog where (DownTime > 30)

    13:53 | firewall mis config
    02:30 | air-con fail
    15:50 | power cut
    20:24 | power outage
    03:00 | Some prat f####ed with my backup ( disk issue )
    02:15 | Windows update caused crash ( disabled again )
    03:00 | New disk not on perc correctly

    From this lot most things happen when i am not there! Remote ( Tick )

    Our monitor tells us when it’s too late, i.e. it’s down, and I MUST go in to fix it. Proactive monitoring which understands what is happening, and when it’s heading to an issue, means I can remote login and stop it, at least until the morning. Quality of life ( tick )

  • The majority of the problem, I think, is the lack of unit testing because of tight deadlines. The business wants the new shiny features out of the door, ignores the bugs that are raised and decides to fix them later. The end result – the production DBA will have to support it and deal with the direct impact from it.

    Proactive monitoring allows us to see what ‘could’ go wrong and react before it’s too late. Or just react 🙂

  • Geoff

    My SQL servers fail most of the time because I don’t watch them close enough. An iPad and SQL Monitor would give me an easy way to keep an eye on things. No more weekends filled with SQL rebuilds would make me happy.

  • Barry Melker

    Remote Monitoring would be a wonderful way to potentially diagnose problems and fix them or redirect to someone who can. Many of the “outages” that users experience on database servers at my company are the result of blocking caused either by applications or ad-hoc queries. For example our new CRM system has one application function that will cause open transactions to block several other application functions until the end user finishes their work and the transactions are committed. Several times a user has walked away from their desk before finishing their work and their co-workers will start calling support folks about a system outage. I am more than happy to let them know what the problem is and who is causing it; they usually apologize for bothering me and get the user to finish on their own. There are lots of known culprits; if I could quickly identify them and communicate the details, it would reduce the urgency level of my clients. The ability to diagnose problems quickly could save me a lot of personal time and also give my clients better overall service.

  • Mark Johnson

    The two root causes of server outage are time and money.

    Why – No money for additional DBA resources
    – No money in the training budget this year
    – Pushing out new release to 300 clients this year instead of 100 in order to speed up ROI of development
    – IT is consolidating servers and need the databases moved
    – etc.

    Equation:
    Already overloaded + More work + no additional help = less time, more rushing =
    less monitoring + missed early signs of trouble = more problems = less time, more rushing = less monitoring…

    Being able to remotely monitor the systems and be alerted to the early signs of trouble would help break the cycle thereby improving the quality of life.

  • Carla Johnson

    To be very honest, I don’t know what the most common causes of server outages are. If I did, I could anticipate and probably prevent outages at least 99% of the time and I wouldn’t need a monitoring tool to proactively “be on the look-out” for me.

    But since I don’t always know what to look for, and (more to the point) because there are SO MANY things to look for, that is precisely WHY I need a good monitoring tool. I need something to be putting in the work when I’m not at work!

    This would de-stress my life by decreasing the “worry” factor. Knowing that, as certain metrics move out of the normal range, there’s something available to give me a “heads-up” means that I don’t have to think about work until I have to think about work. Having remote access means I can check my servers at the press of a button (or screen) from virtually anywhere! And this can buy me time! If I can see what’s going on, I can anticipate the urgency of the situation and then decide if I need to leave my fabulous night out with my girlfriends and get to my laptop asap, or if I can finish my Cosmopolitan first! That, to me, is balance. Because I’m never neglecting what’s important to me: work AND life.

  • There will always be server outages and problems to resolve, but my hope is to get ahead of the curve and proactively solve the problems before they become critical (which always seems to be late in the night). With monitoring software and some clever systems in place, I can resolve most issues when they’re manageable during daylight hours and before they flare up into all-night emergencies. My beauty sleep would appreciate this! 🙂

  • I would say the most common cause of server outages might be unpreparedness. All is dandy, then suddenly no transaction can be written to the database because last night’s ETL process caused the transaction log to grow like crazy and become full. Or someone is running long-running queries that lock a lot of tables in the database. Sprinkle those with some occasional hardware failures. It is bad enough to experience such problems, but having the users notice the problems before the database administrator makes it worse.

    Without monitoring, it is like firefighting. Being able to monitor the systems remotely, could really help a DBA to be more pro-active. It will allow me some free time away from the computer watching the performance monitor and SQL activity monitor, and use that time to hit the New England snowy slopes to learn how to snowboard.

  • The reasons why most systems fail is “something changed” and not having the right tools to detect the change. As systems evolve, we address what is apparent, sometimes/commonly without understanding the root cause, and though we, sometimes, monitor for some things, poor/absent baselines make detecting change almost impossible.

    This fall my inlaws moved in with us, and recently one has gone to a nursing home (ALS), and the other in the hospital (Parkinsons). Trying to balance supporting my wife, their needs, and waking up multiple times a night just to make sure everything is running on schedule, doesn’t work. A system that I can remotely monitor my servers with, and alert me when something changes will allow me to sleep at night, and be awake enough to play cards with my inlaws.

  • Brian N

    The most common cause of server outage is circuit overload. As time passes, more equipment is added to the data center without verifying the total amp draw of the installed equipment. At least, that’s what is reported in data center research articles.

    My life would improve immensely with the use of a simple, easy-to-use remote monitoring solution. I would be proactive in preventing issues rather than reactive. I can rest at night.

  • Ken Huddleston

    I believe most failures are from people not monitoring their servers. Without knowing a hard drive is about to be full, a network card is causing packet errors, etc., you end up being blindsided by a failure. We are all busy, and any tool that can assist by alerting that there has been an increase in some metric above the baseline is great. But one that would allow me to view that alert on my phone and interact with it to determine if I need to drop everything, or wait until after date night is over, will get my praise.
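
One rough way to picture "an increase in some metric above the baseline" is the Python sketch below. It is only an illustration: the function name `above_baseline`, the sample numbers, and the three-standard-deviation rule are assumptions made here, not how SQL Monitor or any other product decides when to alert.

```python
from statistics import mean, stdev


def above_baseline(history, current, sigmas=3.0):
    """Return True if `current` exceeds the baseline by more than `sigmas` standard deviations.

    `history` holds earlier samples of one metric (at least two are needed
    for a standard deviation); the 3-sigma default is an assumed rule of thumb.
    """
    baseline = mean(history)
    spread = stdev(history)
    return current > baseline + sigmas * spread


# Hypothetical samples: percent of a drive used over the last five checks.
disk_used_pct = [40, 42, 38, 41, 39]
print(above_baseline(disk_used_pct, 60))  # a reading far above the usual range
print(above_baseline(disk_used_pct, 43))  # a reading within normal variation
```

The value of a baseline is that the alert adapts to each server: a reading that is routine on one box can be a warning sign on another.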

  • Holland G Humphrey

    At our shop, the most common reason for server outages is network problems. Our servers (all 200+ of them) are fairly stable now, but our network has some problems.

    When I’m at home the scenario is: my blackberry alerts me something is wrong with X server. I get out my laptop, boot-up, login, connect to the secure vpn, open outlook, then open SSMS (this takes 10-15 minutes each time, and this is time that could better be used for something else). Often times, when we have a network problem, by the time I get logged-in, the network problem has disappeared and I just wasted 15-20 minutes of my family time to look into the issue.

    With an iPad, I would be able to respond much quicker which would allow me to diagnose if the problem is with my server, or with the network. Man, this would be a dream, and I’d be a hero at home.

  • Nikhil S

    Common outages: Depends on how many different kinds of features you are using.
    1. Simple OLTP using just the SQL Server service: You can get blocking, deadlocks, database crashes (less often).
    2. Above setup plus SQL Agent: Jobs failing, credential problems.
    3. Above setup plus replication: Network connectivity between publisher and subscriber, non-desktop heap memory problems.
    4. Above setup plus cluster: some node crashing.
    5. Insufficient disk space.

    The superset of problems depends upon one’s usage of SQL Server features, but over a period of time a DBA gets to know all the common problems which occur or can occur in his/her organization.

    Remote monitoring tool:
    What I would like to see is a remote monitoring tool which also gives me some interface like ssh/telnet that I can use to fix common problems. I can have predefined scripts to fix these problems, which I run using this remote tool.

  • AndyG @DBA_ANDY

    The most common cause of server outages is lack of proper planning and management support for support costs (including tools); remote monitoring greatly increases the quality of life of the DBA because it means they are not chained to their desk. In our current world I receive email notifications of problems but then I need to VPN/RDP into my work PC and then RDP to the server to see what is going on – and often by the time I get there the problem has passed. A tool like SQL Monitor would greatly improve my effectiveness.

  • Kevin Broughton

    In my experience the most common cause of SQL Server outages is lack of time and attention to detail, but time is a precious commodity of which few have enough. SQL Monitor on the iPad provides instant-on, always ready access to servers, so it’s easy to fit monitoring into the little blocks of free time that would otherwise be impossible to use with a traditional computer-bound tool. SQL Monitor + iPad = Freedom!

  • Mladen Prajdic

    I wouldn’t use an iPad to monitor anything.
    I’d use it for play and relaxation, because I’m already awesome at proactively and reactively monitoring everything. 🙂

    Ok maybe an occasional remote desktop into a server or two…

  • Dave Thomas

    Outages: flaky WAN; runaway query consuming tempdb disk space; bulk data import/export job running too long and forcing OLTP data buffer purging; incompetent consultant developed SQL2000 TSQL doing something stupid like invoking sp_OACreate without a corresponding call to sp_OADestroy, memory leaks -> MTL starved -> no connection possible -> strangle consultant (major application upgrade planned, dumping consultant, blood coming off the boil).

    The first of the month is critical to the business. So much so, I prefer not to take any leave on the first of the month. Sure I have set up email alerts to my Blackberry, but these are reactive. Having the ability to get proactive away from the office…At the playground with the kids, no problem. On the old steam train with their Grandpa in the driver’s box, no problem. Early morning at the camp site, before a day at the beach…well you get the picture!

  • It Always Starts With Something Simple

    Most common causes are usually things like logs filling up, scheduled jobs failing and not handled properly and dependencies cascading, or bad code which gets deployed.

    If We Had More Time, This Never Would Have Happened

    Why?: Usually ignorance – not knowing baseline behavior, not enough notice about jobs failing, failures reported in too many places to keep on top of them, lack of good proactive design/code reviews or tools. Why?: Usually lack of time allocated to putting those things in place. Why?: Deadlines, productivity pressures, budgets – and these aren’t going away.

    The Cause of the Problem is Always the Last Place You Look

    Being able to get into systems remotely is nothing (we’ve had RDP for a long time) without a tool like SQL Monitor which brings relevant information together (including history!) and gives you visibility to troubleshoot problems – debugging is always stressful and if it’s supposed to be non-work time, it’s doubly stressful.

    Get Off the Death Spiral of Just Putting Out Fires

    The improvements to the monitoring and troubleshooting process and improved uptime and overall quality would allow me to spend more time with development teams getting through the Five Whys – improving the databases and applications even more instead of just putting out fires – and that virtuous cycle would lead to better peace of mind, less stress and better balance – not just for me but for everyone on the teams I work with.

  • Jason Saltsman

    Humans cause server outages. Be it through negligence, ignorance, or just being stupid, we are the problem. Primarily I would place the blame on negligence. Too often we are busy with other tasks and don’t notice the memory is racing out of control due to poor programming. If the server goes down, it is NOT the developer’s fault (even though I will blame him). If I had proper monitoring, I could have caught it and saved the outage from happening. If I could use an iPad to remotely monitor at all times, well then I CAN neglect it…as long as I pay attention to the iPad!
    Having a device like the iPad would allow me to stay connected in a way that my smartphone just can’t do. The 3G access gives the opportunity to be truly mobile. I think that being able to monitor my SQL servers with RedGate from the beach has got to be the ultimate pinnacle in geekness. Also, my girls will love playing Angry Birds on a big screen!

  • Andrey Lipatkib

    Well, here outages are mostly about network failures and developer mistakes. As we don’t have a 24×7 system administrator and some of our team members are working from home, we have troubles sometimes. If I had an iPad, I’d be able to see a problem immediately when it happens, let everyone know about it and fix it remotely when possible.

  • Hardik

    I believe the common cause of server outages is improper monitoring and not having enough tools to identify those problems well in advance, especially where the company/management is not willing or cannot afford to invest money in monitoring tools like SQL Monitor. If we had a proper monitoring tool like SQL Monitor, then I would be able to identify those nasty problems in advance and solve them remotely. That means I could spend more time with my family, get more sleep, and take vacations rather than always going out to the office to solve issues in the night.

  • Times are still tough. Departments still don’t want to increase storage when it’s sorely needed. The most prevalent issue I’ve dealt with is running out of disk space. This can and should be managed ahead of time, alas, without the proper resources (more space!) no amount of tuning/configuring is going to solve it when you’re just plain out of space on the server. When you need to play file tetris to keep your apps running, an iPad and the RedGate monitoring tool would make this possible anytime, anywhere. I could relax at Sunday night dinner with the family, watch my son’s sporting events and take my daughter to the museum – all thanks to technology!

  • Robert

    Like many companies, my company is running 24 hours a day. And unfortunately, I am the only DBA. The combination of tools from RedGate and Apple would be an asset for a balanced work and family life. I am the father of 4 wonderful children. I sit on several executive committees and, in addition, volunteer regularly. Not to mention my wife, my parents and my friends … Free time is rare.

    Look, act and prevent: the keywords for providing data at all times to my users. I am applying best practices, but an iPad would be available to solve the little unexpected problem in the middle of the night, or on the weekend on the road, at the ski center or the arena.

  • John Stafford

    There is never enough time for a DBA to do everything they would like to do with their SQL instances, so anything that can make us more efficient has to be good. And if the DBA spends less of their home time dialled in to work they will be in a better state to face the next working day.

    On the odd occasion that something is amiss, SQL Monitor will be able to alert me early so I can take preemptive rather than reactive action, and my iPad will enable me to connect faster than ever before.

    And, don’t forget, there are at least 29 other people out there using an iPad and SQL Monitor that we can swap ideas and best practices with – perhaps the start of a new PASS Virtual Chapter?

  • Most of my responsibility for monitoring SQL comes from ERP applications my company installs for clients. Some clients have a very close relationship with us and we use generic monitoring tools for several of their servers while some clients only call us after they have tried their best to alleviate any issue. I think the major cause of outages is user action, usually due to lack of knowledge.

    For some clients, the “fix” they apply to a slowdown is a reboot (sometimes erasing evidence which may have led to the real problem and solution). Sometimes the “fix” is to apply some modification found via Google which may or may not relate to the real issue. I usually get the call after what they have done doesn’t work or makes the problem worse.

    A true SQL monitoring system would help point clients and myself to the real issue before end-user complaints force the client to make rash and uneducated decisions.

  • Russ Priesing

    Throughout my experience the biggest causes of outages have been disk related: space and improper configuration, with a topping of bad queries causing high I/O. Being able to monitor remotely (at the kids’ recital/game/etc.) and gather information on what’s happening to the server at any point, and having the means to address an issue in a timely manner, is important to minimize customer and management calls. This also allows us to spend more time with family and friends instead of hardware and software.

  • Steve Cusick

    The most common cause of server outages is mistakes (lack of knowledge) made by off-hours support staff. Having the ability to remotely monitor my servers’ health through the evening and foresee problems that are sneaking up and could come to fruition through the night will allow me to have proper action plans in place, which in turn allows me more time with my family and a full night’s sleep, thus improving my way of life tremendously.

  • Steve Wunsch

    The common denominator in every outage that I’ve been involved with is that it could have been prevented.

    In my own (limited) experience, there hasn’t been one common cause of outages. Once I recover from an outage, it’s pretty easy to justify spending the time to create tools & procedures to prevent that particular problem from occurring again. Over the years, I’ve built up a toolbox of scripts, applications and checklists that help me monitor for a variety of red flags. Yet, I still spend a lot of time in the “toil and drudgery” of proactively monitoring my servers.

    That’s because, ultimately, the world is not a perfect place. Things that I haven’t planned for, or given thought to, are bound to occur.

    That’s where a monitoring tool would come in really handy. Red Gate’s SQL Monitor would allow me to be proactive while freeing me from much of the repetitive drudgery associated with monitoring.

    Instead of running through checklists on my laptop, I could spend my Saturday mornings making pancakes for my wife & daughter (while glancing appreciatively at the SQL Monitor dashboard on my iPad).

  • Meir Dudai

    Dan is a big guy, and a bit clumsy. He works in our company and is responsible for 90% of our system outages. Whenever he walks next to the server in our office, he somehow accidentally manages to pull the plug and our portal DB goes down.
    He is a great guy, but the thing is that he really doesn’t have anything to do near the server. It’s just that he has nothing better to do so he hangs around.

    Personally, I have an iPad.
    If I win an iPad, I will do the only logical thing to resolve our outage issues: give the iPad to Dan. That way he can play “angry birds” on the iPad instead of hanging around and pulling cables.
    Plus, I will be able to install “SQL Monitor” on my iPad, I’m dying to try this one.

    If I win, I will not only send a picture. I will shoot video footage of a live “Dan incident” as he goes to the server room, accidentally pulls the plug, and I can immediately monitor it and get an alert (while playing angry birds!).

  • Tim

    There once was a developer named Bill
    His transactions and cursors made me ill
    Things would crash when I slept
    Documentation was not kept
    But through my iPad I could send the command kill

  • In our environment, hardware failures cause most server outages. I often learn of such outages through SQL Agent alert emails that I receive on my phone. Unfortunately, the alerts themselves rarely tell the whole story and tend to arrive while I am out of town at my son’s gymnastics meet or otherwise unable to do much other than worry about them until I get home to my laptop and connectivity. If a free iPad with 3G connectivity coupled with a monitoring tool such as SQL Monitor would enable me to easily pop in to diagnose the exact problem and either resolve it myself or pass along what I found to the server engineers so that they can address it and I could get back to enjoying the meet, then I am sold.

  • Where I work there is an invisible line that divides up our servers. We have limited access to monitor our servers. With SQL Monitor and a remote platform to use it on (iPad), I could sit in the front row at my son’s Christmas concert instead of the back (in case the phone rings with an ‘urgent’ issue). Life gets greater weight in the life/work balance, since I can identify and correct issues while they are small.

  • What brings our servers down? Running Profiler with Show Plan event.

    Now, that was me…HaHaHaHa.

    Having a “consultant” run PSSDiag while the CPUs are above 80%.

    Monitoring tools are best at keeping cumulative statistics to show upward trends. That is the best part.

    Second, is alerts – Job failing, PerfMon statistics and increases in query or job times.

    Index fragmentation is another good one. Alerts for SQL Server Agent failing. Failed logins. Failover to another node on cluster.

    But, if it could predict a lot of this stuff, I could fix it while working 8-5, rather than getting alerts in the night. That would be perfect.

    God Bless,
    Thomas LeBlanc
    TheSmilingDBA

  • I think the most common cause of a SQL Server outage is that the server runs out of resources: disk space, memory or CPU. If you have a large environment and no proper monitoring solution (or a solution which “looks” at the servers in a more generic way, e.g. just looking into event logs or only alerting after a threshold was reached), it is quite difficult to keep an eye on every server. So you have to focus on the systems which are more critical than other systems. First of all you HAVE TO start with getting performance baselines of your servers and review these on a regular basis. Having a proper monitoring solution like RedGate SQL Monitor makes it a lot easier to keep an eye on your systems. It is not just that you can get notifications when a threshold is reached, NO, you can proactively react when a system starts to consume more resources. If your monitoring solution also provides historical data, it is much easier for you to compare the actual performance with your first baseline.
    Being able to take a look remotely saves you time (e.g. you do not need to “crawl” through many events every morning) because you can quickly take a look whenever you want and react faster. Therefore you will have more time for your hobbies, you improve the reputation of the database team and you have the good feeling: Yes, I have fixed the problem, or better: YES, I could improve the system even before a problem occurs.

    Regards
    Dirk
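The baseline idea in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `drift_from_baseline`, the three-sigma rule and the sample CPU figures are all made up here, and nothing below reflects how SQL Monitor itself works.

```python
from statistics import mean, stdev

def drift_from_baseline(baseline, recent, sigmas=3.0):
    """Return True when the average of the recent samples sits more than
    `sigmas` sample standard deviations above the baseline average.

    Hypothetical helper: it flags upward drift relative to a recorded
    baseline instead of waiting for a fixed hard threshold to be hit.
    """
    base_mean = mean(baseline)
    base_sd = stdev(baseline)  # needs at least two baseline samples
    return mean(recent) > base_mean + sigmas * base_sd

# Made-up CPU percentages: a quiet baseline week and a busier recent window.
baseline_cpu = [22, 25, 24, 23, 26, 24, 25]
recent_cpu = [38, 41, 40]

if drift_from_baseline(baseline_cpu, recent_cpu):
    print("CPU is running well above its baseline; worth a look")
```

The point of the design is that the alert depends on each server's own history, so a server that normally idles at 25% gets flagged at 40% even though no fixed threshold has been crossed.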

  • With the proper monitoring tools & checks for SQL metrics & resource consumption in place, the majority of server outages should be caused by things external to the SQL engine itself.

    Of the external issues, the most common causes of the outages I’ve encountered have been the result of hardware problems, network issues, and power failures.

    Being able to rapidly and remotely check on the health of my systems can’t solve all problems, but it gives me peace of mind that I can proactively and reactively identify issues without being glued to a desk, and peace of mind leads to improved quality of life.

  • Robert Biddle

    I’ve been really wanting to get my hands on SQL Monitor. I actually mostly use tools which I’ve developed myself to do a lot of my monitoring but they fall short in a lot of categories. I think SQL Monitor supplements what I currently use very well and would be a great addition to my monitoring process.

  • Natalie Briscoe

    Remote monitor. Absolutely no benefit to my quality of life at all.

    Well apart from mocking the DBA on call, when I get in the next morning, and I knew what was wrong before them. Thank the CTO for 3 8 hour shifts!

    Oh hang on. remote from my desk over being in the meat locker, they call the server room freezing my T@#s off!

    However what monitoring will really bring me is the ability to set thresholds that are different to the “corporate” standard, I have strict thoughts on some and don’t care about others. Don’t tell my boss, but the real reason this would be the best result for me is there are some parts I am not that good at. Defaults that are sane out of the box, and descriptions from a company that knows sql server upside down. As they say, priceless.

    Oh, and an iPad – the games I can play when I am away from the office, there is the quality of life bit.. 🙂

  • Andreas Kartawidjaja

    The most common cause of server outages would be a combination of bad db/system/server management and lack of monitoring. Monitoring your system is the proactive approach to resolve some of these outages. You would be able to easily pinpoint/identify the part that requires attention. This would make sure that the server is at tip top shape and runs at optimum performance. Avoiding these outages would spare you from the effort required to do cleanup and stress/pressure from company/client and thus improving the quality of your life.

  • I have a long commute to work, a combination of metro rail and buses. It’s 2 hours a day that I could use to get ahead on my dba tasks, but it’s just not practical to fire up a laptop while standing on the bus. Enter the iPad. I’d love to win the iPad! It would change my life. I’d have an extra hour or two a day where I can get a jump on any issues in production. I’d be able to focus more on strategic work once I get to the office and hopefully leave for home one of these nights while the sun is still up. 🙂

  • The most common cause of server outages is lack of thinking. Specifically, believing that you can install a server and set it and forget it. There are far too many SMB and departmental servers running out there with no one in charge of them. Ever so slowly their C drive fills up with log files or the server is reporting error messages that go unnoticed. A tool like Red Gate SQL Monitor is great because it encourages thinking about the server and makes it interesting to monitor it, ensuring that it does get monitored.

    For the person who has DBA responsibility this improves their life because they can either monitor servers remotely, or enable SMB or departmental users to take on more of the monitoring duties.

  • Dennis McMahon

    In a nutshell, quality of life is instantly improved by taking away constant worry.

    In my mind it’s a real negative on the quality of life scale when I have a good bunch of folks together in my workshop on the occasional sleepy winter Friday evening, beers in hand or on the table (and maybe a few down the hatch), and I have to say every half hour “hey gang, put down your drawknives for a minute, I have to go upstairs to check my servers”. And sometimes I may stay away for an hour or so and people start wandering off into the night wondering what happened to their host. This is especially not good in the middle of a tense tillering and ‘first pull’ session on a new selfbow or nearing the end of the sparge on a fresh batch of homebrew.

    I have to admit to being only a part-time contract DBA for a mid-size company that can’t (or refuses to) hire a real DBA, and what’s important to them is that a series of jobs runs successfully every evening and kicks out a set of nice reports to land in all the top floor folks’ inboxes every morning. These jobs for one reason or the other fail intermittently or the SQL Agent has stopped running and the end result is that I have to go and fix it – this is the most common SQL Server outage for me.

    The most frustrating part is that I have to physically go out to my office to check on things on a regular basis just to have peace of mind that things are chugging along smoothly. How I see a combination of tools like remote server monitor software and a small connected device to display it adding to my quality of life hugely, is simply the ability to take this toolset with me into the shop (the environment is not conducive to larger devices), hang it on a wall, and glance at it occasionally to see how things are ticking. In this manner I would be able to see what has stopped or what might stop the Agent and be ahead of the game. Also a big factor is being able to see immediately if something needs to be attended to right away or can wait until after funneling the boil into the carboy.

    Now if such a heaven-sent toolset were to fall into my hands, I would be left with the challenge of choosing the best material to make a protective cover for the device – buffalo hide or otter? Snakeskin or gator? Too many choices…

    Cheers and Merry XMas to all.
    Maddog

  • Dave Dustin

    The most common cause of server outages is an undiagnosed issue that hasn’t raised its head, or if it has, it wasn’t in a way that caused problems in a repeatable or identifiable manner.

    Quite often, the issue isn’t present in the test configurations (Development, QA, testing etc…)
    It could be something as simple (and complex) as the day the server was built (which hotfixes were installed in one hit, and which were installed later on) or the order in which applications and extensions were installed.

    That issue becomes a problem when something innocuous is performed on the server, such as applying a hotfix to fix another issue. It works fine in dev and in the labs, but causes CLR assemblies to be unloaded all the time due to memory constraints.

    But this sort of thing can’t really be solved by monitoring. Ok, it can be used to help diagnose the issue because you can more accurately see over time the status and configuration of your environments, and the impact that a change to a system has.

    You monitor a series of boxes. Apply the change to one of them and note what happens with the stats vs. the other boxes. Do the next one. Do the changes on that box match the first one? If so, yay. If not, then it’s time to dig deeper and see if there is an issue. Maybe risk updating a third box and see if it matches either of the other two. It can help narrow down which box needs closer attention.

    All of this means that you can reduce your stress levels when planning upgrades, deployments and roll-outs, because you have tools in your arsenal to help quickly identify if something has gone tits up.

    And less stress at work, means a better life at home…
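The compare-one-box-against-the-others approach described above could look something like the following Python sketch. Everything in it is an assumption for illustration: the function name `changed_metrics`, the 20% tolerance, and the metric names and values are invented, not taken from any monitoring product.

```python
def changed_metrics(patched, controls, tolerance=0.20):
    """Return the metric names where the patched box differs from the
    average of the unpatched (control) boxes by more than `tolerance`,
    expressed as a fraction of the control average.

    Hypothetical helper for spotting the side effects of a change.
    """
    flagged = []
    for name, value in patched.items():
        peers = [box[name] for box in controls if name in box]
        if not peers:
            continue  # nothing to compare this metric against
        avg = sum(peers) / len(peers)
        if avg == 0:
            # Avoid dividing by zero: any non-zero value is a change.
            if value != 0:
                flagged.append(name)
            continue
        if abs(value - avg) / avg > tolerance:
            flagged.append(name)
    return flagged

# Made-up stats gathered after applying a hotfix to one box only.
patched_box = {"cpu_pct": 35, "clr_unloads_per_hr": 12}
control_boxes = [
    {"cpu_pct": 33, "clr_unloads_per_hr": 1},
    {"cpu_pct": 36, "clr_unloads_per_hr": 1},
]

print(changed_metrics(patched_box, control_boxes))
```

With these sample numbers the CPU figure stays within tolerance while the CLR unload rate stands out, which is the kind of difference that says this box needs a closer look before the change goes any further.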

  • The most common cause of outages is human error – typically induced by being tired and off my game (stress doesn’t help this either, of course).

    The thing that I could really use monitoring would be my blood pressure and heart rate, as well as all the usual technical metrics like disk space. I’m sure that monitoring remotely would help this too, and being able to have extra confidence in the health of systems would help me sleep better, and soon I’d be at the point of scheduling stressful work for when I’m at my most relaxed and rested times, and the cycle would continue to help me lift my game and keep it up.

    Oh, and I think if I could monitor my level of contentment, it would probably show Murphy’s Law to be true – that things always go wrong at the most inopportune moments, those times when I’ve just got into a relaxing bath, or managed to find a babysitter, that kind of thing. But hopefully that would make me laugh, helping reduce my stress too!

    Rob

  • In the world of blind men, the one-eyed man is king. So it is with most DBAs out there. We are the people that “do something” with data, and try to get enormous budgets for our very very big servers. Most of the time management doesn’t like this, especially when it is considered that a very big server can do almost the same as a very very big server.

    To me, using inferior hardware for big databases is the most annoying part of my job. And that is where being able to monitor a server remotely comes in very very very handy. How many times have we, the DBAs, gotten an email or phone call late at night saying that this or that server is running slow, and they really need our help, with nothing in return? So you get into your car, drive to the office, walk to the server room, start up an old monitoring system, and wow, someone ran a query that was locking too many tables and therefore nobody can do their things…. Happened to me, happened to you, we all know what a life like that is.

    A life where we are DBAs 24 hours a day is normal. A DBA is responsible for all data, but we wanna drink a beer once in a while too! We are also humans! And when we can monitor our processes remotely on a nice tablet, that would be the best part. Hanging around in a local bar with your girlfriend, your beer and your iPad with a SQL Monitoring tool installed on it.. that would be life pur sang!

  • Fahim Ahmed

    Server outage is like getting a RED traffic light on the road. Before the light goes Red, there is a warning period i.e. the amber light. The MOST common cause of outage is failing to see the light going from green to amber and then completely ignoring the amber light.

    My current monitoring using alerts and emails involves manual interventions. SQL Monitor on iPad will allow me to see and respond on the warnings hence avoid the server downtime. I can spend more time with family rather than fixing the server.

  • I think the worst type of server outage is loss of basic resources – CPU pegged by an app outside SQL, HDD swallowed up by transactions etc., rogue app (IIS anyone?) taking all RAM. SQL Monitor will alert me the resource is low and show me what took it and when, so I can call the sys admin guy and make him trek in to the office to fix it.
    iPad+SQL Monitor = DBA with better homelife

  • Technically I don’t watch for server outages. But I do need to monitor our ETL process overnight. And the biggest cause of ETL failures is lack of testing. On the surface it looks like the ETL failed because of some outrageous value in the data or because of formatting issues but we need to write our code to handle that sort of thing elegantly and the only way we can do that is to test, test and test some more!

  • Gary L

    On the day before Christmas vacation, my network admin gave to me…
    A lame joke about a “precious” GOLDEN RING….as he tried to cover up (ba da ba bum)…
    Four misapplied firmware patches…
    Three failed SQL jobs…
    Two dead UPSes….

    And, that’s why I need one iPad RedGate SQL Monitor for me!

  • The most common cause of server outages is the volume of work we face. It could even be said that outages lead to more outages by stealing time that would have been spent doing the things that prevent outages. A good monitoring solution saves time on tracking down things that are about to cause a problem. Taking it a step further, being able to monitor systems remotely would allow a DBA to use the wasted time in their day, like waiting for an oil change, to free large chunks of time for things they enjoy doing.

  • Roy Ernest

    The most common cause of a database going down is hardware issues. If you have full control over the database, you can always keep an eye on what is being released. (At least in my case, no one other than me and my junior DBA has access to the SQL Server.) This helps me in controlling what goes to my production environment. That does not mean it is foolproof. A simple stored proc can be called in a loop 1000 times a sec to bring your DB down to its poor knees. But the chances are low for that to happen.
    But you have no control over what happens with the hardware. A SAN can fail, or IO controllers can fail; these are a couple of the HW issues that you are likely to face.
    What can you do to prevent that? Nothing. The only thing you can do is react at a quicker pace. How do we do that? Monitor the server (both HW and SQL Server) and set alerts. It is always better practice to have two levels of alerts. One would be a warning and the other would be the fatal alert.
    How do I manage my life and work? Simple. I try to forget about work as soon as I get out of the office. I have a trick for that. Every day after work I stop somewhere and have a nice cold beer and relax. Then I go home to my wife and kids. I can go out with them and relax and have fun. I am free to relax till an alert is triggered. At that time, if I have a monitoring tool that I can access easily, I can take a look at what is happening and then decide whether I should drop everything and fix it or just make a note of it and fix it at a later point. This is where a monitoring tool that can be accessed from a handheld device is going to be helpful. Hint Hint an iPad.

    Roy (@RumblingDBA)
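The two levels of alerts mentioned above, a warning and a fatal alert, can be sketched as a tiny Python function. This is a minimal illustration, assuming a metric where higher is worse; the name `classify` and the 80/95 thresholds are invented for the example and are not settings from any real tool.

```python
def classify(value, warning, fatal):
    """Map a metric reading to "ok", "warning" or "fatal".

    Hypothetical helper: assumes higher readings are worse and that
    the warning threshold is lower than the fatal one.
    """
    if warning >= fatal:
        raise ValueError("warning threshold must be below fatal threshold")
    if value >= fatal:
        return "fatal"
    if value >= warning:
        return "warning"
    return "ok"

# Made-up disk usage percentages checked against 80% / 95% thresholds.
for used_pct in (70, 85, 97):
    print(used_pct, classify(used_pct, warning=80, fatal=95))
```

A warning is the "make a note of it and fix it later" case, while a fatal alert is the "drop everything" case, which is the decision the comment describes making from a handheld device.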

  • Andy Wolfe

    Info overload
    masks patterns and hides problems
    that have quick fixes.

    In last five days, a
    bad SAN, low space, failed jobs —
    Missed until too late!

    Remote monitors
    With exception reporting
    Would help me zero in.

  • In my experience the most common outages of database servers are a full tempdb disk, or table locks caused by nightly data-loading processes.
    In the PAST, monitoring or fixing database failures meant going to the company where the servers were to analyze and fix the problems.
    In the PRESENT, to monitor the databases I can use SQL Monitor and, if some intervention is necessary, I can connect remotely via RDP from my home. Thanks, technological development! Now I can spend more time with my family in our home sweet home!
    In the FUTURE (which I wish will come soon), I can monitor the databases via SQL Monitor and make interventions from my iPad (which I’m going to get in this promotion). Or better, I can “take a walk” with my family quietly … because I’ll be accompanied by my iPad and SQL Monitor!

  • On our systems, outages most often equate to slow-or-no-performing application code. One particular mission-critical app comes to mind; the vendor’s programmers are, well… learning good database programming. While they learn, I do as much performance tuning as I know how. Problem there is, I’m learning, too. 🙂
    Working for a nonprof, I’ve always had to get pretty creative to get the tools or hardware I need. I don’t have a dedicated monitoring system, except what I’ve scratched up on my own. I watch alerts, keep a database of wait stats, use DMVs to look into issues, etc. Unfortunately, most of that is reactive. If I had something like SQL Monitor, perhaps I could cross over to being a proactive dba. Then I could proactively avoid an outage or other issue while at work, rather than reactively fix it while at home.
    And having an iPad would sure help me read that perf tuning book. There are so many visual aids that just don’t look good on my Android’s 4″ screen!

  • Steven Cush

    Hands down, people not thinking is the most common cause of server outages. I have, unfortunately, seen data centers go down due to this problem. In a business environment, people are under tons of pressure to do a lot – with a little time. They want to go home, or they need to move onto the next issue. As a result they don’t fully vet their solutions – and then the fun ensues when the changes occur.

    Being able to remotely monitor – especially when these changes take place – means that one can be:
    * at a child’s basketball game,
    * spending quality time with the family,
    * enjoying the beach,
    * going to the dog park with the dogs,
    * or just decompressing around the house
    while the changes to the work environment happen vs what happens today – being at the office – or stuck behind some desk tied to the internet – to do this coordination.

    So by being able to remotely monitor, one’s own and one’s family’s quality of life improves, while the human risk factor in software is decreased – and perhaps best of all – the customer’s experience is improved since work is a quick tap away.

  • Stephen Dyckes

    The majority of outages are caused by lack of proper monitoring and notification. SQL Monitor is just the tool to provide the monitoring ability, while an iPad is just the device to allow me to monitor my servers from around the world (or my back yard, Barbeque anyone?)!

  • Chris Gallelli

    In the spirit of the holidays, I am going to say that server outages are most commonly caused by HARDWARE. I know… people buy the hardware, put exceptional load on the hardware and sometimes do really stupid things with the hardware but it is nice to have a scapegoat.

    The reality, however, is that we need to closely monitor the hardware and the things that use it so that we can effectively prevent that “bad old” hardware from causing us grief. The trick is making the monitoring easy for the person doing it. Remote access and the trials and tribulations of network access are a challenge… period. A solid tool like SQL Monitor and its Web interface combined with an IPAD sounds like just the right combination to make my monitoring life a lot easier!!!!

  • From my experience in a small shop (one DBA and 50 developers located around the world), I would say the biggest reason for outage is what I would call neglect. Neglect in that operating procedures/best practices are not followed by the development team and not enough time in the day to do everything I know and need to get done in a day to protect my servers and properly maintain them. These two are really the root cause of all other issues such as high cpu utilization, log files filling up and running out of disk space,…well you get the picture. In this state of affairs, the scale of work/life balance is heavy on work with some time left over for life.

    Will the iPad and SQL Monitor fix that? Not entirely, but it will help. How? It’s a process and a chain of events. Ok let’s suppose I have a new tool (say SQL Monitor) that helps me do more of what I need to do and this starts a chain of being more proactive, being able to access the servers quicker and wherever, and have historical data on what happened. This new free time can now be used to spend time with the family (my lovely wife and 4 yr old daughter) and to work with the developers on performance training initiatives and better understand the consequences of their actions as they can learn from the Root Cause Analysis that I have to do after something breaks. The family is happier that I don’t have to run back home or to the office so I can finally see an entire dance recital. Now, the developers are better educated and more knowledgeable and this can build upon itself and create more enjoyment for themselves and our customers. So it starts a win win for everyone involved and the scale of work life balance swings more towards life work balance.

  • Norman Kelm

    Mistakes and failures, they’re our biggest bane.
    SQL Monitor, that’s what keeps us sane.
    Monitoring from home or on the roam can quickly ease the pain.
    Copious free time, the benefit we all gain.

  • I think the most common cause of database outages is not implementing all recommended preventative measures, and also failing to address, in a timely fashion, the notifications of issues from the preventative measures that were set up. If an issue is allowed to fester, then simple methods of resolving it may no longer be possible and the issue may escalate to an outage. Being able to monitor my systems remotely would allow me to review the serious/critical alerts in real time and address the issues that require immediate attention.

  • Personally, I think it is not understanding your systems in a manner that leads you to proactively manage them instead of reacting to the problems after they cause an outage. From my research, SQL Monitor combined with an iPad gives me the greatest opportunity to learn my system in a manner that makes me a better DBA. It also would help me proactively tweak my servers and game plan so that I have the time to be a better husband, father, and person. I believe these two tools would also allow me to be a better evangelist for SQL and the community as I attempt to give back what others have given me.

  • Richie Rump

    What I’ve experienced is that a lack of communication between the different IT groups is a major factor in system outages. I’ve seen the network team deploy a Windows patch to a database server that completely brings down the server. If the database team had been aware of the patch, they could have monitored the situation more closely and fixed the issue before the users showed up for work the next day. I’ve also seen application teams deploy patches that bring down the database as well. If we can get everyone in a room and discuss the system as a whole, we can support the system and our users a whole lot better.

    Of course, if the system did go down for whatever reason being able to whip out the iPad and fix the issue from my daughter’s soccer game or at the movie theater with my wife would be phenomenal. If the system were to go down now I would have to stop whatever I was doing, drive home, boot up, log in and then fix the issue. But with the iPad that would all go away and let me focus on what’s truly important, my family. After all I don’t live to work, I work to live.

  • Server outages are caused by humans
    Humans aren’t to be trusted and must be monitored
    Remote monitoring would allow me to spend less time at the office monitoring people and more time monitoring what really matters, my teenaged daughter!!

  • I would attribute “lack of pro-activity” to be the most common cause for server outages. To be pro-active, I would get a good monitoring tool to obtain periodically an understandable server health status even while I am going high on life. With an iPad & SQL Monitor in the equation, it would take me a step closer towards becoming a truly-complete-DBA…connected with family & friends and still never away from my servers!

  • Gianluca Sartori

    The key to a successful life/work balance is prevention: you have to keep your servers away from trouble, exactly how you would do with your children.
    There’s only one way to achieve this, and it is always being one step ahead of problems, and seeing dangers far before they start to harm.
    SQLMonitor on the shiny screen of your brand new iPad can grab your attention when the counters exceed the threshold YOU decided, not when your boss yells in your phone at the restaurant on your wedding anniversary.
    I don’t know about you, but I hate to hear things like “Sometimes I ask myself if you’re married to me or to your boss”.

  • Pei Zhu

    I think it is simple for me.
    Average rate (total compensation / total hours) should be close to your expectation.
    Average working hours/day should be the hours/day you want to work; for anything extra, you should be paid at 1.2+ * average rate, and you should have the option to reject the extra work.

  • Sylvester Carstarphen

    I understand that my job requires a certain level of after-hours support. I generally don’t mind receiving phone calls for hardware failures. The common server issues that seem to interrupt my life/work balance the most appear to come from human problems instead of machine problems. Most of the time, I’m stuck dealing with a poorly implemented process, from a developer or a DBA, that’s utilizing more resources (CPU, memory, disk space, causing blocking, etc.) than it should. That process affects users and my phone rings. Other times, I have to prove to management that everything is OK on the database server(s): the database system is running normally, the Batches to CPU ratio is the same, memory consumption and usage look normal, disk IO looks like it does every other day, and so on. Since the database is always blamed for all issues, I am often interrupted for no reason. Every issue, even when it’s not the database, takes so long to prove or disprove when you have to pull out your laptop.
    I own a boat in Destin, Florida where I spend time hanging out with family and friends grilling and making frozen drinks with my Margaritaville. You do not know how often I have to pull out the laptop as soon as I launch the boat or get underway to solve an issue or prove that there are no database issues. With an IPAD and SQL Monitor software, I can stop being the party pooper. I will quickly be able to review all of my systems before we launch the boat and know that everything is OK. With an IPAD and the SQL Monitor software, I can quickly respond to management and support phone calls when asked questions surrounding database issues and only dial in if necessary. Quality of life comes not only from physically participating in the activities you enjoy the most, but also from mentally being able to participate in those activities. Being there with your family and friends and wondering if your systems are running correctly is just as bad as not being there at all. An IPAD with SQL Monitor software will allow me to visually see what’s occurring on my servers while doing the things that I enjoy the most, ultimately allowing me to relax and really enjoy my family time.
    Picture this: me sitting on my boat, anchored in the bay, watching the sunset with family and friends. I’m enjoying some freshly grilled food, sipping on a frozen drink, and monitoring the weather on my Garmin Navigation System and my servers on my IPAD using SQL Monitor. I’m smiling just thinking about the opportunity.

  • RedRT

    When things go wrong they bring me in. I am the closer.

    I believe that WE cause the majority of our outages. WE improperly configure our server environments and over time these systems will fail.

    I am lazy, I don’t like getting calls when I am with family to go look at something. Any tool that will help me spend more time with family is a WIN. I am excited to use my new iPad with Red-Gate SQL Monitor to do just that. Helping me be more proactive to identify potential issues before they become the next outage.

    You are a lazy
    Look over your shoulder I must
    to keep my paycheck

  • The most common cause of server outages is disk resource issues. Disks fill up, fail, or are misconfigured. These outages tend to happen at the most inconvenient times like dinner, when you’re out and about, or sleeping. Remotely monitoring and addressing these issues saves time for you and your customer and allows you to live your life.
