This question comes up constantly in different venues. I see it sometimes 2-3 times a day on SQL Server Central. I know it pops up at least once a week on Ask SSC. I’m sure I’ve seen it on Twitter #sqlhelp. “Why is my log file growing?” and “Why is my log file full” are heard regularly. Or, the variation, “I ran a full backup but the log file is still full/growing.” occasionally comes up. The people asking these questions are frequently, even appropriately, frantic. I’m writing this blog post for two reasons. First, to try to add a little bit of weight to what must surely be one of the most searched for phrases on the internet when it comes to SQL Server. Second, just to have a shorthand to answer the question, “Here, check my blog post.”
Right off the bat, database backups are not backing up the log. There. That’s out of the way. Yes, I’m serious. Database backups do not backup the log. They backup all committed data in the database (which might include some or all of what might be in your log file) and they backup some transactions that completed during the backup process as part of the cleanup at the end of the backup (that would be, by definition, be in the log file). But they do not backup the log file. Why is this important?
The transaction log on the database represents the work you are doing to that database. It records the row you inserted, the twenty you deleted and the five that were updated. It records the fact that you dropped a table or truncated a table (and yes, truncation is a logged operation). All of these logged events are written to the log. They are written to the log regardless of the recovery model. What recovery model you ask? Ah, and there we begin to hit our problem.
There are three recovery models, full, bulk-logged and simple. The first and third are the ones that most people use, so they’re all I’m going to worry about for this post. Full recovery means that committed transactions (transactions that have been successfully completed) that are written to the log are retained, even after a checkpoint operation. Simple recovery means that committed transactions are removed from the log when the checkpoint operation runs. A checkpoint is basically when everything in memory gets written out to disk (yes, there’s more to it, but that’s enough for our current conversation). Checkpoints occur at irregular intervals.
Assuming your database is in simple recovery, the log only needs to be big enough to hold the transactions that are uncommitted between checkpoints. Depending on the size and number of your transactions, this could mean a small log, or a huge one. Note, I have not said that your log shrinks at checkpoint. It does not. In simple recovery your log is emptied of committed transactions at checkpoint (and yes, I’m repeating myself). This means any space allocated for the log remains allocated for the log. This is one of the reasons that your log file grows and does not shrink. Now let’s talk about the big one.
I first mentioned full recovery and said that the logs are kept, even beyond a checkpoint. This means they stay in the log, waiting. More transactions are run, and assuming your log is growing by default, it gets bigger and more transactions are in the log waiting. This continues until you finally run a log backup process. This is completely independent from your full backups (although a marker for the last full backup is maintained as a part of the recovery process). The basic syntax looks like this:
BACKUP LOG MyDatabaseName
This is a problem for two basic reasons. By default, new databases are created in full recovery mode AND by default, no one has set up log backups on your system. That means it’s up to you. You have to set up log backups. You have to set them and you have to schedule them and you have to ensure they run if you want to recover your database to a point in time, which is also known as, Full Recovery.
This is the important point, yes, setting the database to simple recovery can largely eliminate the problem of the ever-growing log file. But, simple recovery takes away your ability to recover to a point in time, meaning, crash occurs at 4:57PM. Your last full backup was at 6:00AM. In simple recovery mode, you just lost almost eleven hours worth of business because you can only recovery to the full backup. With Full Recovery, you can do what is known as a tail log backup, and take that in combination with all the other backups and recover your database, at least up to the last log backup, but possibly even right up to when the error occurred.
You need to ask your business, how much data are they prepared to lose. If they can deal with eleven hours, like our example above, great, you don’t need log backups. If, like most businesses I’ve worked with, they expect to recover everything, all the time, you’d better have log backups and full recovery.