If you’re like me, your first experience with SQL Server database recovery models was when you tried to to take a log backup on a database in SIMPLE recovery mode, and got an error for your troubles. In fairness, back in SQL Server 2000, we didn’t have the handy “Ignore database in SIMPLE recovery mode” checkbox, so we’ve got an excuse there… So what do these recovery models really mean, and what can we do or not do with them?
In short, recovery models have as much to do with how the transaction log is used as how we can back up the database. Log backups are driven both by the frequency with which space in the transaction log is marked for re-use, and with how we want to recover the database. So let’s look at it from both of those perspectives.
Transaction Log Re-use
First, let’s look at exactly what populates the transaction log, and how we use that space. When you modify something in a database, that data modification happens in memory. First, a record of the details of that modification are written to the transaction log, so that we can recover that transaction if the system crashes. Once a CHECKPOINT occurs, those modified or “dirty” pages in the buffer are written to the data file. However, we only need to keep the copy in the transaction log if we want to be able to recover the database to a specific point in time, or to roll back (or forward) a transaction. So the space in the transaction log can be cleared to keep it low.
In SIMPLE recovery mode, when a CHECKPOINT occurs, space in the transaction log that isn’t being used for an active transaction is marked for reuse, or “truncated”. The space is then available for use by other transactions.
In FULL or BULK_LOGGED recovery mode, CHECKPOINT still writes dirty pages to the disk files, but does not mark that space in the transaction log as free. That space is still considered in use.
As far as the transaction log is concerned, there are two types of backups. Full or Differential Backups, and Transaction Log backups.
When a FULL or DIFFERENTIAL backup occurs, a CHECKPOINT is issued, and the data file or files are backed up, along with just enough transaction log to be able to go through the recovery phase of restore. However, the transaction log is not truncated, meaning the space holding the transaction information is not marked for reuse.
When a LOG backup starts, a CHECKPOINT is issued and the pages in the log file that have been flushed from cache are backed up. Log backups ignore the data files. Once the log backup is complete, the space in the log file that is not being used for active transactions, is then marked for reuse.
With that in mind, let’s look at what the different recovery models offer.
In SIMPLE recovery mode, you can take FULL or DIFFERENTIAL backups, but not log backups. Log space is automatically marked for reuse by CHECKPOINT, so size of the transaction log is kept under control without log backups. However, since that is the case, any changes done since the last FULL or DIFFERENTIAL backup are lost in the event of a disaster. You’re only as good as your last backup. All work since that last backup would be lost.
In the FULL recovery mode, you can take FULL and DIFFERENTIAL backups, and you must also take LOG backups as well. LOG backups allow you to do two things: First, they mark the transaction log space available for reuse, keeping the log size down. Second, LOG backups allow you to recover to a specific point in time between backups, so as long as the log itself is not damaged, no work would be lost. The only down side to this is that very large operations can still grow the log very quickly.
BULK_LOGGED recovery mode splits the difference between SIMPLE and FULL recovery. Certain large operations are not logged, including bulk inserts, SELECT INTO queries, and some index maintenance operations. You can take FULL or DIFFERENTIAL backups, and must also take LOG backups to mark the log space for reuse. The upside is that not logging those bulk operations contributes to keeping the log size down. Due to that, however, you cannot recover to a point in time like you can with the FULL recovery model. You can still take a final log backup and recover to the most recent backup point, but you cannot recover to a point in between backups like you can with the FULL recovery model.
Backup strategy is driven first by the business recovery requirements, and second by the necessity to manage the transaction log size.
Scenario 1 – The few users that use this data only make minimal modifications to the database throughout the day, and none of those change much data. The work can easily be re-done. This is a good situation for using the SIMPLE recovery model, since the amount of transaction log use is low, and there is no major loss of time should the data need to be re-entered.
Scenario 2 – Business users make many modifications to the database throughout the day. There is a legal requirement to be able to view the state of the database at any point in time. Due to the volume of users, the transaction log can grow rapidly if left unchecked. This is a good situation for the FULL recovery model, mostly due to the requirement to be able to restore the database to any point in time. Frequent log backups will also keep the transaction log size down.
Scenario 3 – A small development group uses this database for testing queries on large amounts of data which is loaded into the database via BULK INSERT. The group would like to lose minimal amounts of data in the case of a disaster, but can accept about 4 hours worth of re-work. This is a great case for using the BULK_LOGGED recovery model, since point-in-time recovery is not required, and the database is a frequent target of large operations.
Another way I have seen recovery models used is to switch the database into BULK_LOGGED recovery mode for index maintenance, then put it back in FULL once maintenance is complete. In fact, this can also be good if you have a nightly process that does bulk inserts into the database. You get to run the large inserts without bloating the log, and still get point in time recovery for daily user operations.
One last thing, now that I have mentioned switching recovery models. Note that switching from the FULL or BULK_LOGGED recovery model to the SIMPLE recovery model will break the log backup chain. This is fine if you no longer need to recover from log backups, but if you do, then once you switch back to FULL or BULK_LOGGED, you will need to take a FULL backup to get back to a state that will allow recovery from log backups.
Thanks for reading.