DBCC DBINFO

Often, I like to poke around inside databases just for fun. Just to look at the structures. It’s kind of like wandering around unexplored territory, and recognizing familiar architectures – appealing to my sense of curiosity. My first tool for the task is usually the undocumented DBCC DBINFO(). While I normally don’t encourage the use of undocumented commands, this one is safe, and doesn’t appear to be going anywhere.

Let’s look at the results for a sample database, and discuss the output. Run the following code on the database of your choice. The DB results shown here are from a demonstration database from my corruption presentation:

USE [master];
GO
/* Send output to console, required by DBINFO(). */
DBCC TRACEON(3604);
GO
/* Using TABLERESULTS for nicely formatted output. */
DBCC DBINFO(DemoRestoreOrRepair)
WITH TABLERESULTS;
GO
DBINFO Results

We get a lot of information out of this. Much of it can also be obtained by querying sys.databases, but some of it cannot. The most interesting bits to me are:

  • dbi_Version and dbi_CreateVersion – tell you which version of SQL Server the database is at now, and which version it was created in. A mismatch here indicates the DB has been upgraded.
  • dbi_DBCCFlags – A value of 2 tells you that data purity checks are being run by default when CHECKDB runs on this database. 0 means they are not, and you’ll have to specify that as an option for the CHECKDB command next time you run it. (If you *want* data purity checks, that is.)
  • dbi_DBCCLastKnownGood – The last time DBCC CHECKDB was run successfully on this database.
  • dbi_LastLogBackupTime – Just what it says on the label, the last time the log was backed up.

There are also a lot of entries recording the log sequence numbers that the database was on at specific points in time, such as:

  • dbi_dbbackupLSN for the last backup.
  • dbi_DifferentialBaseLSN for the last full backup, which would be required to start the restore sequence.
  • dbi_checkptLSN for the last checkpoint.
  • dbi_DirtyPageLSN for the last time a transaction was committed.
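
If you want to pull one of these values out programmatically instead of reading the grid, you can capture the TABLERESULTS output (ParentObject, Object, Field, Value columns) into a temp table with INSERT ... EXEC and filter on the field name. A minimal sketch against the same demo database; the field names in the WHERE clause are copied from the output above, and string comparisons are case-insensitive under typical collations:

/* Trace flag 3604 should already be on from the example above. */
CREATE TABLE #dbinfo
(
    ParentObject nvarchar(255),
    Object       nvarchar(255),
    Field        nvarchar(255),
    Value        nvarchar(255)
);

/* Capture the TABLERESULTS rows so they can be queried. */
INSERT INTO #dbinfo (ParentObject, Object, Field, Value)
EXEC ('DBCC DBINFO(DemoRestoreOrRepair) WITH TABLERESULTS;');

/* Pull out a couple of the interesting fields. */
SELECT Field, Value
FROM #dbinfo
WHERE Field IN ('dbi_DBCCLastKnownGood', 'dbi_LastLogBackupTime');

DROP TABLE #dbinfo;
GO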

Interesting stuff, I think.  What about you?

Thanks for reading.

-David.

sp_update_job and Editing System Tables

The initial problem: We have a particular situation that involves some scheduling. At the end of the month, we need to do some additional ETL processing for financial reporting. At that time, we want to avoid competing for resources with the other, scheduled jobs that would normally run during that time of day.

The first solution: So that we don’t compete for server resources with other jobs, we create a job that disables jobs at the end of the month, and re-enables them after the month-end processing is done.

IF DATEPART(day,
    DATEADD(day,1,CURRENT_TIMESTAMP)
   ) = 1 /* If tomorrow is the first of the month */
BEGIN
  UPDATE msdb.dbo.sysjobs
  SET enabled = 0
  WHERE name = 'Daily ETL Master'
END

The second problem: After scheduling the above code to run on the last day of the month, we open up the Job Activity Monitor and see that the jobs are disabled. However, we also notice that the jobs have run on schedule, despite being disabled. What happened?

The diagnosis: Two things went wrong here.

First, note that when the SQL Server Agent starts up, it reads the scheduled job information from the sysjobs table and caches it. For the Agent to know that the system tables have changed, we need to notify it that its cache needs to be refreshed. By now, we should be thinking stored procedure instead of single statement, which is good, because SQL Server already supplies a stored procedure to do just that.

sp_update_job takes several parameters, one of which is @enabled. This will either enable or disable the job when set to 1 or 0, respectively. The procedure will also cause the Agent to refresh its cache if a cached attribute has been updated. The enabled flag is cached, so in this case the code should read like this:

IF DATEPART(day,
    DATEADD(day,1,CURRENT_TIMESTAMP)
   ) = 1 /* If tomorrow is the first of the month */
BEGIN
  EXECUTE msdb.dbo.sp_update_job
    @job_name = 'Daily ETL Master',
    @enabled = 0
END

Now the job’s status will be updated, and the Agent’s cache will be refreshed appropriately.

Second, and this is even more important: since we’ve scheduled this code to run on the last day of the month, is there a time when the IF statement would ever evaluate false? Nope. We’ve written an IF statement that will only ever evaluate one way, which means it isn’t even necessary in this case.

The final solution: This job only needs to run twice: on the last day of the month and on the first. SQL Server Agent lets us define both of those schedules. So, we can create two job schedules, “Last Day of Month” and “First Day of Month”, and assign both to a job that runs the following:

IF DATEPART(day,CURRENT_TIMESTAMP)= 1 /* First of the month */
BEGIN
  EXECUTE msdb.dbo.sp_update_job
    @job_name = 'Daily ETL Master',
    @enabled = 1
END
ELSE 
BEGIN
  EXECUTE msdb.dbo.sp_update_job
    @job_name = 'Daily ETL Master',
    @enabled = 0
END
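
Creating and attaching those two schedules can be scripted as well. Here is a minimal sketch using msdb.dbo.sp_add_schedule and sp_attach_schedule. The @freq_type, @freq_interval, and @freq_relative_interval values are the monthly and monthly-relative codes as I recall them from Books Online, and the job name is just a placeholder for whatever you call the job that runs the IF/ELSE above, so adjust both before relying on this:

USE [msdb];
GO
/* Runs on day 1 of every month. */
EXECUTE dbo.sp_add_schedule
  @schedule_name = N'First Day of Month',
  @enabled = 1,
  @freq_type = 16,              /* monthly */
  @freq_interval = 1,           /* on day 1 */
  @freq_recurrence_factor = 1,  /* every month */
  @active_start_time = 0;       /* at midnight */
GO
/* Runs on the last day of every month. */
EXECUTE dbo.sp_add_schedule
  @schedule_name = N'Last Day of Month',
  @enabled = 1,
  @freq_type = 32,              /* monthly, relative */
  @freq_relative_interval = 16, /* "last" */
  @freq_interval = 8,           /* "day" */
  @freq_recurrence_factor = 1,  /* every month */
  @active_start_time = 0;
GO
/* Attach both schedules to the job that toggles the ETL job.
   The job name here is a placeholder. */
EXECUTE dbo.sp_attach_schedule
  @job_name = N'Toggle Daily ETL Master',
  @schedule_name = N'First Day of Month';
EXECUTE dbo.sp_attach_schedule
  @job_name = N'Toggle Daily ETL Master',
  @schedule_name = N'Last Day of Month';
GO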

Thanks for reading.

-David.

Shrink Is A Four Letter Word

Recently, I’ve been noticing some large-scale log growth on some of my developer SQL Servers, which has been causing them to run out of disk space. Though shrinking files is almost universally a bad idea, there is a time and a place for it, provided the ramifications are understood. Needing to shrink a log file indicates that we’re either operating on more data than we expected, or operating inefficiently. Either way, it’s almost always the result of unplanned circumstances, and in those situations you can’t always use the option you want to. Sometimes you have to do something regrettable.

When *NOT* to shrink:

  • When the growth happens in a Data file. Shrinking is going to rob you of free space required for index maintenance, not to mention fragmenting the hell out of your existing indexes, making them near useless.
  • When the growth is in TempDB. Shrinking tempdb while it’s in use can cause corruption. Either add tempdb files (a sketch of adding a file follows this list), or kill the offending transaction and let tempdb be recreated at the next restart. You’ll need to restart SQL Server for that, and killing a transaction that big may cause a very long rollback. You’ve been warned.
  • As part of regular maintenance.  If your log files are growing repeatedly, then they’re trying to get to the size they need to be in order to handle your workload. Why are you shrinking them? What do you have against them? They have dreams, too…  Fix the workload, not the log files.
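
For the tempdb case, adding a data file on a disk that has room is usually the safer move. A minimal sketch; the logical name, path, and sizes below are placeholders to adjust for your own server:

/* Add a tempdb data file instead of shrinking.
   NAME, FILENAME, and sizes are placeholders. */
ALTER DATABASE [tempdb]
ADD FILE
(
    NAME = N'tempdev2',
    FILENAME = N'T:\TempDB\tempdev2.ndf',
    SIZE = 4096MB,
    FILEGROWTH = 512MB
);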

Alternatives to shrinking:

  • Add disk space, or add another file on a disk that still has room.
  • Break large transactions into smaller batches so the log doesn’t need to grow as far (more on this below).
  • Fix the workload that’s causing the growth in the first place.

When to shrink: I can only think of two situations off the top of my head:

  • In an emergency, where other workloads on the same disk are affected and there’s no additional disk elsewhere for another log file.
  • When a database is being archived / retired, and will be set to read-only. Even in that case, you wouldn’t just shrink the DB. See Retiring a Database.

Benefits to shrinking:

  • Free disk space.
  • That’s about it.

Caveats to shrinking:

  • Massive performance hit during re-growth, as all transactions must be suspended.
  • Massive performance hit due to file fragmentation on re-growth.
  • Massive performance hit due to internal fragmentation on shrink.
  • Massive perfor…. You get the idea.

So really, we want to avoid shrinking log files wherever possible. The best way to avoid shrinking is to size files properly to begin with. Make sure you have enough room for your transactions to run. If you work with large transactions or volumes of data, one good way to avoid file growth is to break large transactions into smaller batches. For example, instead of inserting 5M rows at once, insert 100k rows at a time, 50 times. This can actually be faster than a single large insert, as you’re not stuck waiting for the log file to grow to accommodate 5M records.
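
Here’s a minimal sketch of that batching pattern. The table and column names are made up for illustration; the point is simply that each chunk commits on its own, so no single transaction needs 5M rows’ worth of log space:

/* Hypothetical example: copy rows from dbo.StagingSales into
   dbo.Sales 100k rows at a time instead of all at once. */
DECLARE @BatchSize int = 100000,
        @RowsMoved int = 1;

WHILE @RowsMoved > 0
BEGIN
    INSERT INTO dbo.Sales (SaleID, SaleDate, Amount)
    SELECT TOP (@BatchSize) s.SaleID, s.SaleDate, s.Amount
    FROM dbo.StagingSales AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Sales AS t
                      WHERE t.SaleID = s.SaleID);

    SET @RowsMoved = @@ROWCOUNT;  /* 0 when there's nothing left to copy */
END

Keep in mind that under the FULL recovery model the log space used by earlier batches only becomes reusable after a log backup, so batching alone won’t cap log growth there.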

Some growth may still occur since, let’s face it, we have a lot of data to work with. What we want to avoid is large, recurring growth. Remember that when the log file must grow, all transactions are suspended.  If you are only growing slightly, that may not be a problem, but when you have 100GB of growth, that’s going to stop you in your tracks for the duration of the growth.

If you must shrink a log file due to uncontrolled or unexpected growth, you can use the SHRINKFILE command, but with one caveat: I would recommend shrinking the file by half instead of to its smallest possible size. This way, future growth may be avoided while queries are tuned to use less log space. For example, if a log that normally averages 2GB were to grow to 40GB, I would shrink it to 20GB, like this:

USE <DatabaseName>;
DBCC SHRINKFILE(<file ID>,<target size>);

Example:

USE BigLogDB;
DBCC SHRINKFILE(2,20000);

2 is the ID of the log file for most databases. You can use sp_helpfile on a database to check that. 20000 is the target size, which is always in MB. Shrinking requires membership in the sysadmin server role or the db_owner role on the database, which some developers do have. I can also assist you with these tasks as needed. Another advantage to this approach is that it gives us an idea of how large the production log files will need to be to accommodate new SQL code.
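
If you’d rather not guess at the file ID, a quick query against sys.database_files (run in the database you’re shrinking) shows it along with the current size. The size column is in 8 KB pages, so divide by 128 for MB:

USE BigLogDB;
GO
/* type_desc = 'LOG' identifies the log file. */
SELECT file_id,
       name,
       type_desc,
       size / 128 AS size_mb
FROM sys.database_files;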

Also, I would like to stress that shrinking should be treated as a temporary fix, not as a workaround, and *never* as part of regular maintenance. If you have a job on a server that is regularly shrinking log files, then you either need more disk space or smaller transactions. Regular shrinking is a sign that something bad is going on. And while shrinking can be considered for log files, it is almost never a good idea to shrink a data file. Avoid using SHRINKDATABASE.

Thanks for reading.
-David.