On a typical day a few months ago, I received an automated warning in my inbox that a particular server was low on disk space. This came along with the requisite “panic” emails I usually get from other areas who are also on that notification list. I responded that I was looking into it in order to keep the email volume down. :-)
I mapped the volume in question to my local machine and fired up SpaceSniffer to get a quick read on what was taking up all the space. It was a dedicated SQL Server data file drive and the usual suspects were taking up the lion’s share of the space. (SpaceSniffer is a fantastic tool, by the way. You should check it out, if you haven’t already.)
Now, when it comes to disk space issues, there are three basic options:
1. Do nothing. This should always be considered as an approach, though you may rarely use this as a solution. Sometimes you really don’t need to do anything.
2. Reduce the space required. You can relocate files, clean up un-needed data, etc. You can even shrink files if you understand what the consequences would be and have a plan for dealing with that.
3. Increase the space available. In this case, we were on a SAN drive that had available space left, so we could grow the drive to make more space available.
Another quick check showed that most of the data files were over 90% used, and none of them would be able to grow unless more space was made available. That eliminated option 1. There were no other files on that drive to clean up or delete, i.e., no databases to drop, so we couldn’t reduce the amount of space required for option 2. That left us with option 3: Grow the drive. This was an online operation handled off-hours by our SAN admin, and was successful.
After resolving any issue, it’s a good idea to go back and do some root-cause analysis. What caused the disk space issue? One of the first things I do when taking on a new client or job is to set up a monitoring job on every server that keeps a few month’s worth of history on the database files. Using the sys.dm_io_virtual_file_stats DMV I take a snapshot of the disk activity for each data file, as well as it’s size on disk, every 15 minutes. This is then logged to a table in my administrative database on each server.
With the stats in place, I was able to put together a quick query to see which database had grown the most over the last few months. It turned out that one of the smaller databases had now become the second biggest database, all within the space of the last 8 weeks. Checking with the owner of the application, I found out a lot of new data had recently been imported, causing the database to grow by over 600%. After that, it didn’t take much more growth on the part of the other data files to push that disk over its threshold, and send an alert.
Here are the scripts I use, in case you’re interested.
- DDL for FileIOStatsRaw table and vw_FileIOByHour view. – Created in the administrative DB. (In my case, [SQLAdmin].)
- GetFileIOStats stored procedure – Run this as frequently as you want and output to the FileIOStatsRaw table. – Also created in the admin DB. Will keep 90 days by default, but call it how you want in an agent job.
- Trend Query:
SELECT DBName, Yr, Mo, CalculatedGrowthInMBPerMonth = SUM(FileGrowthBytes) / 1024.0 FROM dbo.vw_FileIOByHour WHERE FileType = 'ROWS' GROUP BY DBName, Yr, Mo ORDER BY DBName, Yr, Mo; GO
How do you monitor, measure and alert on disk space usage?
Thanks for reading.