Menu Bar

Tuesday, December 27, 2011

Examining what's causing blocking


"I prepared Explosive Ru ..." **BOOM** - Vaarsuvius, Order of the Stick webcomic
We have had some fairly basic monitoring for blocking that lasts for an extended period of time running on our primary servers for some time now. In the last few days, we've started getting some alerts in the wee hours of the morning, every night, pretty much like clockwork.

It doesn't last long, but it's routine - and at 1 AM each morning, it's annoying.

I wasn't quite sure how to track this down, so I threw out a quick tweet to #sqlhelp and @SQLSoldier suggested capturing some wait information.

First, I created a table that mirrored sys.dm_os_waiting_tasks, with an extra column up front for the Date/Time. Next, I created the following little stored procedure.

create procedure [dbo].[GetWaitingTasks]
as
begin
insert into prod_waiting_tasks 
select getdate(), * from prod.sys.dm_os_waiting_tasks
end;

I scheduled that to run, every 1 minute in SQL Server Agent, between 12:30 AM and 1:30 AM.

When I queried this table the next morning, I saw a bunch of these messages:

pagelock fileid=3 pageid=8969941 dbid=11 id=lock1b2c6500 mode=IX associatedObjectId=72057594959429632

Checking into sys.databases, I confirmed that dbid=11 was my prod database.
Checking into sys.master_files, I confirmed that the file was file containing most of the data files in the darabase.

I also had a page number in the datafile. The problem was how to work out what object it belonged to.

That's where an undocumented feature came to my rescue. Paul Randall blogged about DBCC PAGE here.

dbcc traceon (3604)
dbcc page (11, 3, 8969941, printopt=1) 

On of the pieces of output from this was the object id.

Follow that up with:

select * from sys.objects where object_id = 738101670

Once I had run that and had an object that was the cause of my nightly blocking, I was able to backtrack through the job history tables that the application provides to work out where my locking was coming from.