dbTalk Databases Forums  

Ongoing purging of active records causes deadlocks

comp.databases.ms-sqlserver comp.databases.ms-sqlserver


Discuss Ongoing purging of active records causes deadlocks in the comp.databases.ms-sqlserver forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
bobdurie@gmail.com
 
Posts: n/a

Default Ongoing purging of active records causes deadlocks - 07-19-2007 , 03:13 PM






Hi,

We have a Java application that runs against a variety of backends
including Oracle and MSSql 2000 and 2005. Our application has a
handful of active tables that are constantly being inserted into, and
some of which are being selected from on a regular basis.

We have a purge job to remove unneeded records that every hour "DELETE
FROM <table> WHERE <datafield> < <sometimestamp>". This is how we are
purging because we need 100% up time, so we do so every hour. For
some tables the timestamp is 2 weeks ago, others its 2 hours ago. The
date field is indexed in some cases, in others it is not... the
DELETE is always done off of a transaction (autoCommit on), but
experimentation has shown doing it on one doesn't help much.

This task normally functions fine, but every once in a while the
inserts or counts on this table fail with deadlocks during the purge
job. I'm looking for thoughts as to what we could do differently or
other experience doing this type of thing, some possibilities include:
- doing a select first, then deleting one by one. This is a
possibility, but its SLOW and may take over an hour to do this so we'd
be constantly churning deleting single records from the db.
- freezing access to these tables during the purge job... our app
cannot really afford to do this, but perhaps this is the only option.
- doing an update of an "OBSOLETE" flag on the record, then deleting
by that flag... i'm not sure we'd avoid issues doing this, but its'
an option.

The failures happen VERY infrequently on sql2005 and much more
frequently on sql2000. Any help or guidance would be most
appreciated, thanks!

Bob


Reply With Quote
  #2  
Old   
Erland Sommarskog
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-19-2007 , 03:57 PM






bobdurie (AT) gmail (DOT) com (bobdurie (AT) gmail (DOT) com) writes:
Quote:
We have a Java application that runs against a variety of backends
including Oracle and MSSql 2000 and 2005. Our application has a
handful of active tables that are constantly being inserted into, and
some of which are being selected from on a regular basis.

We have a purge job to remove unneeded records that every hour "DELETE
FROM <table> WHERE <datafield> < <sometimestamp>". This is how we are
purging because we need 100% up time, so we do so every hour. For
some tables the timestamp is 2 weeks ago, others its 2 hours ago. The
date field is indexed in some cases, in others it is not... the
DELETE is always done off of a transaction (autoCommit on), but
experimentation has shown doing it on one doesn't help much.

This task normally functions fine, but every once in a while the
inserts or counts on this table fail with deadlocks during the purge
job. I'm looking for thoughts as to what we could do differently or
other experience doing this type of thing, some possibilities include:
- doing a select first, then deleting one by one. This is a
possibility, but its SLOW and may take over an hour to do this so we'd
be constantly churning deleting single records from the db.
- freezing access to these tables during the purge job... our app
cannot really afford to do this, but perhaps this is the only option.
- doing an update of an "OBSOLETE" flag on the record, then deleting
by that flag... i'm not sure we'd avoid issues doing this, but its'
an option.

The failures happen VERY infrequently on sql2005 and much more
frequently on sql2000. Any help or guidance would be most
appreciated, thanks!
I would suggest that you start with posting the CREATE TABLE and
CREATE INDEX table for you table, and the exact statement that
you use. Please also indicate how the typical INSERT operation
looks like, assuming that the INSERT operation is the on the purge
is on conflict with.

One option you could consider is to a SET DEADLOCK_PRIORITY LOW to
the purge job. If there is a deadlock, the purge is the one to go,
and not any other process.


--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


Reply With Quote
  #3  
Old   
Alex Kuznetsov
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-20-2007 , 07:57 AM



On Jul 19, 3:13 pm, "bobdu... (AT) gmail (DOT) com" <bobdu... (AT) gmail (DOT) com> wrote:
Quote:
Hi,

We have a Java application that runs against a variety of backends
including Oracle and MSSql 2000 and 2005. Our application has a
handful of active tables that are constantly being inserted into, and
some of which are being selected from on a regular basis.

We have a purge job to remove unneeded records that every hour "DELETE
FROM <table> WHERE <datafield> < <sometimestamp>". This is how we are
purging because we need 100% up time, so we do so every hour. For
some tables the timestamp is 2 weeks ago, others its 2 hours ago. The
date field is indexed in some cases, in others it is not... the
DELETE is always done off of a transaction (autoCommit on), but
experimentation has shown doing it on one doesn't help much.

This task normally functions fine, but every once in a while the
inserts or counts on this table fail with deadlocks during the purge
job. I'm looking for thoughts as to what we could do differently or
other experience doing this type of thing, some possibilities include:
- doing a select first, then deleting one by one. This is a
possibility, but its SLOW and may take over an hour to do this so we'd
be constantly churning deleting single records from the db.
- freezing access to these tables during the purge job... our app
cannot really afford to do this, but perhaps this is the only option.
- doing an update of an "OBSOLETE" flag on the record, then deleting
by that flag... i'm not sure we'd avoid issues doing this, but its'
an option.

The failures happen VERY infrequently on sql2005 and much more
frequently on sql2000. Any help or guidance would be most
appreciated, thanks!

Bob
Read "Analyzing deadlocks with SQL Server Profiler", "How to resolve a
deadlock", "Resolving Deadlocks in SQL Server 2000". Consider
clustering your tables on your timestamp columns.
BTW, unlike Erland, we set deadlock priority to high so that our
purging interferes less with other activities.

Alex Kuznetsov, SQL Server MVP
http://sqlserver-tips.blogspot.com/



Reply With Quote
  #4  
Old   
bobdurie@gmail.com
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-20-2007 , 10:00 AM



On Jul 20, 8:57 am, Alex Kuznetsov <AK_TIREDOFS... (AT) hotmail (DOT) COM> wrote:
Quote:
On Jul 19, 3:13 pm, "bobdu... (AT) gmail (DOT) com" <bobdu... (AT) gmail (DOT) com> wrote:



Hi,

We have a Java application that runs against a variety of backends
including Oracle and MSSql 2000 and 2005. Our application has a
handful of active tables that are constantly being inserted into, and
some of which are being selected from on a regular basis.

We have a purge job to remove unneeded records that every hour "DELETE
FROM <table> WHERE <datafield> < <sometimestamp>". This is how we are
purging because we need 100% up time, so we do so every hour. For
some tables the timestamp is 2 weeks ago, others its 2 hours ago. The
date field is indexed in some cases, in others it is not... the
DELETE is always done off of a transaction (autoCommit on), but
experimentation has shown doing it on one doesn't help much.

This task normally functions fine, but every once in a while the
inserts or counts on this table fail with deadlocks during the purge
job. I'm looking for thoughts as to what we could do differently or
other experience doing this type of thing, some possibilities include:
- doing a select first, then deleting one by one. This is a
possibility, but its SLOW and may take over an hour to do this so we'd
be constantly churning deleting single records from the db.
- freezing access to these tables during the purge job... our app
cannot really afford to do this, but perhaps this is the only option.
- doing an update of an "OBSOLETE" flag on the record, then deleting
by that flag... i'm not sure we'd avoid issues doing this, but its'
an option.

The failures happen VERY infrequently on sql2005 and much more
frequently on sql2000. Any help or guidance would be most
appreciated, thanks!

Bob

Read "Analyzing deadlocks with SQL Server Profiler", "How to resolve a
deadlock", "Resolving Deadlocks in SQL Server 2000". Consider
clustering your tables on your timestamp columns.
BTW, unlike Erland, we set deadlock priority to high so that our
purging interferes less with other activities.

Alex Kuznetsov, SQL Server MVPhttp://sqlserver-tips.blogspot.com/
Thanks for the responses.

Alex, i will go back and re-read those articles, but we've implemented
as much as we can from them. As for the clustering of the tables
based on the timestamps, thats something i'm totally unfamiliar
with!!! Our schema is created straight from java with as much
database agnostic code as possible, could you perhaps hint at how to
do this for a noob like me?

I've included as much sql and deadlock graphs as i can below from a
test app used to reproduce the issues. Keep in mind to reproduce
these issues we need upwards of 40 select/count/insert threads running
concurrently, and one purge thread almost continuously purging. This
is "basically" how our app works when under heavy load, and we're
hoping we can keep it that way.

Heres the creation cmds:
CREATE TABLE loads (
LoadID BIGINT NOT NULL IDENTITY PRIMARY KEY,
Type INTEGER NOT NULL,
CreateDate DATETIME NOT NULL,
OtherDate DATETIME,
ObjectID INTEGER
)
CREATE INDEX loads_objectid ON loads (ObjectID ASC)
CREATE INDEX loads_type ON loads (Type ASC)

I'll state for the record i'm not sure what the jtds driver is doing
under the hood, so there might be accompanying statements for some of
the below, ie "exec sp_execute 1" and such.

The insert looks likes this:
exec sp_executesql N'INSERT INTO loads (Type, CreateDate, OtherDate,
ObjectID) VALUES ( @P0 , @P1 , @P2 , @P3 )',N'@P0 int,@P1
datetime,@P2 datetime,@P3 int',2,''2007-07-20
09:38:15:477'',NULL,10

Heres a select count:
declare @p1 int
set @p1=1
exec sp_prepare @p1 output,N'',N'SELECT COUNT(*) FROM loads',1
select @p1

Here's a normal select:
declare @p1 int
set @p1=1073741825
declare @p5 int
set @p5=8
declare @p6 int
set @p6=1
exec sp_cursorprepare @p1 output,N'@P0 int',N'SELECT LoadID, Type,
CreateDate, OtherDate, ObjectID FROM loads WHERE Type >= @P0 ORDER
BY loads.LoadID DESC',1,@p5 output,@p6 output
select @p1, @p5, @p6

Heres a purge:
declare @p1 int
set @p1=2
exec sp_prepare @p1 output,N'',N'DELETE FROM loads WHERE CreateDate <
dateadd(SECOND, -40, CURRENT_TIMESTAMP)',1
select @p1


Theres 2 xml deadlock graph details below - the first is one where
many selects and purges are being done and they're both hung up on
each other's page locks. The second is a little more complex but only
involves selects and inserts.
DEADLOCK GRAPH 1:
<deadlock-list>
<deadlock victim="process8cd978">
<process-list>
<process id="process6d9b58" taskpriority="0" logused="8948600"
waitresource="PAGE: 6:1:10567" waittime="5000" ownerId="44602"
transactionname="DELETE" lasttranstarted="2007-07-20T10:17:15.337"
XDES="0xbf8e768" lockMode="IX" schedulerid="1" kpid="4864"
status="suspended" spid="92" sbid="0" ecid="0" priority="0"
transcount="2" lastbatchstarted="2007-07-20T10:17:14.617"
lastbatchcompleted="2007-07-20T10:17:14.617" clientapp="loadtest"
hostname="BDURIE" hostpid="123" loginname="sa" isolationlevel="read
committed (2)" xactid="44602" currentdb="6" lockTimeout="4294967295"
clientoption1="671088672" clientoption2="128056">
<executionStack>
<frame procname="adhoc" line="1" stmtstart="4"
sqlhandle="0x020000003a58ce373592e196e136605d0abbd 1342997e024">
DELETE FROM loads WHERE CreateDate &lt; dateadd(SECOND, -40,
CURRENT_TIMESTAMP) </frame>
</executionStack>
<inputbuf>
()DELETE FROM loads WHERE CreateDate &lt; dateadd(SECOND, -40,
CURRENT_TIMESTAMP) </inputbuf>
</process>
<process id="process8cd978" taskpriority="0" logused="0"
waitresource="PAGE: 6:1:10566" waittime="4906" ownerId="44192"
transactionname="DECLARE CURSOR"
lasttranstarted="2007-07-20T10:17:13.647" XDES="0x3c8b728"
lockMode="S" schedulerid="2" kpid="6180" status="suspended" spid="55"
sbid="0" ecid="0" priority="0" transcount="0"
lastbatchstarted="2007-07-20T10:17:13.647"
lastbatchcompleted="2007-07-20T10:17:13.633" clientapp="loadtest"
hostname="BDURIE" hostpid="123" loginname="sa" isolationlevel="read
committed (2)" xactid="44192" currentdb="6" lockTimeout="4294967295"
clientoption1="671088672" clientoption2="128056">
<executionStack>
<frame procname="adhoc" line="1" stmtstart="18"
sqlhandle="0x0200000035aa872d0616b9c3cde31b14d2698 b1961a31bf1">
SELECT LoadID, Type, CreateDate, OtherDate, ObjectID FROM loads WHERE
Type &gt;= @P0 ORDER BY loads.LoadID DESC </frame>
<frame procname="unknown" line="1"
sqlhandle="0x0000000000000000000000000000000000000 00000000000">
unknown </frame>
</executionStack>
<inputbuf>
SELECT 1 </inputbuf>
</process>
</process-list>
<resource-list>
<pagelock fileid="1" pageid="10567" dbid="6"
objectname="test.dbo.loads" id="lock4e44f40" mode="SIU"
associatedObjectId="72057594099728384">
<owner-list>
<owner id="process6d9b58" mode="IU"/>
<owner id="process8cd978" mode="S"/>
</owner-list>
<waiter-list>
<waiter id="process6d9b58" mode="IX" requestType="convert"/>
</waiter-list>
</pagelock>
<pagelock fileid="1" pageid="10566" dbid="6"
objectname="test.dbo.loads" id="lock104c2a80" mode="IX"
associatedObjectId="72057594099728384">
<owner-list>
<owner id="process6d9b58" mode="IX"/>
</owner-list>
<waiter-list>
<waiter id="process8cd978" mode="S" requestType="wait"/>
</waiter-list>
</pagelock>
</resource-list>
</deadlock>
</deadlock-list>



DEADLOCK GRAPH 2:
<deadlock-list>
<deadlock victim="process6d8898">
<process-list>
<process id="process6d87a8" taskpriority="0" logused="0"
waitresource="PAGE: 6:1:1652" waittime="328" ownerId="5124"
transactionname="SELECT" lasttranstarted="2007-07-20T09:39:11.380"
XDES="0x3c8bad8" lockMode="S" schedulerid="1" kpid="1504"
status="suspended" spid="63" sbid="0" ecid="0" priority="0"
transcount="0" lastbatchstarted="2007-07-20T09:39:11.380"
lastbatchcompleted="2007-07-20T09:39:11.350" clientapp="loadtest"
hostname="BDURIE" hostpid="123" loginname="sa" isolationlevel="read
committed (2)" xactid="5124" currentdb="6" lockTimeout="4294967295"
clientoption1="671088672" clientoption2="128056">
<executionStack>
<frame procname="adhoc" line="1" stmtstart="18"
sqlhandle="0x020000008cf9c12603db8361e8d9c732563e9 2f02d4ef6a0">
SELECT COUNT(*) FROM loads WHERE Type &gt;= @P0 </frame>
</executionStack>
<inputbuf>
(@P0 int)SELECT COUNT(*) FROM loads WHERE Type &gt;= @P0 </
inputbuf>
</process>
<process id="process6d8898" taskpriority="0" logused="0"
waitresource="PAGE: 6:1:1697" waittime="390" ownerId="5131"
transactionname="SELECT" lasttranstarted="2007-07-20T09:39:11.410"
XDES="0x47fc9a8" lockMode="S" schedulerid="1" kpid="6856"
status="suspended" spid="60" sbid="0" ecid="0" priority="0"
transcount="0" lastbatchstarted="2007-07-20T09:39:11.410"
lastbatchcompleted="2007-07-20T09:39:11.397" clientapp="loadtest"
hostname="BDURIE" hostpid="123" loginname="sa" isolationlevel="read
committed (2)" xactid="5131" currentdb="6" lockTimeout="120000"
clientoption1="671088672" clientoption2="128056">
<executionStack>
<frame procname="adhoc" line="1" stmtstart="18"
sqlhandle="0x020000008cf9c12603db8361e8d9c732563e9 2f02d4ef6a0">
SELECT COUNT(*) FROM loads WHERE Type &gt;= @P0 </frame>
</executionStack>
<inputbuf>
(@P0 int)SELECT COUNT(*) FROM loads WHERE Type &gt;= @P0 </
inputbuf>
</process>
<process id="process8cd1f8" taskpriority="0" logused="5724"
waitresource="PAGE: 6:1:1697" waittime="375" ownerId="5117"
transactionname="implicit_transaction"
lasttranstarted="2007-07-20T09:39:11.363" XDES="0x62fc960"
lockMode="IX" schedulerid="2" kpid="3408" status="suspended" spid="57"
sbid="0" ecid="0" priority="0" transcount="2"
lastbatchstarted="2007-07-20T09:39:11.363"
lastbatchcompleted="2007-07-20T09:39:09.800" clientapp="loadtest"
hostname="BDURIE" hostpid="123" loginname="sa" isolationlevel="read
committed (2)" xactid="5117" currentdb="6" lockTimeout="120000"
clientoption1="671088672" clientoption2="128058">
<executionStack>
<frame procname="adhoc" line="1" stmtstart="86"
sqlhandle="0x020000006dd1c217ee7ebab08b2f41f4d5b63 4523d2edfc0">
INSERT INTO loads (Type, CreateDate, OtherDate, ObjectID) VALUES
( @P0 , @P1 , @P2 , @P3 ) </frame>
<frame procname="unknown" line="1"
sqlhandle="0x0000000000000000000000000000000000000 00000000000">
unknown </frame>
</executionStack>
<inputbuf>
(@P0 int,@P1 datetime,@P2 datetime,@P3 int)INSERT INTO loads (Type,
CreateDate, OtherDate, ObjectID) VALUES ( @P0 , @P1 , @P2 ,
@P3 ) </inputbuf>
</process>
<process id="process8cd2e8" taskpriority="0" logused="5936"
waitresource="PAGE: 6:1:1652" waittime="328" ownerId="5118"
transactionname="implicit_transaction"
lasttranstarted="2007-07-20T09:39:11.363" XDES="0x62fcf08"
lockMode="IX" schedulerid="2" kpid="7504" status="suspended" spid="62"
sbid="0" ecid="0" priority="0" transcount="2"
lastbatchstarted="2007-07-20T09:39:11.363"
lastbatchcompleted="2007-07-20T09:39:09.800" clientapp="loadtest"
hostname="BDURIE" hostpid="123" loginname="sa" isolationlevel="read
committed (2)" xactid="5118" currentdb="6" lockTimeout="120000"
clientoption1="671088672" clientoption2="128058">
<executionStack>
<frame procname="adhoc" line="1" stmtstart="86"
sqlhandle="0x020000006dd1c217ee7ebab08b2f41f4d5b63 4523d2edfc0">
INSERT INTO loads (Type, CreateDate, OtherDate, ObjectID) VALUES
( @P0 , @P1 , @P2 , @P3 ) </frame>
<frame procname="unknown" line="1"
sqlhandle="0x0000000000000000000000000000000000000 00000000000">
unknown </frame>
</executionStack>
<inputbuf>
(@P0 int,@P1 datetime,@P2 datetime,@P3 int)INSERT INTO loads (Type,
CreateDate, OtherDate, ObjectID) VALUES ( @P0 , @P1 , @P2 ,
@P3 ) </inputbuf>
</process>
</process-list>
<resource-list>
<pagelock fileid="1" pageid="1652" dbid="6"
objectname="test.dbo.loads" id="lock3b5c780" mode="IX"
associatedObjectId="72057594099269632">
<owner-list>
<owner id="process8cd1f8" mode="IX"/>
</owner-list>
<waiter-list>
<waiter id="process6d87a8" mode="S" requestType="wait"/>
<waiter id="process8cd2e8" mode="IX" requestType="wait"/>
</waiter-list>
</pagelock>
<pagelock fileid="1" pageid="1697" dbid="6"
objectname="test.dbo.loads" id="lock5bbd740" mode="IX"
associatedObjectId="72057594099269632">
<owner-list>
<owner id="process8cd2e8" mode="IX"/>
</owner-list>
<waiter-list>
<waiter id="process6d8898" mode="S" requestType="wait"/>
<waiter id="process8cd1f8" mode="IX" requestType="wait"/>
</waiter-list>
</pagelock>
</resource-list>
</deadlock>
</deadlock-list>






Reply With Quote
  #5  
Old   
Erland Sommarskog
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-20-2007 , 04:50 PM



bobdurie (AT) gmail (DOT) com (bobdurie (AT) gmail (DOT) com) writes:
Quote:
Alex, i will go back and re-read those articles, but we've implemented
as much as we can from them. As for the clustering of the tables
based on the timestamps, thats something i'm totally unfamiliar
with!!! Our schema is created straight from java with as much
database agnostic code as possible, could you perhaps hint at how to
do this for a noob like me?
CREATE CLUSTERED INDEX CreateDateIX ON loads(CreateDate)

and change this to:

LoadID BIGINT NOT NULL IDENTITY PRIMARY KEY NONCLUSTERED,

Quote:
The insert looks likes this:
exec sp_executesql N'INSERT INTO loads (Type, CreateDate, OtherDate,
ObjectID) VALUES ( @P0 , @P1 , @P2 , @P3 )',N'@P0 int,@P1
datetime,@P2 datetime,@P3 int',2,''2007-07-20
09:38:15:477'',NULL,10
And CreateDate is typically now?

Quote:
Heres a purge:
declare @p1 int
set @p1=2
exec sp_prepare @p1 output,N'',N'DELETE FROM loads WHERE CreateDate
dateadd(SECOND, -40, CURRENT_TIMESTAMP)',1
select @p1
40 seconds? Egads, why do you save the data at all? :-)

There is no index at all on CreateDate, so it will have to scan the
full table. It's surprising that you need that many concurrent processes
for it to happen.

If CreateDate is the time for insertion and never changed before the
purge, it would be a good idea to cluster on that column. That would
minimize the point of conflict between the DELETE and the INSERT (but
the non-clustered indexes could still be a conflict area.)

But the SELECT COUNT(*) and the "normal" SELECT is still problematic.
Particularly the normal SELECT looks like it would table scan to me
since the WHERE clause is "Type >= @P0". But that depends on the
values actually used and the distribution in the table.

You could avoid this if you use snapshot isolation, but that is only
available to SQL 2005.

I don't really know what Alex meant that they set their deadlock priority
to high. What I meant is that the purging process sould have SET
DEADLOCK_PRIORITY LOW to declare itself a voluntary victim. In any
case, on SQL 2000 the only deadlock priorties are LOW and NORMAL. SQL
2005 also has HIGH and numbered priorities.



--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


Reply With Quote
  #6  
Old   
Alex Kuznetsov
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-20-2007 , 05:05 PM



On Jul 20, 4:50 pm, Erland Sommarskog <esq... (AT) sommarskog (DOT) se> wrote:
Quote:
bobdu... (AT) gmail (DOT) com (bobdu... (AT) gmail (DOT) com) writes:
Alex, i will go back and re-read those articles, but we've implemented
as much as we can from them. As for the clustering of the tables
based on the timestamps, thats something i'm totally unfamiliar
with!!! Our schema is created straight from java with as much
database agnostic code as possible, could you perhaps hint at how to
do this for a noob like me?

CREATE CLUSTERED INDEX CreateDateIX ON loads(CreateDate)

and change this to:

LoadID BIGINT NOT NULL IDENTITY PRIMARY KEY NONCLUSTERED,

The insert looks likes this:
exec sp_executesql N'INSERT INTO loads (Type, CreateDate, OtherDate,
ObjectID) VALUES ( @P0 , @P1 , @P2 , @P3 )',N'@P0 int,@P1
datetime,@P2 datetime,@P3 int',2,''2007-07-20
09:38:15:477'',NULL,10

And CreateDate is typically now?

Heres a purge:
declare @p1 int
set @p1=2
exec sp_prepare @p1 output,N'',N'DELETE FROM loads WHERE CreateDate
dateadd(SECOND, -40, CURRENT_TIMESTAMP)',1
select @p1

40 seconds? Egads, why do you save the data at all? :-)

There is no index at all on CreateDate, so it will have to scan the
full table. It's surprising that you need that many concurrent processes
for it to happen.

If CreateDate is the time for insertion and never changed before the
purge, it would be a good idea to cluster on that column. That would
minimize the point of conflict between the DELETE and the INSERT (but
the non-clustered indexes could still be a conflict area.)

But the SELECT COUNT(*) and the "normal" SELECT is still problematic.
Particularly the normal SELECT looks like it would table scan to me
since the WHERE clause is "Type >= @P0". But that depends on the
values actually used and the distribution in the table.

You could avoid this if you use snapshot isolation, but that is only
available to SQL 2005.

I don't really know what Alex meant that they set their deadlock priority
to high. What I meant is that the purging process sould have SET
DEADLOCK_PRIORITY LOW to declare itself a voluntary victim. In any
case, on SQL 2000 the only deadlock priorties are LOW and NORMAL. SQL
2005 also has HIGH and numbered priorities.

--
Erland Sommarskog, SQL Server MVP, esq... (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 athttp://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books...
Books Online for SQL Server 2000 athttp://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Erland,

Oops, my bad, you were right about low deadlock priority. Sorry for
that...

Alex Kuznetsov, SQL Server MVP
http://sqlserver-tips.blogspot.com/



Reply With Quote
  #7  
Old   
bobdurie@gmail.com
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-23-2007 , 09:03 AM



On Jul 20, 5:50 pm, Erland Sommarskog <esq... (AT) sommarskog (DOT) se> wrote:
Quote:
bobdu... (AT) gmail (DOT) com (bobdu... (AT) gmail (DOT) com) writes:
Alex, i will go back and re-read those articles, but we've implemented
as much as we can from them. As for the clustering of the tables
based on the timestamps, thats something i'm totally unfamiliar
with!!! Our schema is created straight from java with as much
database agnostic code as possible, could you perhaps hint at how to
do this for a noob like me?

CREATE CLUSTERED INDEX CreateDateIX ON loads(CreateDate)

and change this to:

LoadID BIGINT NOT NULL IDENTITY PRIMARY KEY NONCLUSTERED,

The insert looks likes this:
exec sp_executesql N'INSERT INTO loads (Type, CreateDate, OtherDate,
ObjectID) VALUES ( @P0 , @P1 , @P2 , @P3 )',N'@P0 int,@P1
datetime,@P2 datetime,@P3 int',2,''2007-07-20
09:38:15:477'',NULL,10

And CreateDate is typically now?

Heres a purge:
declare @p1 int
set @p1=2
exec sp_prepare @p1 output,N'',N'DELETE FROM loads WHERE CreateDate
dateadd(SECOND, -40, CURRENT_TIMESTAMP)',1
select @p1

40 seconds? Egads, why do you save the data at all? :-)

There is no index at all on CreateDate, so it will have to scan the
full table. It's surprising that you need that many concurrent processes
for it to happen.

If CreateDate is the time for insertion and never changed before the
purge, it would be a good idea to cluster on that column. That would
minimize the point of conflict between the DELETE and the INSERT (but
the non-clustered indexes could still be a conflict area.)

But the SELECT COUNT(*) and the "normal" SELECT is still problematic.
Particularly the normal SELECT looks like it would table scan to me
since the WHERE clause is "Type >= @P0". But that depends on the
values actually used and the distribution in the table.

You could avoid this if you use snapshot isolation, but that is only
available to SQL 2005.

I don't really know what Alex meant that they set their deadlock priority
to high. What I meant is that the purging process sould have SET
DEADLOCK_PRIORITY LOW to declare itself a voluntary victim. In any
case, on SQL 2000 the only deadlock priorties are LOW and NORMAL. SQL
2005 also has HIGH and numbered priorities.

--
Erland Sommarskog, SQL Server MVP, esq... (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 athttp://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books...
Books Online for SQL Server 2000 athttp://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Thanks again for the thoughts, a few followups:
- CreateDate is typically the time the record was created, so now
yes.
- 40 seconds is just a sample, the date typically goes between 2 hours
and 2 weeks after the record was created.
- Sorry again, the type >= is just a way of getting more data, those
queries are typically = based, and theres never more than 20 unique
values for type.

I probably should've used this analogy before, but consider these
tables like "audit" tables. We simply record lots of audits, do funky
searches on these audits, and then purge these audits at a given
schedule when they're no longer necessary.

We can try the clustered index, but i'm NOT convinced why this is
going to resolve the page lock issue... i've checked the execution
plan for some of our queries and it looks like it sorts first, then
does a clustered index scan of the ID. Is this really going to have
more page locks than if the index was on the date?

About the deadlock priority - i don't think it would be appropriate if
the purge job was the one thats failing, mainly because if it starts
getting behind in its work it will probably just have more and more
difficulty purging the future... the guys doing the inserts/selects
while important, can fail and will retry again later, but its ugly.
I'm (obviously) no dba and am finding it difficult that sql server can
kill clients because it's discovered a deadlock, when just looking at
the competing sql could never lead me to that conclusion.

One final thing that i don't understand at the db level is transaciton
models - using jdbc we're normally using READ_COMMITTED - we could use
SERIALIZABLE but i'm not sure how either of these translates at the db
level. If someone has a pointer to this info with respect to SQL
server (and/or oracle) i'd love to hear it.

Thanks a million again!!!



Reply With Quote
  #8  
Old   
Erland Sommarskog
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-23-2007 , 04:59 PM



bobdurie (AT) gmail (DOT) com (bobdurie (AT) gmail (DOT) com) writes:
Quote:
- Sorry again, the type >= is just a way of getting more data, those
queries are typically = based, and theres never more than 20 unique
values for type.
For a query like:

SELECT LoadID, Type, CreateDate, OtherDate, ObjectID
FROM loads
WHERE Type = @P0
ORDER BY loads.LoadID DESC

An index on (Type, LoadID DESC) appears to be a good idea. If the
query is selective as you say, the optimizer would use the index and
bookmark lookups. And with the ID in the index, the result is already
sorted.

Quote:
We can try the clustered index, but i'm NOT convinced why this is
going to resolve the page lock issue... i've checked the execution
plan for some of our queries and it looks like it sorts first, then
does a clustered index scan of the ID. Is this really going to have
more page locks than if the index was on the date?
The sort is usually performed at the end of the query plan - you should
read it from right to left. Sorts can appear earlier in the plan, if
the optimizer settles for a merge join, which requires sorted input.

I cannot guarantee that the clustered index on CreateDate saves the day,
but I think it will be difficult without it. The way the query is written,
without a parameter, the optimizer will assume a 30% hit-rate which is far
beyond what is good NC-index seek + bookmark lookup. Of course, you could
force the index with an index hint, but I don't think that would be a
good thing.

What is a possible alternative, though, is that the purge first performs:

SELECT @loadid = MAX(LoadID)
FROM loads
WHERE CreateDate < dateadd(SECOND, -40, CURRENT_TIMESTAMP)

And the issues "DELETE loads WHERE LoadID <= @loadid". For this to be
successful, you still need an index on (CreateDate, LoadID DESC).

Quote:
One final thing that i don't understand at the db level is transaciton
models - using jdbc we're normally using READ_COMMITTED - we could use
SERIALIZABLE but i'm not sure how either of these translates at the db
level. If someone has a pointer to this info with respect to SQL
server (and/or oracle) i'd love to hear it.
READ_COMMITTED means that once you have read data, you release the lock
for it. Thus, if you read the same data twice in the same transaction,
you can get different result.

SERIALIZABLE means that you do not tolerate that anyone adds or changes
rows that could affect data you have read. I don't think serializable
will work here. First of all, things will slow down, because they are
just that: serialized. Second, serializable appears to often lead to
deadlocks, because SQL Server is not very good at always taking the
range locks in the same order, or so it seems.

There are a few more possibilities: On SQL 2005, you can use any of the
two varities of SNAPSHOT isolation. With this isolation level, readers
read from the version store, and cannot block writers. Yes, just like
on Oracle. But watch out! If you issue a SELECT COUNT(*) while the purge
is running, do you want the number of rows before or after the purge?
Snapshot will give you the former, that is old and out-dated data.

Then you can go for READ UNCOMMITTED, and do dirty reads. Then your
SELECT COUNT(*) will give you something between before and after. Finally
there is the READPAST hint which means that you skip locked rows. In
this case two, you may also get something in between before and after,
although it's more likely to be closer to after.

....and a completely different approach is to use partitioned tables
(requires SQL 2005 Enterprise). In that case, you would only know and
then alter your partition scheme and partition function. A purge would
be a pure metadata operation. Switch out the old data, and then drop
that table. In this case, you would purge far less often, but when you
do it, it would be snap.

--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


Reply With Quote
  #9  
Old   
Alex Kuznetsov
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-23-2007 , 05:22 PM



On Jul 23, 9:03 am, "bobdu... (AT) gmail (DOT) com" <bobdu... (AT) gmail (DOT) com> wrote:
Quote:
On Jul 20, 5:50 pm, Erland Sommarskog <esq... (AT) sommarskog (DOT) se> wrote:



bobdu... (AT) gmail (DOT) com (bobdu... (AT) gmail (DOT) com) writes:
Alex, i will go back and re-read those articles, but we've implemented
as much as we can from them. As for the clustering of the tables
based on the timestamps, thats something i'm totally unfamiliar
with!!! Our schema is created straight from java with as much
database agnostic code as possible, could you perhaps hint at how to
do this for a noob like me?

CREATE CLUSTERED INDEX CreateDateIX ON loads(CreateDate)

and change this to:

LoadID BIGINT NOT NULL IDENTITY PRIMARY KEY NONCLUSTERED,

The insert looks likes this:
exec sp_executesql N'INSERT INTO loads (Type, CreateDate, OtherDate,
ObjectID) VALUES ( @P0 , @P1 , @P2 , @P3 )',N'@P0 int,@P1
datetime,@P2 datetime,@P3 int',2,''2007-07-20
09:38:15:477'',NULL,10

And CreateDate is typically now?

Heres a purge:
declare @p1 int
set @p1=2
exec sp_prepare @p1 output,N'',N'DELETE FROM loads WHERE CreateDate
dateadd(SECOND, -40, CURRENT_TIMESTAMP)',1
select @p1

40 seconds? Egads, why do you save the data at all? :-)

There is no index at all on CreateDate, so it will have to scan the
full table. It's surprising that you need that many concurrent processes
for it to happen.

If CreateDate is the time for insertion and never changed before the
purge, it would be a good idea to cluster on that column. That would
minimize the point of conflict between the DELETE and the INSERT (but
the non-clustered indexes could still be a conflict area.)

But the SELECT COUNT(*) and the "normal" SELECT is still problematic.
Particularly the normal SELECT looks like it would table scan to me
since the WHERE clause is "Type >= @P0". But that depends on the
values actually used and the distribution in the table.

You could avoid this if you use snapshot isolation, but that is only
available to SQL 2005.

I don't really know what Alex meant that they set their deadlock priority
to high. What I meant is that the purging process sould have SET
DEADLOCK_PRIORITY LOW to declare itself a voluntary victim. In any
case, on SQL 2000 the only deadlock priorties are LOW and NORMAL. SQL
2005 also has HIGH and numbered priorities.

--
Erland Sommarskog, SQL Server MVP, esq... (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 athttp://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books...
Books Online for SQL Server 2000 athttp://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Thanks again for the thoughts, a few followups:
- CreateDate is typically the time the record was created, so now
yes.
- 40 seconds is just a sample, the date typically goes between 2 hours
and 2 weeks after the record was created.
- Sorry again, the type >= is just a way of getting more data, those
queries are typically = based, and theres never more than 20 unique
values for type.

I probably should've used this analogy before, but consider these
tables like "audit" tables. We simply record lots of audits, do funky
searches on these audits, and then purge these audits at a given
schedule when they're no longer necessary.

We can try the clustered index, but i'm NOT convinced why this is
going to resolve the page lock issue... i've checked the execution
plan for some of our queries and it looks like it sorts first, then
does a clustered index scan of the ID. Is this really going to have
more page locks than if the index was on the date?

About the deadlock priority - i don't think it would be appropriate if
the purge job was the one thats failing, mainly because if it starts
getting behind in its work it will probably just have more and more
difficulty purging the future... the guys doing the inserts/selects
while important, can fail and will retry again later, but its ugly.
I'm (obviously) no dba and am finding it difficult that sql server can
kill clients because it's discovered a deadlock, when just looking at
the competing sql could never lead me to that conclusion.

One final thing that i don't understand at the db level is transaciton
models - using jdbc we're normally using READ_COMMITTED - we could use
SERIALIZABLE but i'm not sure how either of these translates at the db
level. If someone has a pointer to this info with respect to SQL
server (and/or oracle) i'd love to hear it.

Thanks a million again!!!
I described how snapshot isolation affects data purging here:

http://www.devx.com/dbzone/Article/32957

Alex Kuznetsov, SQL Server MVP
http://sqlserver-tips.blogspot.com/



Reply With Quote
  #10  
Old   
bobdurie@gmail.com
 
Posts: n/a

Default Re: Ongoing purging of active records causes deadlocks - 07-24-2007 , 06:39 AM



On Jul 23, 5:59 pm, Erland Sommarskog <esq... (AT) sommarskog (DOT) se> wrote:
Quote:
bobdu... (AT) gmail (DOT) com (bobdu... (AT) gmail (DOT) com) writes:
- Sorry again, the type >= is just a way of getting more data, those
queries are typically = based, and theres never more than 20 unique
values for type.

For a query like:

SELECT LoadID, Type, CreateDate, OtherDate, ObjectID
FROM loads
WHERE Type = @P0
ORDER BY loads.LoadID DESC

An index on (Type, LoadID DESC) appears to be a good idea. If the
query is selective as you say, the optimizer would use the index and
bookmark lookups. And with the ID in the index, the result is already
sorted.

We can try the clustered index, but i'm NOT convinced why this is
going to resolve the page lock issue... i've checked the execution
plan for some of our queries and it looks like it sorts first, then
does a clustered index scan of the ID. Is this really going to have
more page locks than if the index was on the date?

The sort is usually performed at the end of the query plan - you should
read it from right to left. Sorts can appear earlier in the plan, if
the optimizer settles for a merge join, which requires sorted input.

I cannot guarantee that the clustered index on CreateDate saves the day,
but I think it will be difficult without it. The way the query is written,
without a parameter, the optimizer will assume a 30% hit-rate which is far
beyond what is good NC-index seek + bookmark lookup. Of course, you could
force the index with an index hint, but I don't think that would be a
good thing.

What is a possible alternative, though, is that the purge first performs:

SELECT @loadid = MAX(LoadID)
FROM loads
WHERE CreateDate < dateadd(SECOND, -40, CURRENT_TIMESTAMP)

And the issues "DELETE loads WHERE LoadID <= @loadid". For this to be
successful, you still need an index on (CreateDate, LoadID DESC).

One final thing that i don't understand at the db level is transaciton
models - using jdbc we're normally using READ_COMMITTED - we could use
SERIALIZABLE but i'm not sure how either of these translates at the db
level. If someone has a pointer to this info with respect to SQL
server (and/or oracle) i'd love to hear it.

READ_COMMITTED means that once you have read data, you release the lock
for it. Thus, if you read the same data twice in the same transaction,
you can get different result.

SERIALIZABLE means that you do not tolerate that anyone adds or changes
rows that could affect data you have read. I don't think serializable
will work here. First of all, things will slow down, because they are
just that: serialized. Second, serializable appears to often lead to
deadlocks, because SQL Server is not very good at always taking the
range locks in the same order, or so it seems.

There are a few more possibilities: On SQL 2005, you can use any of the
two varities of SNAPSHOT isolation. With this isolation level, readers
read from the version store, and cannot block writers. Yes, just like
on Oracle. But watch out! If you issue a SELECT COUNT(*) while the purge
is running, do you want the number of rows before or after the purge?
Snapshot will give you the former, that is old and out-dated data.

Then you can go for READ UNCOMMITTED, and do dirty reads. Then your
SELECT COUNT(*) will give you something between before and after. Finally
there is the READPAST hint which means that you skip locked rows. In
this case two, you may also get something in between before and after,
although it's more likely to be closer to after.

...and a completely different approach is to use partitioned tables
(requires SQL 2005 Enterprise). In that case, you would only know and
then alter your partition scheme and partition function. A purge would
be a pure metadata operation. Switch out the old data, and then drop
that table. In this case, you would purge far less often, but when you
do it, it would be snap.

--
Erland Sommarskog, SQL Server MVP, esq... (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 athttp://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books...
Books Online for SQL Server 2000 athttp://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Thank you very much for the insight!! More followups:

For this query/purge idead:
SELECT @loadid = MAX(LoadID)
FROM loads
WHERE CreateDate < dateadd(SECOND, -40, CURRENT_TIMESTAMP);
DELETE loads WHERE LoadID <= @loadid;

Why would i need an index on CreateDate,LoadiD ? Is it just for the
first query, or would the DELETE actually need it too? I'd be tempted
to do the first query with READ_UNCOMMITTED transaction level, then do
the DELETE on another default transaction isolation level connection.
The performance of the DELETE can be quite bad/slow, etc.., just so
long as it doesn't lock the system up.

To add some color i'm scared to add indexes because the system has
been thoroughly tested and minus the problems we're discussing here
its functioning quite well. Further to that, we support sql server
2000, 2005, and oracle, and have same/similar schemas for all, hence
its testing*3. Let me know if you think i could get away with NOT
adding the index you describe here.

To the point about the SNAPSHOT isolation, i didn't know about either
of those... we could experiment with those, but we'd still be in
trouble with sqlserver2000. Ahhh the joys of a database 'agnostic'
application!!!

Thanks again!!!



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.