Strange Deadlock Issue -
10-09-2007
, 02:30 PM
Hello all,
I am running DB2 UDB v8.1 with fixpack 14 on AIX 5.3. This is a multi-
thread environment. We received a -911 rc=2. Basically, the series
of updates on these two threads do not touch the same rows. Here is
the scenario:
Thread1 <previous transaction>
Thread1 COMMIT
Thread2 <previous transaction>
Thread2 COMMIT
Thread1 UPDATE table1 WHERE col2 = 111111
Thread1 UPDATE successful
Thread2 UPDATE table1 WHERE col2 = 222222
Thread2 UPDATE successful
Thread1 READ table2 (no holds)
Thread1 READ successful
Thread1 READ table1 (no holds)
Thread1 READ successful
Thread2 READ table1 (no holds)
Thread2 READ successful
Thread2 READ table1 (no holds)
Thread2 READ successful
Thread2 UPDATE table1 WHERE col2 = 222222
Thread2 UPDATE successful
Thread1 UPDATE table1 WHERE col2 = 111111 <--------seems to be where
contention started for thread1
Thread2 READ table2 WHERE col1 = ZZZZZZ
Thread2 READ successfull
Thread2 UPDATE table2 WHERE col1 = ZZZZZZ
Thread2 UPDATE successful
Thread2 UPDATE table1 WHERE col2 = 222222
Thread1 UPDATE successful
Thread2 UPDATE FAILED sqlcode -911 <----------------------------
thread 2 fails
As you can see, none of the rows between the two threads are
connected. I dont understand why we would get a deadlock but here is
what I see in the deadlock monitor:
67) Deadlock Event ...
Deadlock ID: 1
Number of applications deadlocked: 2
Deadlock detection time: 10/05/2007 02:07:53.408257
Rolled back Appl participant no: 2
Rolled back Appl Id: *LOCAL.hdbuser.071004215757
Rolled back Appl seq number: : 0008
68) Connection Header Event ...
Appl Handle: 22
Appl Id: *LOCAL.xxuser.071004215757
Appl Seq number: 0008
DRDA AS Correlation Token: *LOCAL.xxuser.071004215757
Program Name : xxserver
Authorization Id: XXUSER
Execution Id : root
Codepage Id: 819
Territory code: 1
Client Process Id: 233142
Client Database Alias: XXALIAS
Client Product Id: SQL08027
Client Platform: AIX
Client Communication Protocol: Local
Client Network Name: xxnetwork
Connect timestamp: 10/04/2007 21:57:57.648385
69) Deadlocked Connection ...
Deadlock ID: 1
Participant no.: 2
<-----------------------------------------------------------------------------------
Here is P2
Participant no. holding the lock: 1
Appl Id: *LOCAL.xxuser.071004215757
Appl Seq number: 0008
Appl Id of connection holding the lock: *LOCAL.xxuser.071004215753
Seq. no. of connection holding the lock: 0009
Lock wait start time: 10/05/2007 02:07:47.192943
Lock Name : 0x00030004000080040000000052
Lock Attributes : 0x00000000
Release Flags : 0x00000004
Lock Count : 1
Hold Count : 0
Current Mode : none
Deadlock detection time: 10/05/2007 02:07:53.476248
Table of lock waited on : TABLE1
Schema of lock waited on : XXUSER
Tablespace of lock waited on : TABLE1_TS
Type of lock: Row
Mode of lock: X - Exclusive
Mode application requested on lock: U - Update
Node lock occured on: 0
Lock object name: 32772
<--------------------------------------------This should be P1's
object (we are not updating it!)
Application Handle: 22
Deadlocked Statement:
Type : Dynamic
Operation: Fetch
Section : 1
Creator : DB2USER
Package : UPDATREC
Cursor : C1
<--------------------------------------------------------------- uses
cursor C1
Cursor was blocking: FALSE
Text : SELECT * FROM TABLE1 WHERE col1 = '111111' FOR UPDATE
OF xxxx
List of Locks:
Lock Name : 0x000002FA0000000100014A0056
Lock Attributes : 0x00000000
Release Flags : 0x40000000
Lock Count : 1
Hold Count : 0
Lock Object Name : 0
Object Type : Internal - Variation
Mode : S - Share
Lock Name : 0x0005000400008E010000000052
Lock Attributes : 0x00000000
Release Flags : 0x00000002
Lock Count : 1
Hold Count : 0
Lock Object Name : 36353
Object Type : Row
Tablespace Name : TABLE2_TS
Table Schema : XXUSER
Table Name : TABLE2
Mode : X - Exclusive
Lock Name : 0x00030004000080050000000052
Lock Attributes : 0x00000000
Release Flags : 0x00000001
Lock Count : 1
Hold Count : 0
Lock Object Name : 32773 <----------------------this
is the object it SHOULD be trying to lock again
Object Type : Row
Tablespace Name : TABLE1_TS
Table Schema : XXUSER
Table Name : TABLE1
Mode : X - Exclusive
Lock Name : 0x555044415452454318F89A3041
Lock Attributes : 0x00000000
Release Flags : 0x40000000
Lock Count : 1
Hold Count : 0
Lock Object Name : 0
Object Type : Internal - Plan
Mode : S - Share
Lock Name : 0x5245414452454320A008FD2641
Lock Attributes : 0x00000000
Release Flags : 0x40000000
Lock Count : 1
Hold Count : 0
Lock Object Name : 0
Object Type : Internal - Plan
Mode : S - Share
Lock Name : 0x00030004000000000000000054
Lock Attributes : 0x00000000
Release Flags : 0x40000005
Lock Count : 2
Hold Count : 1
Lock Object Name : 4
Object Type : Table
Tablespace Name : TABLE1_TS
Table Schema : XXUSER
Table Name : TABLE1
Mode : IX - Intent Exclusive
Lock Name : 0x00050004000000000000000054
Lock Attributes : 0x00000000
Release Flags : 0x00000002
Lock Count : 1
Hold Count : 0
Lock Object Name : 4
Object Type : Table
Tablespace Name : TABLE2_TS
Table Schema : XXUSER
Table Name : TABLE2
Mode : IX - Intent Exclusive
Locks Held: 7
Locks in List: 7
70) Connection Header Event ...
Appl Handle: 20
Appl Id: *LOCAL.xxuser.071004215753
Appl Seq number: 0009
DRDA AS Correlation Token: *LOCAL.xxuser.071004215753
Program Name : xxserver
Authorization Id: XXUSER
Execution Id : root
Codepage Id: 819
Territory code: 1
Client Process Id: 233142
Client Database Alias: XXALIAS
Client Product Id: SQL08027
Client Platform: AIX
Client Communication Protocol: Local
Client Network Name: xxnetwork
Connect timestamp: 10/04/2007 21:57:53.758032
71) Deadlocked Connection ...
Deadlock ID: 1
Participant no.: 1
<---------------------------------------------------------------------
Here is P1
Participant no. holding the lock: 2
Appl Id: *LOCAL.xxuser.071004215753
Appl Seq number: 0009
Appl Id of connection holding the lock: *LOCAL.xxuser.071004215757
Seq. no. of connection holding the lock: 0008
Lock wait start time: 10/05/2007 02:07:46.779807
Lock Name : 0x00030004000080050000000052
Lock Attributes : 0x00000000
Release Flags : 0x00000001
Lock Count : 1
Hold Count : 0
Current Mode : none
Deadlock detection time: 10/05/2007 02:07:53.496921
Table of lock waited on : TABLE1
Schema of lock waited on : XXUSER
Tablespace of lock waited on : TABLE1_TS
Type of lock: Row
Mode of lock: X - Exclusive
Mode application requested on lock: U - Update
Node lock occured on: 0
Lock object name: 32773
<------------------------------------------------------------but,
but..this is P2's object
Application Handle: 20
Deadlocked Statement:
Type : Dynamic
Operation: Fetch
Section : 1
Creator : DB2USER
Package : UPDATREC
Cursor : C1
<--------------------------------------------------------------------------------
C1 again (is this not a static?)
Cursor was blocking: FALSE
Text : SELECT * FROM PDEV WHERE device = '/dev/rmt20' FOR
UPDATE OF su
ccessive_error_count
List of Locks:
Lock Name : 0x00030004000080040000000052
Lock Attributes : 0x00000000
Release Flags : 0x00000001
Lock Count : 1
Hold Count : 0
Lock Object Name : 32772
Object Type : Row
Tablespace Name : TABLE1_TS
Table Schema : XXUSER
Table Name : TABLE1
Mode : X - Exclusive
Lock Name : 0x00000284000000010001330056
Lock Attributes : 0x00000000
Release Flags : 0x40000000
Lock Count : 1
Hold Count : 0
Lock Object Name : 0
Object Type : Internal - Variation
Mode : S - Share
Lock Name : 0x555044415452454318F89A3041
Lock Name : 0x555044415452454318F89A3041
Lock Attributes : 0x00000000
Release Flags : 0x40000000
Lock Count : 1
Hold Count : 0
Lock Object Name : 0
Object Type : Internal - Plan
Mode : S - Share
Lock Name : 0x5245414452454320A008FD2641
Lock Attributes : 0x00000000
Release Flags : 0x40000000
Lock Count : 1
Hold Count : 0
Lock Object Name : 0
Object Type : Internal - Plan
Mode : S - Share
Lock Name : 0x00030004000000000000000054
Lock Attributes : 0x00000000
Release Flags : 0x00000001
Lock Count : 1
Hold Count : 1
Lock Object Name : 4
Object Type : Table
Tablespace Name : TABLE1_TS
Table Schema : XXUSER
Table Name : TABLE1
Mode : IX - Intent Exclusive
Locks Held: 5
Locks in List: 5
So, both of these threads call the same APIs (One that does a SELECT
to read a record and one that does an UPDATE. Both APIs are set up for
multithread. The UPDATE API uses the cursor C1 continuously. Could
this somehow cause this sort of problem between process/threads? Is
the cursor name not static.
I dont think it is necessarily acting like a global because this code
has been running for a year and a half and I havent seen this type of
problem before. However, we dont have very many functions that need
to perform a series of updates before it can commit. This particular
setup is busier in regards to this process so perhaps that is why we
are just now seeing the problem.
Any observations, comments, cajoling ? |