dbTalk Databases Forums  

MERGE statement consuming all available CPU

comp.databases.ibm-db2 comp.databases.ibm-db2


Discuss MERGE statement consuming all available CPU in the comp.databases.ibm-db2 forum.



Reply
 
Thread Tools Display Modes
  #11  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 02-25-2011 , 01:35 AM






On 2011-02-24 23:20, Michel Esber wrote:
Quote:
Roughly 2000 logical index reads per execution, does not look that
extreme. What is the definition of the index used on
ASSET.TBL_ASSET_NET_DETAIL (SYSIBM.SQL110214080842620 ?) and what is
FIRSTKEYCARD, etc for that index?

/Lennart

Hi Lennart,

The index definition is (+MACHINE_ID) only, reverse_scans = 'Y'

Here are some other fields:

PCTFREE NLEAF NLEVELS FIRSTKEYCARD
FIRST2KEYCARD FIRST3KEYCARD FIRST4KEYCARD
FULLKEYCARD
------- -------------------- ------- --------------------
-------------------- -------------------- --------------------
--------------------
-1 432 3
75912 -1 -1
-1 75912

I'm a bit puzzled. The index is almost unique, so I don't understand the
amount of index logical reads (it's not biljons of them, so I'm not sure
that they are the real problem). Do you have a testsystem where you can
try to exchange:

WHEN MATCHED
AND ((E.MACHINE_HOSTNAME <> INDATA.MACHINE_HOSTNAME OR
E.MACHINE_IPDOMAIN <> INDATA.MACHINE_IPDOMAIN OR
E.LOGIN_NAME <> INDATA.LOGIN_NAME OR
E.LAN_GROUP_TYPE <> INDATA.LAN_GROUP_TYPE OR
E.LAN_GROUP_NAME <> INDATA.LAN_GROUP_NAME OR
E.FIRE_LOGIN_TRIGGERS <> INDATA.FIRE_LOGIN_TRIGGERS))
THEN

with

WHEN MATCHED
AND NOT ((E.MACHINE_HOSTNAME = INDATA.MACHINE_HOSTNAME AND
E.MACHINE_IPDOMAIN = INDATA.MACHINE_IPDOMAIN AND
E.LOGIN_NAME = INDATA.LOGIN_NAME AND
E.LAN_GROUP_TYPE = INDATA.LAN_GROUP_TYPE AND
E.LAN_GROUP_NAME = INDATA.LAN_GROUP_NAME AND
E.FIRE_LOGIN_TRIGGERS = INDATA.FIRE_LOGIN_TRIGGERS))
THEN

do a testrun and execute:

select NUM_EXECUTIONS, ROWS_READ, POOL_DATA_L_READS, POOL_DATA_P_READS,
POOL_INDEX_L_READS, POOL_INDEX_P_READS, TOTAL_EXEC_TIME,
TOTAL_USR_CPU_TIME
from sysibmadm.snapdyn_sql
where cast(stmt_text as varchar(2000)) like 'MERGE INTO
ASSET.TBL_ASSET_NET_DETAIL AS E USING (VALUES (%';

once again?

Reply With Quote
  #12  
Old   
Michel Esber
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 02-25-2011 , 06:29 PM






Quote:
with

WHEN MATCHED
*AND NOT ((E.MACHINE_HOSTNAME = INDATA.MACHINE_HOSTNAME AND
* * * * * *E.MACHINE_IPDOMAIN = INDATA.MACHINE_IPDOMAIN AND
* * * * * *E.LOGIN_NAME = INDATA.LOGIN_NAME AND
* * * * * *E.LAN_GROUP_TYPE = INDATA.LAN_GROUP_TYPE AND
* * * * * *E.LAN_GROUP_NAME = INDATA.LAN_GROUP_NAME AND
* * * * * *E.FIRE_LOGIN_TRIGGERS = INDATA.FIRE_LOGIN_TRIGGERS))
THEN

do a testrun and execute:

select NUM_EXECUTIONS, ROWS_READ, POOL_DATA_L_READS, POOL_DATA_P_READS,
* * * *POOL_INDEX_L_READS, POOL_INDEX_P_READS, TOTAL_EXEC_TIME,
* * * *TOTAL_USR_CPU_TIME
from sysibmadm.snapdyn_sql
where cast(stmt_text as varchar(2000)) like 'MERGE INTO
ASSET.TBL_ASSET_NET_DETAIL AS E USING (VALUES (%';

once again?

Hi Lennart,

We have updated our application to use the NOT (A and B) instead of (A
or B), and we will send it to production servers very soon.

Also, we have opened a service request with IBM to help troubleshoot
the problem.

I will post more info about this very soon.

Thanks again,

Reply With Quote
  #13  
Old   
Michel Esber
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 02-28-2011 , 11:42 AM



Quote:
We have updated our application to use the NOT (A and B) instead of (A
or B), and we will send it to production servers very soon.
Hi,

I have updated the application and here's the new info:

NUM_EXECUTIONS ROWS_READ POOL_DATA_L_READS
POOL_DATA_P_READS POOL_INDEX_L_READS POOL_INDEX_P_READS
TOTAL_EXEC_TIME TOTAL_USR_CPU_TIME
-------------------- -------------------- --------------------
-------------------- -------------------- --------------------
-------------------- --------------------
612 3414
3814 178 17019
125 51 0

Something I have noticed in the new access plan (which is similar to
the previous one I posted here):

Plan Details:
-------------


1) RETURN: (Return Result)
Cumulative Total Cost: 102.029
Cumulative CPU Cost: 117586
Cumulative I/O Cost: 4.08001
Cumulative Re-Total Cost: 27.0102
Cumulative Re-CPU Cost: 42551.5
Cumulative Re-I/O Cost: 1.08
Cumulative First Row Cost: 102.029
Estimated Bufferpool Buffers: 4.08001


Cumulative Total Cost: 102
Cumulative CPU cost: 117586 ...

Any thoughts?

Reply With Quote
  #14  
Old   
Serge Rielau
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 02-28-2011 , 01:20 PM



Michael,

The plan for this MERGE statement looks squeaky clean as far as I can tell.

Sometimes people run into issues with MEREG because
a) DB2 can't prove that only one row in teh USINg mathes any given row
in the target.. You would see an ugly stack of operators above the
bottom join and the join with the UNION ALL in that case

b) DB2 decides to do HSJOIN which scans and U-locks the target table

c) Temps to avoid Halloween

You plan has none of these.
I do not believe this to be a compiler/optimizer problem.

Could it be that your updates kick of some nasty index maintenance?
Home many indexes on the table. (Doesn't explain maxing out CPU though)

Cheers
Serge


--
Serge Rielau
SQL Architect DB2 for LUW, IBM Toronto Lab
Blog: tinyurl.com/SQLTips4DB2
Wiki: tinyurl.com/Oracle2DB2Wiki
Twitter: srielau

Reply With Quote
  #15  
Old   
Sanjuro
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 02-28-2011 , 02:34 PM



Try using the following statement as the RUNSTATS for
ASSET.TBL_ASSET_NET_DETAIL

RUNSTATS ON TABLE ASSET.TBL_ASSET_NET_DETAIL ON ALL COLUMNS WITH
DISTRIBUTION AND DETAILED INDEXES ALL ;

See if this changes the plan. If it has the desired effect, I would
expect the query cost to go up slightly, but the actual CPU usage
might go down.

Keep us posted on the final findings.

Cheers -
Sanjuro

On Feb 28, 12:42*pm, Michel Esber <smes... (AT) gmail (DOT) com> wrote:
Quote:
We have updated our application to use the NOT (A and B) instead of (A
or B), and we will send it to production servers very soon.

Hi,

I have updated the application and here's the new info:

NUM_EXECUTIONS * * * ROWS_READ * * * * * *POOL_DATA_L_READS
POOL_DATA_P_READS * *POOL_INDEX_L_READS * POOL_INDEX_P_READS
TOTAL_EXEC_TIME * * *TOTAL_USR_CPU_TIME
-------------------- -------------------- --------------------
-------------------- -------------------- --------------------
-------------------- --------------------
* * * * * * * * *612 * * * * * * * * 3414
3814 * * * * * * * * *178 * * * * * * * *17019
125 * * * * * * * * * 51 * * * * * * * * * *0

Something I have noticed in the new access plan (which is similar to
the previous one I posted here):

Plan Details:
-------------

* * * * 1) RETURN: (Return Result)
* * * * * * * * Cumulative Total Cost: * * * * *102.029
* * * * * * * * Cumulative CPU Cost: * * * * * *117586
* * * * * * * * Cumulative I/O Cost: * * * * * *4.08001
* * * * * * * * Cumulative Re-Total Cost: * * * 27.0102
* * * * * * * * Cumulative Re-CPU Cost: * * * * 42551.5
* * * * * * * * Cumulative Re-I/O Cost: * * * * 1..08
* * * * * * * * Cumulative First Row Cost: * * *102..029
* * * * * * * * Estimated Bufferpool Buffers: * 4.08001

Cumulative Total Cost: 102
Cumulative CPU cost: 117586 ...

Any thoughts?

Reply With Quote
  #16  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 02-28-2011 , 02:37 PM



On 2011-02-28 18:42, Michel Esber wrote:
Quote:
We have updated our application to use the NOT (A and B) instead of (A
or B), and we will send it to production servers very soon.

Hi,

I have updated the application and here's the new info:

NUM_EXECUTIONS ROWS_READ POOL_DATA_L_READS
POOL_DATA_P_READS POOL_INDEX_L_READS POOL_INDEX_P_READS
TOTAL_EXEC_TIME TOTAL_USR_CPU_TIME
-------------------- -------------------- --------------------
-------------------- -------------------- --------------------
-------------------- --------------------
612 3414
3814 178 17019
125 51 0

Something I have noticed in the new access plan (which is similar to
the previous one I posted here):

Plan Details:
-------------


1) RETURN: (Return Result)
Cumulative Total Cost: 102.029
Cumulative CPU Cost: 117586
Cumulative I/O Cost: 4.08001
Cumulative Re-Total Cost: 27.0102
Cumulative Re-CPU Cost: 42551.5
Cumulative Re-I/O Cost: 1.08
Cumulative First Row Cost: 102.029
Estimated Bufferpool Buffers: 4.08001


Cumulative Total Cost: 102
Cumulative CPU cost: 117586 ...

Any thoughts?
Do you still experience that the stmt is hogging your cpu? Looking at
the snapshot data we now have 6 rows_read per execution compared to 340
before. Logical data reads went from 341 to 6 and logical index reads
from 2000 to 28. Also total_exec_time went from 1.54 to 0.08, so I would
have guessed that the statement is more cpu friendly by now?

/Lennart

Reply With Quote
  #17  
Old   
Michel Esber
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 03-02-2011 , 03:46 PM



Quote:
Do you still experience that the stmt is hogging your cpu? Looking at
the snapshot data we now have 6 rows_read per execution compared to 340
before. Logical data reads went from 341 to 6 and logical index reads
from 2000 to 28. Also total_exec_time went from 1.54 to 0.08, so I would
have guessed that the statement is more cpu friendly by now?
Hi Lennart,

Not exactly. I still get CPU ~100%.

I've received a reply from IBM. Just thought I'd share it with the
group:

"
Plan for the query looks good. I see no issues with the plan itself.

Going back and looking @ the data from last Friday, vmstat shows tons
of threads in run queue and blocked queue.

There are 4 merges and 1 grant on table TBL_ASSET_NET_DETAIL seen in
the dynamic sql snapshots. None of those are using a lot of CPU.

There are a total of 322 applications. Of those 120 are in Compiling
state, while 75 executing, 29 in disconnect pending. Rest in uow-
waiting or connect completed.

The 120 compiling apps do show some CPU usage.

Looking @ the stacks collected for all the edus from that time,

(i) I see a set of stacks in latch contention primarily
when allocating memory and during compilation. This may explains the
apps in compiling state.

0x00000035F46BA057 __sched_yield + 0x0007
0x00002AAAABF8DF55 sqloSpinLockConflict + 0x0159
0x00002AAAABFA6C1C _ZN7SMemSet12captureLatchEv + 0x00ba
0x00002AAAABF9B6E8 sqloCreateMemoryPoolEx + 0x0254
0x00002AAAABDE05C7
_Z10sqlnn_cmplP8sqeAgentP11sqlrrstrings17sqlnn_com pileModesP14sqlrr_cmpl
_enviiPP9sqlnq_qur + 0x040d

0x00002AAAABF8DF55 sqloSpinLockConflict + 0x0159
0x00002AAAABFA6C1C _ZN7SMemSet12captureLatchEv + 0x00ba
0x00002AAAABFABD30
_ZN12SMemBasePool19getNewChunkSubgroupEmjmP13SQLO_ MEM_POOLP12SMemLogEven
tPP17SqloChunkSubgroupbPP11SRawChunkCB
+ 0x016c
0x00002AAAADE63FA9
_ZN13SQLO_MEM_POOL19allocateMemoryBlockEmmmjmPPvPP 17SqloChunkSubgroupPP1
2SMemLostNodeP12SMemLogEvent + 0x029d
0x00002AAAADE61B1F sqlogmblkEx + 0x0461
0x00002AAAACB8105F _Z15sqlno_alloc_aubP13sqlno_globalsP10sqlno_apcbm
+
0x00d5
0x00002AAAACB809AD _Z15sqlno_grow_poolP13sqlno_globalsP10sqlno_apcbmi
+
0x0071

(ii) I see various EDUs in the OS system calls.

0x00000035F46CFF7A mmap64 + 0x000a
0x00000035F4660B3B _IO_file_doallocate + 0x007b
0x00000035F466D53C _IO_doallocbuf + 0x002c
0x00000035F466CC50 _IO_file_underflow + 0x0180
0x00000035F466D57E _IO_default_uflow + 0x000e
0x00000035F4668E3B getc + 0x00ab
0x00002AAAABF994A0 sqloGetCPUUsage + 0x031e
0x00002AAAAC92F0BA
_ZN10sqlmon_acb16agent_stmt_startEjPKhS1_S1_tjtttj jiP8sqlrr_cb +
0x0746



I suspect the threads in the OS system call is what is causing the
slowdown and threads in run queue and block queue.

Please ask the customer to test the following :-

1. unset [i] DB2_SYSTEM_MONITOR_SETTINGS=NULL
2. Disable statement monitor switch
Statement (DFT_MON_STMT) = ON

db2set DB2_SYSTEM_MONITOR_SETTINGS=
db2 update dbm cfg using DFT_MON_STMT OFF
db2 terminate
db2stop
db2start
Run the workload and let us know the results.

"

I did follow the instructions, but the CPU usage was still around
100% ...

Reply With Quote
  #18  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 03-02-2011 , 11:28 PM



On 2011-03-02 22:46, Michel Esber wrote:
Quote:
Do you still experience that the stmt is hogging your cpu? Looking at
the snapshot data we now have 6 rows_read per execution compared to 340
before. Logical data reads went from 341 to 6 and logical index reads
from 2000 to 28. Also total_exec_time went from 1.54 to 0.08, so I would
have guessed that the statement is more cpu friendly by now?

Hi Lennart,

Not exactly. I still get CPU ~100%.

I've received a reply from IBM. Just thought I'd share it with the
group:

Michael, thanks for sharing the info. If you manage to solve the
problem, it would be interesting to hear about it.


/Lennart

[...]

Reply With Quote
  #19  
Old   
whatever
 
Posts: n/a

Default Re: MERGE statement consuming all available CPU - 03-03-2011 , 08:18 AM



On Mar 3, 12:28*am, Lennart Jonsson <erik.lennart.jons... (AT) gmail (DOT) com>
wrote:
Quote:
On 2011-03-02 22:46, Michel Esber wrote:

Do you still experience that the stmt is hogging your cpu? Looking at
the snapshot data we now have 6 rows_read per execution compared to 340
before. Logical data reads went from 341 to 6 and logical index reads
from 2000 to 28. Also total_exec_time went from 1.54 to 0.08, so I would
have guessed that the statement is more cpu friendly by now?

Hi Lennart,

Not exactly. I still get CPU ~100%.

I've received a reply from IBM. Just thought I'd share it with the
group:

Michael, thanks for sharing the info. If you manage to solve the
problem, it would be interesting to hear about it.

/Lennart

[...]
Another Point to note....If you have that Many applications in
COMPILING State ...means....you need to Look at pckcachesz parameter.
Is that set to AUTOMATIC ?

The following monitor elements can help you determine whether you
should adjust this configuration parameter:
pkg_cache_lookups (package cache lookups)
pkg_cache_inserts (package cache inserts)
pkg_cache_size_top (package cache high water mark)
pkg_cache_num_overflows (package cache overflows)

Cheers...
Shashi Mannepalli

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.