dbTalk Databases Forums  

How to determine wait time?

comp.databases.ibm-db2 comp.databases.ibm-db2


Discuss How to determine wait time? in the comp.databases.ibm-db2 forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
oddbande
 
Posts: n/a

Default How to determine wait time? - 03-21-2011 , 06:19 AM






We have a stored procedure that seems to have poor performance. In
order to find out what is happening we
have used the monitor function MON_GET_PKG_CACHE_STMT_DETAILS to find
out the details. And what puzzles us is that there seems to be a lot
of wait time, but we are not able to find out where the wait time is
spent.

An example:
total_act_time = 5085
total_act_wait_time: 5051

So it seems to us that the procedure spends 99% of the time waiting.
But when we add up all the other
output elements concerned with wait it is like this:

fcm_tq_recv_wait_time = fcm_message_recv_wait_time =
fcm_tq_send_wait_time = fcm_message_send_wait_time =
lock_wait_time = log_buffer_wait_time = log_disk_wait_time =
audit_file_write_wait_time =
audit_subsystem_wait_time = diaglog_write_wait_time =
fcm_send_wait_time = fcm_recv_wait_time = 0

All off them are 0.
So where do the value for total_act_wait_time come from? And how can
we find out where the time is spent?

We are using DB2 9.7 on z/Linux.

Regards
Odd B Andersen, EDB ErgoGroup AS, Oslo, Norway

Reply With Quote
  #2  
Old   
Serge Rielau
 
Posts: n/a

Default Re: How to determine wait time? - 03-21-2011 , 03:40 PM






Comments from backstage:

Not all the wait times are being added below (missing IOs) so not
necessarily an error. I don't think he's using an appropriate interface
tho to diagnose stored procedure slowdown. The metrics reported for a
given section in the package cache only describe the processing of that
section - they are not the aggregates across all nested sections
(assuming the procedure he's investigating is issuing SQL statements).

This topic in the info center (in particular Figure 3) describes the
hierarchy of time-spent metrics available for statements, and also
indicates that the wait times reported do not include the wait times of
nested statements (see the final "Other" bullet point in the table):

http://publib.boulder.ibm.com/infoce.../c0055434.html


Something like the following topic in the info center may be more useful.

http://publib.boulder.ibm.com/infoce.../c0058761.html

This topic describes using an activity event monitor to capture
information about each stament a procedure executes and then aggregates
elapsed time across all children. Using this you could first identify
which statement(s) are taking the most time in the procedure. Then could
drill down on a given statement by feeding the statement metrics (in the
details_xml column of the activity table in the activity event monitor)
into the MON_FORMAT_XML_TIMES_BY_ROW table function to see a breakdown
of where that statement is spending it's time (could also choose to
explain the statements that are taking the most time).
---------
Also if you follow teh link in my footer (Oracle2DB2Wiki) you will find
a profiler which you may find useful.

--
Serge Rielau
SQL Architect DB2 for LUW, IBM Toronto Lab
Blog: tinyurl.com/SQLTips4DB2
Wiki: tinyurl.com/Oracle2DB2Wiki
Twitter: srielau

Reply With Quote
  #3  
Old   
oddbande
 
Posts: n/a

Default Re: How to determine wait time? - 03-22-2011 , 02:25 AM



On Mar 21, 10:40*pm, Serge Rielau <srie... (AT) ca (DOT) ibm.com> wrote:
Quote:
Comments from backstage:

Not all the wait times are being added below (missing IOs) so not
necessarily an error. I don't think he's using an appropriate interface
tho to diagnose stored procedure slowdown. The metrics reported for a
given section in the package cache only describe the processing of that
section - they are not the aggregates across all nested sections
(assuming the procedure he's investigating is issuing SQL statements).

This topic in the info center (in particular Figure 3) describes the
hierarchy of time-spent metrics available for statements, and also
indicates that the wait times reported do not include the wait times of
nested statements (see the final "Other" bullet point in the table):

http://publib.boulder.ibm.com/infoce...dex.jsp?topic=....

Something like the following topic in the info center may be more useful.

http://publib.boulder.ibm.com/infoce...dex.jsp?topic=....

This topic describes using an activity event monitor to capture
information about each stament a procedure executes and then aggregates
elapsed time across all children. Using this you could first identify
which statement(s) are taking the most time in the procedure. Then could
drill down on a given statement by feeding the statement metrics (in the
details_xml column of the activity table in the activity event monitor)
into the MON_FORMAT_XML_TIMES_BY_ROW table function to see a breakdown
of where that statement is spending it's time (could also choose to
explain the statements that are taking the most time).
---------
Also if you follow teh link in my footer (Oracle2DB2Wiki) you will find
a profiler which you may find useful.

--
Serge Rielau
SQL Architect DB2 for LUW, IBM Toronto Lab
Blog: * *tinyurl.com/SQLTips4DB2
Wiki: * *tinyurl.com/Oracle2DB2Wiki
Twitter: srielau
Thank you!
A very useful response (I hope :-))
I will look into this further and maybe find out more.

I also forgot to mention that this is not particular for one stored
procedure - all of them wriiten in SQL PL. There seems to be a very
high precentage wait time for all of them.

Regards
Odd B Andersen

Reply With Quote
  #4  
Old   
oddbande
 
Posts: n/a

Default Re: How to determine wait time? - 03-24-2011 , 12:55 PM



On 22 Mar, 09:25, oddbande <odd.ander... (AT) ergo (DOT) no> wrote:
Quote:
On Mar 21, 10:40*pm, Serge Rielau <srie... (AT) ca (DOT) ibm.com> wrote:





Comments from backstage:

Not all the wait times are being added below (missing IOs) so not
necessarily an error. I don't think he's using an appropriate interface
tho to diagnose stored procedure slowdown. The metrics reported for a
given section in the package cache only describe the processing of that
section - they are not the aggregates across all nested sections
(assuming the procedure he's investigating is issuing SQL statements).

This topic in the info center (in particular Figure 3) describes the
hierarchy of time-spent metrics available for statements, and also
indicates that the wait times reported do not include the wait times of
nested statements (see the final "Other" bullet point in the table):

http://publib.boulder.ibm.com/infoce...dex.jsp?topic=....

Something like the following topic in the info center may be more useful.

http://publib.boulder.ibm.com/infoce...dex.jsp?topic=....

This topic describes using an activity event monitor to capture
information about each stament a procedure executes and then aggregates
elapsed time across all children. Using this you could first identify
which statement(s) are taking the most time in the procedure. Then could
drill down on a given statement by feeding the statement metrics (in the
details_xml column of the activity table in the activity event monitor)
into the MON_FORMAT_XML_TIMES_BY_ROW table function to see a breakdown
of where that statement is spending it's time (could also choose to
explain the statements that are taking the most time).
---------
Also if you follow teh link in my footer (Oracle2DB2Wiki) you will find
a profiler which you may find useful.

--
Serge Rielau
SQL Architect DB2 for LUW, IBM Toronto Lab
Blog: * *tinyurl.com/SQLTips4DB2
Wiki: * *tinyurl.com/Oracle2DB2Wiki
Twitter: srielau

Thank you!
A very useful response (I hope :-))
I will look into this further and maybe find out more.

I also forgot to mention that this is not particular for one stored
procedure - all of them wriiten in SQL PL. There seems to be a very
high precentage wait time for all of them.

Regards
Odd B Andersen
I now have been told that pool_read/write_time was included in
wait_time. Sorry for not remembering that :-)

Most of the wait_time is just pool_read_time (97%). That explains why
wait_time is so high, but then I ask myself - why is pool_read_time so
high? I assume this is syncron I/O. And I have to look into it more in
detail now if there are some settings that need to be changed.

We have turned on AUTOMATIC wherever possible, including all
bufferpools, and I assume that this is alright. But I wonder if one
could have too many bufferpools, and if that could result in too much
syncron I/O? We have 40 bufferpools, and 454 user-defined tablespaces.
And if there is a limited amount of memory (12 GB in total) each
bufferpool is fairly small and there will be almost continuous reading-
in and writing-out of data. The database is 315 GB (test) and 1,3 TB
(production).

Just some thoughts on my part :-) But what I would like is some
suggestions on where to start looking?

Regards
Odd B Andersen

Reply With Quote
  #5  
Old   
Ian
 
Posts: n/a

Default Re: How to determine wait time? - 03-24-2011 , 07:17 PM



On Mar 24, 11:55*am, oddbande <odd.ander... (AT) ergo (DOT) no> wrote:


Quote:
I now have been told that pool_read/write_time was included in
wait_time. Sorry for not remembering that :-)

Most of the wait_time is just pool_read_time (97%). That explains why
wait_time is so high, but then I ask myself - why is pool_read_time so
high? I assume this is syncron I/O. And I have to look into it more in
detail now if there are some settings that need to be changed.

We have turned on AUTOMATIC wherever possible, including all
bufferpools, and I assume that this is alright. But I wonder if one
could have too many bufferpools, and if that could result in too much
syncron I/O? We have 40 bufferpools, and 454 user-defined tablespaces.
And if there is a limited amount of memory (12 GB in total) each
bufferpool is fairly small and there will be almost continuous reading-
in and writing-out of data. The database is 315 GB (test) and 1,3 TB
(production).

Just some thoughts on my part :-) But what I would like is some
suggestions on where to start looking?
You're spending 5.051 seconds waiting for physical I/O out of a total
of 5.085 seconds.

What are your average physical page read times? What is the
bufferpool hit ratio for the bufferpools affected by this stored
procedure?

I would hope that physical IOs are being serviced in <= 2 ms – which
would mean that you're doing ~2,500 physical IOs. If you have higher
IO service times (> 8ms) perhaps there is something wrong with the
disk subsystem?

Why do you have so many bufferpools? 40 sounds a little crazy to me
-- I tend to have 1 bufferpool per database page size, and make
exceptions on that depending on the workload. It's probably
reasonable to have 5-7 bufferpools, but I have a hard time imagining a
situation where 40 is necessary. As you say, you end up with a lot of
small bufferpools, which may be more trouble than it's worth. STMM
won't be your friend in a situation like this, either.

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.