dbTalk Databases Forums  

Identify Optimizer issues

comp.databases.informix comp.databases.informix


Discuss Identify Optimizer issues in the comp.databases.informix forum.



Reply
 
Thread Tools Search this Thread Display Modes
  #1  
Old   
mohitanchlia@gmail.com
 
Posts: n/a

Default Identify Optimizer issues - 01-02-2008 , 11:45 AM






Version: IDS 10

Recently I saw some inconsistencies in optimizer taking incorrect
index path. This Bug has been reported. Now I am attempting to do the
following:

1. Write perl script to get all the "Select" from the code. This seems
to be easy.
2. Execute the select statement with explain on.
3. Parse the results in sqexplain file to determine if it took correct
index path. Basically, for large tables it should take index path if
head of the index is part of the where clause. Is there a good way of
doing this ? Has anybody done this before ? I am trying to proactively
identify those queries that has problems, so that we can use Index
directives, if required, while bug is being fixed.

Reply With Quote
  #2  
Old   
Art S. Kagel (Oninit LLC)
 
Posts: n/a

Default Re: Identify Optimizer issues - 01-02-2008 , 12:16 PM






mohitanchlia (AT) gmail (DOT) com wrote:
Quote:
Version: IDS 10

Platform and exact version/patch level are helpful.
Recently I saw some inconsistencies in optimizer taking incorrect
index path. This Bug has been reported. Now I am attempting to do the
following:

1. Write perl script to get all the "Select" from the code. This seems
to be easy.
2. Execute the select statement with explain on.
3. Parse the results in sqexplain file to determine if it took correct
index path. Basically, for large tables it should take index path if
head of the index is part of the where clause. Is there a good way of
doing this ? Has anybody done this before ? I am trying to proactively
identify those queries that has problems, so that we can use Index
directives, if required, while bug is being fixed.

First question: Are the data distributions up-to-date? How were they
gathered (ie what tool or sequence of UPDATE STATISTICS commands did you
use to create the distributions?

Second question: What settings are in the ONCONFIG file for OPTCOMPIND
and OPTGOAL?

Third question: Do the sessions/apps execut SET OPTIMIZATION [LOW|HIGH],
[ALL_ROWS|FIRST_ROWS]?

General comment, IDS feels free to decide if using another index will
improve performance over using the obvious one. If you have a join
between the large table and another on an index leading column and a
filter on a different index leading column the engine may decide to use
the latter if its filter value is greater then the filter value of
joining to the second table at eliminating rows from the result set.

Art S. Kagel
Oninit



Reply With Quote
  #3  
Old   
mohitanchlia@gmail.com
 
Posts: n/a

Default Re: Identify Optimizer issues - 01-02-2008 , 12:49 PM



On Jan 2, 9:16*am, "Art S. Kagel (Oninit LLC)" <a... (AT) oninit (DOT) com>
wrote:
Quote:
mohitanch... (AT) gmail (DOT) com wrote:
Version: IDS 10

Platform and exact version/patch level are helpful.

Recently I saw some inconsistencies in optimizer taking incorrect
index path. This Bug has been reported. Now I am attempting to do the
following:

1. Write perl script to get all the "Select" from the code. This seems
to be easy.
2. Execute the select statement with explain on.
3. Parse the results in sqexplain file to determine if it took correct
index path. Basically, for large tables it should take index path if
head of the index is part of the where clause. Is there a good way of
doing this ? Has anybody done this before ? I am trying to proactively
identify those queries that has problems, so that we can use Index
directives, if required, while bug is being fixed.

First question: *Are the data distributions up-to-date? *How were they
gathered (ie what tool or sequence of UPDATE STATISTICS commands did you
use to create the distributions?

Second question: What settings are in the ONCONFIG file for OPTCOMPIND
and OPTGOAL?

Third question: Do the sessions/apps execut SET OPTIMIZATION [LOW|HIGH],
[ALL_ROWS|FIRST_ROWS]?

General comment, IDS feels free to decide if using another index will
improve performance over using the obvious one. *If you have a join
between the large table and another on an index leading column and a
filter on a different index leading column the engine may decide to use
the latter if its filter value is greater then the filter value of
joining to the second table at eliminating rows from the result set.

Art S. Kagel
Oninit
Yes I evaluated all the options that you asked for and also saw the
difference in the execution time of SQL.

OPTCOMPIND is set to 0 so that it chooses Index path, OPT_GOAL is set
to -1 to return ALL_ROWS. Update Statistics were run before running
the SQL. Update stats are run as high for all the Indexes that are
head of index.

Sessions don't execute SET OPTIMIZATION statement.


Reply With Quote
  #4  
Old   
Art S. Kagel (Oninit LLC)
 
Posts: n/a

Default Re: Identify Optimizer issues - 01-02-2008 , 01:07 PM



mohitanchlia (AT) gmail (DOT) com wrote:
Quote:
On Jan 2, 9:16 am, "Art S. Kagel (Oninit LLC)" <a... (AT) oninit (DOT) com
wrote:

mohitanch... (AT) gmail (DOT) com wrote:

Version: IDS 10

Platform and exact version/patch level are helpful.


Recently I saw some inconsistencies in optimizer taking incorrect
index path. This Bug has been reported. Now I am attempting to do the
following:

1. Write perl script to get all the "Select" from the code. This seems
to be easy.
2. Execute the select statement with explain on.
3. Parse the results in sqexplain file to determine if it took correct
index path. Basically, for large tables it should take index path if
head of the index is part of the where clause. Is there a good way of
doing this ? Has anybody done this before ? I am trying to proactively
identify those queries that has problems, so that we can use Index
directives, if required, while bug is being fixed.

First question: Are the data distributions up-to-date? How were they
gathered (ie what tool or sequence of UPDATE STATISTICS commands did you
use to create the distributions?

Second question: What settings are in the ONCONFIG file for OPTCOMPIND
and OPTGOAL?

Third question: Do the sessions/apps execut SET OPTIMIZATION [LOW|HIGH],
[ALL_ROWS|FIRST_ROWS]?

General comment, IDS feels free to decide if using another index will
improve performance over using the obvious one. If you have a join
between the large table and another on an index leading column and a
filter on a different index leading column the engine may decide to use
the latter if its filter value is greater then the filter value of
joining to the second table at eliminating rows from the result set.

Art S. Kagel
Oninit


Yes I evaluated all the options that you asked for and also saw the
difference in the execution time of SQL.

OPTCOMPIND is set to 0 so that it chooses Index path, OPT_GOAL is set
to -1 to return ALL_ROWS. Update Statistics were run before running
the SQL. Update stats are run as high for all the Indexes that are
head of index.

OPTCOMPIND 2 is a better setting for servers that mainly handle simple
queries where only a small percentage of the data in larger tables is
typically retrieved, but that shouldn't result in the wrong index being
selected.

Just updating HIGH stats on only the lead columns of indexes (I assume
that's what you mean above) may not be sufficient, especially if there
are multiple indexes beginning with each column. VERY IMPORTANT! Did
you also run UPDATE STATISTICS LOW on the full key of EVERY index? If
not, the optimizer may be using the low level stats from some index's
original creation time which will always indicate that the index has
only one level and one node, making that index VERY attractive. If all
of your indexes don't have their LOW stats updated then they're all
equal in the optimizer's eyes and it will simply select the first one
that is useful in the order they were created! And this reliance on the
low level stats is used to resolve ambiguities if two indexes have
similar filter values, so if you have multiple indexes starting with the
same column, or several indexes that based only on the distributions of
their lead columns are of mearly equal value for filtering, and you
don't have at least MEDIUM level stats on the remaining keys in the
index, the optimizer will give great weight to the LOW level stats (and
there's no record before IDS 11.10 of how old or out-of-date those stats
are). I suspect that this is your problem here.

If you don't have my dostats utility, consider getting and using it,
otherwise, read the Performance Guide and John Miller's white paper on
Update Statistics and run the full recommended suite of commands in
those documents as dostats does. Then see what happens. Especially if
you look at the SET EXPLAIN output and the row count estimate there is
off by more than a trivial amount. Also you might post the queries that
are giving you trouble we may be able to help reformat the SQL so that
the optimizer does a better job with it.

Dostats is included in the package, utils2_ak, available for download
from the IIUG Software Repository.

Quote:
Sessions don't execute SET OPTIMIZATION statement.

Art S. Kagel
Oninit



Reply With Quote
  #5  
Old   
mohitanchlia@gmail.com
 
Posts: n/a

Default Re: Identify Optimizer issues - 01-02-2008 , 02:22 PM



On Jan 2, 10:07*am, "Art S. Kagel (Oninit LLC)" <a... (AT) oninit (DOT) com>
wrote:
Quote:
mohitanch... (AT) gmail (DOT) com wrote:
On Jan 2, 9:16 am, "Art S. Kagel (Oninit LLC)" <a... (AT) oninit (DOT) com
wrote:

mohitanch... (AT) gmail (DOT) com wrote:

Version: IDS 10

Platform and exact version/patch level are helpful.

Recently I saw some inconsistencies in optimizer taking incorrect
index path. This Bug has been reported. Now I am attempting to do the
following:

1. Write perl script to get all the "Select" from the code. This seems
to be easy.
2. Execute the select statement with explain on.
3. Parse the results in sqexplain file to determine if it took correct
index path. Basically, for large tables it should take index path if
head of the index is part of the where clause. Is there a good way of
doing this ? Has anybody done this before ? I am trying to proactively
identify those queries that has problems, so that we can use Index
directives, if required, while bug is being fixed.

First question: *Are the data distributions up-to-date? *How were they
gathered (ie what tool or sequence of UPDATE STATISTICS commands did you
use to create the distributions?

Second question: What settings are in the ONCONFIG file for OPTCOMPIND
and OPTGOAL?

Third question: Do the sessions/apps execut SET OPTIMIZATION [LOW|HIGH],
[ALL_ROWS|FIRST_ROWS]?

General comment, IDS feels free to decide if using another index will
improve performance over using the obvious one. *If you have a join
between the large table and another on an index leading column and a
filter on a different index leading column the engine may decide to use
the latter if its filter value is greater then the filter value of
joining to the second table at eliminating rows from the result set.

Art S. Kagel
Oninit

Yes I evaluated all the options that you asked for and also saw the
difference in the execution time of SQL.

OPTCOMPIND is set to 0 so that it chooses Index path, OPT_GOAL is set
to -1 to return ALL_ROWS. Update Statistics were run before running
the SQL. Update stats are run as high for all the Indexes that are
head of index.

OPTCOMPIND 2 is a better setting for servers that mainly handle simple
queries where only a small percentage of the data in larger tables is
typically retrieved, but that shouldn't result in the wrong index being
selected.

Just updating HIGH stats on only the lead columns of indexes (I assume
that's what you mean above) may not be sufficient, especially if there
are multiple indexes beginning with each column. *VERY IMPORTANT! *Did
you also run UPDATE STATISTICS LOW on the full key of EVERY index? *If
not, the optimizer may be using the low level stats from some index's
original creation time which will always indicate that the index has
only one level and one node, making that index VERY attractive. *If all
of your indexes don't have their LOW stats updated then they're all
equal in the optimizer's eyes and it will simply select the first one
that is useful in the order they were created! *And this reliance on the
low level stats is used to resolve ambiguities if two indexes have
similar filter values, so if you have multiple indexes starting with the
same column, or several indexes that based only on the distributions of
their lead columns are of mearly equal value for filtering, and you
don't have at least MEDIUM level stats on the remaining keys in the
index, the optimizer will give great weight to the LOW level stats (and
there's no record before IDS 11.10 of how old or out-of-date those stats
are). *I suspect that this is your problem here.

If you don't have my dostats utility, consider getting and using it,
otherwise, read the Performance Guide and John Miller's white paper on
Update Statistics and run the full recommended suite of commands in
those documents as dostats does. *Then see what happens. *Especially if
you look at the SET EXPLAIN output and the row count estimate there is
off by more than a trivial amount. *Also you might post the queries that
are giving you trouble we may be able to help reformat the SQL so that
the optimizer does a better job with it.

Dostats is included in the package, utils2_ak, available for download
from the IIUG Software Repository.

Sessions don't execute SET OPTIMIZATION statement.

Art S. Kagel
Oninit- Hide quoted text -

- Show quoted text -
I am using dostats utility to generate update stats commands. Still I
see inconsistency, and all I am trying to do is develop an automated
mechanism that will let me identify the queries that are taking wrong
path. This will help me in looking at different things more closely
and also help me in identifying if it's because of distributions or if
it's a bug or sql itself. Is there a way I can parse sqexplain output
or some other mechanism that will help me in identifying those queries
that are not taking correct index path. It may not necessarily be a
problem or a bug, it may just be because of distributions or sql
itself but atleast it will tell me that something need to be looked
at. From my original posting, this is what I am trying to do:

1. Write perl script to get all the "Select" from the code. This seems
to be easy.
2. Execute the select statement with explain on.
3. Parse the results in sqexplain file to determine if it took correct
index path. Basically, for large tables it should take index path if
head of the index is part of the where clause. Is there a good way
of doing this ? Has anybody done this before ? I am trying to
proactively
identify those queries that has problems, so that we can use Index
directives, if required, while bug is being fixed.

Is it doable ? Is it bad idea ?


Reply With Quote
  #6  
Old   
Jonathan Leffler
 
Posts: n/a

Default Re: Identify Optimizer issues - 01-03-2008 , 03:38 AM



mohitanchlia (AT) gmail (DOT) com wrote:
Quote:
Version: IDS 10

Recently I saw some inconsistencies in optimizer taking incorrect
index path. This Bug has been reported. Now I am attempting to do the
following:

1. Write perl script to get all the "Select" from the code. This seems
to be easy.
2. Execute the select statement with explain on.
3. Parse the results in sqexplain file to determine if it took correct
index path. Basically, for large tables it should take index path if
head of the index is part of the where clause. Is there a good way of
doing this ? Has anybody done this before ? I am trying to proactively
identify those queries that has problems, so that we can use Index
directives, if required, while bug is being fixed.
I've seen Art's august discussions, but you cycled around to reiterating
your quesstion, so I'll start answering the original question again...

Step 1 is easy until and unless you start to deal with dynamically
constructed queries - for example, where the user specifies search
criteria to you, as in the CONSTRUCT statement in I4GL. Then you cannot
predict what the SQL will look like. (In the days of yore, I wrote code
that analyzed the output from CONSTRUCT and included more or fewer
tables in the FROM clause and join conditions in the WHERE clause
depending on the search criteria specified by the user.) Also,
depending on the language you are looking at, it may be more or less
easy to detect the end of the SQL statement. Finally, if your
applications generate temp tables for intermediate results, then
reproducing the contents of those accurately can be hard. (Note that
you can run UPDATE STATISTICS on a temp table, and it can be a good idea
to do so - though the latest versions of IDS do so automatically,
speaking a little loosely.)

Given the validity of Step 1 and the generated list of SQL, Step 2 is
doable.

Step 3 is the hard part - how do you determine whether it took the
correct index path? Presumably, you'll need to analyze the tables in
each query (which itself may not be trivial), then determine the
available indexes, and decide which of those indexes gives the biggest
bang for the buck. If you're not careful, you end up with a
non-negligible portion of an SQL optimizer!

So, it 'can' be done: IDS does it all the time. It is non-trivial.

--
Jonathan Leffler #include <disclaimer.h>
Email: jleffler (AT) earthlink (DOT) net, jleffler (AT) us (DOT) ibm.com
Guardian of DBD::Informix v2007.0914 -- http://dbi.perl.org/

publictimestamp.org/ptb/PTB-2182 haval 2008-01-03 06:00:04
C70CF9C9C126042A89BA3C218E19F609


Reply With Quote
Reply




Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.