dbTalk Databases Forums  

CBO cost and avg_row_len

comp.databases.oracle.server comp.databases.oracle.server


Discuss CBO cost and avg_row_len in the comp.databases.oracle.server forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
dias
 
Posts: n/a

Default CBO cost and avg_row_len - 07-09-2003 , 10:34 AM






Hi,

Can someone explain when the CBO takes into account the value of
avg_row_len to estimate an access plan cost ?

I suppose that the CBO uses this value to estimate the cost for a
table scan, in combination with the number of blocks and the value of
db_file_multiblock_read_count ?

Thanks

Reply With Quote
  #2  
Old   
Sybrand Bakker
 
Posts: n/a

Default Re: CBO cost and avg_row_len - 07-09-2003 , 12:40 PM






On 9 Jul 2003 08:34:56 -0700, ydias (AT) hotmail (DOT) com (dias) wrote:

Quote:
Hi,

Can someone explain when the CBO takes into account the value of
avg_row_len to estimate an access plan cost ?

I suppose that the CBO uses this value to estimate the cost for a
table scan, in combination with the number of blocks and the value of
db_file_multiblock_read_count ?

Thanks
It doesn't. It would also be unnecessary, as a block is a block is a
block and a table scan is a table scan is a table scan. Oracle will
always read a complete block. The number of blocks in use is taken
into account.


Sybrand Bakker, Senior Oracle DBA

To reply remove -verwijderdit from my e-mail address


Reply With Quote
  #3  
Old   
dias
 
Posts: n/a

Default Re: CBO cost and avg_row_len - 07-10-2003 , 05:08 AM



Thanks Sybrand,

But why DBMS_STATS.GATHER_TABLE_STATS generates 3 statistics :
num_rows, blocks and avg_row_len ?

Dias

Sybrand Bakker <gooiditweg (AT) sybrandb (DOT) demon.nl> wrote

Quote:
On 9 Jul 2003 08:34:56 -0700, ydias (AT) hotmail (DOT) com (dias) wrote:

Hi,

Can someone explain when the CBO takes into account the value of
avg_row_len to estimate an access plan cost ?

I suppose that the CBO uses this value to estimate the cost for a
table scan, in combination with the number of blocks and the value of
db_file_multiblock_read_count ?

Thanks

It doesn't. It would also be unnecessary, as a block is a block is a
block and a table scan is a table scan is a table scan. Oracle will
always read a complete block. The number of blocks in use is taken
into account.


Sybrand Bakker, Senior Oracle DBA

To reply remove -verwijderdit from my e-mail address

Reply With Quote
  #4  
Old   
Mark D Powell
 
Posts: n/a

Default Re: CBO cost and avg_row_len - 07-10-2003 , 08:41 AM



ydias (AT) hotmail (DOT) com (dias) wrote in message news:<55a68b47.0307100208.30ffd38 (AT) posting (DOT) google.com>...
Quote:
Thanks Sybrand,

But why DBMS_STATS.GATHER_TABLE_STATS generates 3 statistics :
num_rows, blocks and avg_row_len ?

Dias

Sybrand Bakker <gooiditweg (AT) sybrandb (DOT) demon.nl> wrote

On 9 Jul 2003 08:34:56 -0700, ydias (AT) hotmail (DOT) com (dias) wrote:

Hi,

Can someone explain when the CBO takes into account the value of
avg_row_len to estimate an access plan cost ?

I suppose that the CBO uses this value to estimate the cost for a
table scan, in combination with the number of blocks and the value of
db_file_multiblock_read_count ?

Thanks

It doesn't. It would also be unnecessary, as a block is a block is a
block and a table scan is a table scan is a table scan. Oracle will
always read a complete block. The number of blocks in use is taken
into account.


Sybrand Bakker, Senior Oracle DBA

To reply remove -verwijderdit from my e-mail address
Dias, Sybrand answered your statement on scanning. but in regards to
your first question on when the avg_row_len statistic is used I would
like to add a couple of comments.

First, I am not sure when avg_row_len is used, but I believe that all
statistics gathered by dbms_stats, unlike some of those gathered by
the analyze commad (such as chained rows), are used by the CBO. The
use may be indirect in that some statistics may be used in calculating
other statistics that are directly used. For some reason I think the
use of avg_row_len is in estimating logical IO, but I would not bank
on that.

For the average DBA it is probably much more important to know which
statistics are important to the optimizer and to be able to recognize
when they need recalculation than to know exactly how the CBO uses
them.

HTH -- Mark D Powell --


Reply With Quote
  #5  
Old   
Niall Litchfield
 
Posts: n/a

Default Re: CBO cost and avg_row_len - 07-10-2003 , 10:30 AM



"Mark D Powell" <Mark.Powell (AT) eds (DOT) com> wrote

Quote:
ydias (AT) hotmail (DOT) com (dias) wrote in message
news:<55a68b47.0307100208.30ffd38 (AT) posting (DOT) google.com>...
Dias, Sybrand answered your statement on scanning. but in regards to
your first question on when the avg_row_len statistic is used I would
like to add a couple of comments.

First, I am not sure when avg_row_len is used, but I believe that all
statistics gathered by dbms_stats, unlike some of those gathered by
the analyze commad (such as chained rows), are used by the CBO. The
use may be indirect in that some statistics may be used in calculating
other statistics that are directly used. For some reason I think the
use of avg_row_len is in estimating logical IO, but I would not bank
on that.
I agree, if one does a 10053 trace then AVG_ROW_LEN is one of the pieces of
'BASE STATISTICAL INFORMATION' listed strongly suggesting that it is used in
someway. If the table isn't analyzed the default value of AVG_ROW_LEN is
used to estimate the number of rows in the table.

Quote:
For the average DBA it is probably much more important to know which
statistics are important to the optimizer and to be able to recognize
when they need recalculation than to know exactly how the CBO uses
them.
Absolutely, especially as this information doesn't appear to be published
anywhere.


--
Niall Litchfield
Oracle DBA
Audit Commission UK




Reply With Quote
  #6  
Old   
Jonathan Lewis
 
Posts: n/a

Default Re: CBO cost and avg_row_len - 07-10-2003 , 03:16 PM





I haven't checked yet how Oracle estimates the
row size of rows that go into the hash (build) table
of a hash join. Perhaps the avg_row_length is used
as a guideline at this point.

--
Regards

Jonathan Lewis
http://www.jlcomp.demon.co.uk

The educated person is not the person
who can answer the questions, but the
person who can question the answers -- T. Schick Jr


One-day tutorials:
http://www.jlcomp.demon.co.uk/tutorial.html

____Finland__September 22nd - 24th
____Norway___September 25th - 26th


Three-day seminar:
see http://www.jlcomp.demon.co.uk/seminar.html
____USA_(CA, TX)_August
____USA__October
____UK___November


The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html


"Mark D Powell" <Mark.Powell (AT) eds (DOT) com> wrote

Quote:
First, I am not sure when avg_row_len is used, but I believe that
all
statistics gathered by dbms_stats, unlike some of those gathered by
the analyze commad (such as chained rows), are used by the CBO. The
use may be indirect in that some statistics may be used in
calculating
other statistics that are directly used. For some reason I think
the
use of avg_row_len is in estimating logical IO, but I would not bank
on that.





Reply With Quote
  #7  
Old   
Niall Litchfield
 
Posts: n/a

Default Re: CBO cost and avg_row_len - 07-10-2003 , 04:36 PM



"Quarkman" <quarkman (AT) myrealbox (DOT) com> wrote

Quote:
Just a wild stab in the dark here... but if BLOCKS=400000 and
avg_row_len=50 and num_rows=10, you can be pretty certain that most of
those blocks under the high water mark are full of nothing but fresh air.
Packed optimally, the table should only consume 500 bytes, which should
comfortably fit into one block. The statistics are therefore screaming at
you, "Inflated High Water Mark Syndrome!!" (ie, huge inserts, followed by
massive deletes).

If you asked me then to do a select * from table, I would ordinarily
expect
to do a full table scan, and never mind visiting the index. But with
statistics like those, I suspect I'd rather access the table via the
index.
Indexes give you rowid access to the table data, and you don't have to
wander through 399999 empty blocks to retrieve all 10 rows.

So, yes: knowing your table is fluffier than a very fluffy soufflé is
actually important to the optimizer, and avg_row_len is needed to tell it
that.
Except it don't necessarily work that way


SQL*Plus: Release 9.0.1.0.1 - Production on Thu Jul 10 22:13:18 2003

(c) Copyright 2001 Oracle Corporation. All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.0.1.1.1 - Production
With the Partitioning option
JServer Release 9.0.1.1.1 - Production

SQL> create table tester(id number,padder char(42));

Table created.

SQL> begin
2 for i in 1..10 loop
3 for j in 1..150000 loop
4 insert into tester values(j,to_char(j));
5 end loop;
6 commit;
7 delete tester where id < 2000000;
8 insert into tester values (10000000*i,to_char(i));
9 end loop;
10 commit;
11* end;
SQL> /

PL/SQL procedure successfully completed.

SQL> create index idx_tester on tester(id);

Index created.

SQL> analyze table tester compute statistics;

Table analyzed.

SQL> select blocks,num_rows,avg_row_len from dba_tables where
table_name='TESTER
';

BLOCKS NUM_ROWS AVG_ROW_LEN
---------- ---------- -----------
1182 10 49

SQL> alter table tester modify id not null;

Table altered.

SQL> analyze table tester compute statistics;

Table analyzed.

SQL> select blocks,num_rows,avg_row_len from dba_tables where
table_name='TESTER
';

BLOCKS NUM_ROWS AVG_ROW_LEN
---------- ---------- -----------
1182 10 49

SQL> set autot on explain stat
SQL> ;
1* select /*+ index(tester idx_tester) */ * from tester
SQL> /

ID PADDER
---------- ------------------------------------------
10000000 1
20000000 2
30000000 3
40000000 4
50000000 5
60000000 6
70000000 7
80000000 8
90000000 9
100000000 10

10 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=10 Bytes=440)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TESTER' (Cost=2 Card=10
Bytes=440)

2 1 INDEX (FULL SCAN) OF 'IDX_TESTER' (NON-UNIQUE) (Cost=1 C
ard=10)





Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
1252 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed

SQL> select * from tester;

ID PADDER
---------- ------------------------------------------
20000000 2
40000000 4
60000000 6
80000000 8
100000000 10
90000000 9
50000000 5
30000000 3
70000000 7
10000000 1

10 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=180 Card=10 Bytes=44
0)

1 0 TABLE ACCESS (FULL) OF 'TESTER' (Cost=180 Card=10 Bytes=44
0)





Statistics
----------------------------------------------------------
0 recursive calls
10 db block gets
1183 consistent gets
0 physical reads
0 redo size
1252 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed


So without the hint but with the stats avg_row_len and num_rows correctly
populated Oracle still chooses the wrong plan.

Jonathan's suggestion does make sense though. In general row size might be
relevant for costs of various joins. Not just hash but sort-merge etc might
be affected as well.


--
Niall Litchfield
Oracle DBA
Audit Commission UK
*****************************************
Please include version and platform
and SQL where applicable
It makes life easier and increases the
likelihood of a good answer
******************************************




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.