dbTalk Databases Forums  

Total estimated I/O cost for statement

comp.databases.sybase comp.databases.sybase


Discuss Total estimated I/O cost for statement in the comp.databases.sybase forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
1dg618
 
Posts: n/a

Default Total estimated I/O cost for statement - 10-25-2011 , 05:52 PM






Below are two total estimated I/O cost for statement 1 for my two kind-
of similar queries. What does the number mean? The doc is confusing.

I assume the lower the number the better?


Total estimated I/O cost for statement 1 (at line 1): 20019.

vs.

Total estimated I/O cost for statement 1 (at line 1): 63543.

Reply With Quote
  #2  
Old   
albion26.eve@gmail.com
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 10-25-2011 , 06:15 PM






IO cost is a unit-less number. Essentially I think it's a weighting applied to the (estimated) logical and physical IO done by the query - 2 units for a logical IO and 18 units for a physical IO.

If I recall correctly, the first access of any page is assumed to be physical, and subsequent accesses assumed to be logical (i.e. perfect caching but with a cold cache initially). It can and will deviate from your real-world results.

Hope that helps,
Antony.

Reply With Quote
  #3  
Old   
Antony
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 10-25-2011 , 06:21 PM



To answer your other question, a qualified "yes" lower is better - in the sense that the ASE optimizer will seek to minimise this value in choosing the final plan. It's possible in practice, though, that the plan producing the fastest response to your query might not be the same as the one giving the lowest *estiamted* io cost.

Reply With Quote
  #4  
Old   
1dg618
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 10-25-2011 , 07:13 PM



On Oct 25, 6:21*pm, Antony <albion26.... (AT) gmail (DOT) com> wrote:
Quote:
To answer your other question, a qualified "yes" lower is better - in thesense that the ASE optimizer will seek to minimise this value in choosing the final plan. It's possible in practice, though, that the plan producing the fastest response to your query might not be the same as the one giving the lowest *estiamted* io cost.
You're correct. (1) has the lower *estimate* and the slower time.
Where as (2) has the higher *estimate* and the faster time. It's
always interesting because "things" make sense... and here comes the
wrench.


(1) Total estimated I/O cost for statement 1 (at line 1): 20019.

vs.

(2) Total estimated I/O cost for statement 1 (at line 1): 63543.

Reply With Quote
  #5  
Old   
1dg618
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 10-25-2011 , 07:20 PM



On Oct 25, 7:13*pm, 1dg618 <1dg... (AT) gmail (DOT) com> wrote:
Quote:
On Oct 25, 6:21*pm, Antony <albion26.... (AT) gmail (DOT) com> wrote:

To answer your other question, a qualified "yes" lower is better - in the sense that the ASE optimizer will seek to minimise this value in choosing the final plan. It's possible in practice, though, that the plan producing the fastest response to your query might not be the same as the one giving the lowest *estiamted* io cost.

You're correct. (1) has the lower *estimate* and the slower time.
Where as (2) has the higher *estimate* and the faster time. It's
always interesting because "things" make sense... and here comes the
wrench.

(1) Total estimated I/O cost for statement 1 (at line 1): 20019.

vs.

(2) Total estimated I/O cost for statement 1 (at line 1): 63543.

In the query plan for (2), "Index contains all needed columns. Base
table will not be read." The estimate was higher and the time was
faster.

Positioning by key.
Index contains all needed columns. Base table will not be read.
Keys are:

In the query plan for (1), it didn't have the "Index contains all
needed columns. Base table will not be read." The estimate was lower
and the time was slower.

Positioning by key.
Keys are:

I guess, I'll have to put together the tables, queries and plan and
post it.

Reply With Quote
  #6  
Old   
Derek Asirvadem
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 10-29-2011 , 01:21 AM



On Oct 26, 11:20*am, 1dg618 <1dg... (AT) gmail (DOT) com> wrote:

Quote:
(1) Total estimated I/O cost for statement 1 (at line 1): 20019.

vs.

(2) Total estimated I/O cost for statement 1 (at line 1): 63543.

In the query plan for (2), "Index contains all needed columns. Base
table will not be read." The estimate was higher and the time was
faster.

Positioning by key.
Index contains all needed columns. Base table will not be read.
Keys are:

In the query plan for (1), it didn't have the "Index contains all
needed columns. Base table will not be read." The estimate was lower
and the time was slower.

Positioning by key.
Keys are:
Well, that goes to show, you cannot ever take one single stat or
metric, or clause of a sentence, hold it up separate from the context,
forgetting the rest of the sats, metrics, clauses, and examine it in
isolation. That is fragmented, and cannot be relied upon. No, you
need to take all the stats, metrics, etc, and evaluate them as a
whole.

When I see that (2) is a covered query, I would take that as being the
overriding indicator of performance; the *estimated* IO cost becomes
very secondary.

Quote:
I guess, I'll have to put together the tables, queries and plan and
post it.
Yes, please. And:

SET STATISTICS IO ON
SET STATISTICS TIME ON

If you have installed it, DBISQL provides a very attractive visual
rather than text plan.

Regards
Derek

Reply With Quote
  #7  
Old   
1dg618
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 10-31-2011 , 10:36 AM



Hopefully, this helps out?

CREATE TABLE first
(
one_id INT NOT NULL,
two_id INT NOT NULL,
date DATETIME NOT NULL,
junk_id INT NOT NULL,
CONSTRAINT first_pkc_1
PRIMARY KEY CLUSTERED ( one_id, two_id, date, junk_id ) on
'default'
)
LOCK allpages ON 'default' partition BY roundrobin 1
GO

CREATE TABLE second
(
junk_id INT NOT NULL,
junk_desc varchar(255) NULL,
CONSTRAINT second_pkc_1
PRIMARY KEY CLUSTERED ( junk_id ) on 'default'
)
LOCK allpages ON 'default' partition BY roundrobin 1
GO

select
a.one_id,
a.two_id,
a.date,
b.junk_id
from
first a
left join second b on a.junk_id = b.junk_id
where
a.date >= '20091020'
and
a.date <= '20091120'
and
a.one_id = 120326
and
a.two_id = 1

QUERY PLAN FOR STATEMENT 1 (at line 1).
STEP 1
The type of query is SELECT.
5 operator(s) under root
Quote:
ROOT:EMIT Operator (VA = 5)

|RESTRICT Operator (VA = 4)(0)(0)(0)(0)(11)
|
| |MERGE JOIN Operator (Join Type: Left Outer Join) (VA =
3)
| | Using Worktable2 for internal storage.
| | Key Count: 1
| | Key Ordering: ASC
| |
| | |SORT Operator (VA = 1)
| | | Average Row width is 22.000000
| | | Using Worktable1 for internal storage.
| | |
| | | |SCAN Operator (VA = 0)
| | | | FROM TABLE
| | | | first
| | | | a
| | | | Using Clustered Index.
| | | | Index : first_pkc_1
| | | | Forward Scan.
| | | | Positioning by key.
| | | | Keys are:
| | | | one_id ASC
| | | | two_id ASC
| | | | date ASC
| | | | Using I/O Size 16 Kbytes for data pages.
| | | | With MRU Buffer Replacement Strategy for
data pages.
| |
| | |SCAN Operator (VA = 2)
| | | FROM TABLE
| | | second
| | | b
| | | Using Clustered Index.
| | | Index : XPKsecone
| | | Forward Scan.
| | | Positioning at index start.
| | | Index contains all needed columns. Base table
will not be read.
| | | Using I/O Size 16 Kbytes for index leaf pages.
| | | With LRU Buffer Replacement Strategy for index
leaf pages.

Total estimated I/O cost for statement 1 (at line 1): 77052.


The first table is filled with about 300 million records and the
second table is filled with about 2 million records.

According to the show plan, we it is using work tables and what do
work tables mean?

"Using Worktable2 for internal storage." and "Using Worktable1 for
internal storage."

Is there anything I can do to get rid of the work tables?

Reply With Quote
  #8  
Old   
Derek Asirvadem
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 11-01-2011 , 06:54 PM



On Nov 1, 3:36*am, 1dg618 <1dg... (AT) gmail (DOT) com> wrote:

Quote:
Hopefully, this helps out?
Not really, it is a completely different question. But never mind.

I do not think you have provided the *whole* DDL for the tables as
requested. Xor the tables are inadequately defined.

Quote:
select
* a.one_id,
* a.two_id,
* a.date,
* b.junk_id
from
* first a
* * left join second b on a.junk_id = b.junk_id
1. So where is the FOREIGN KEY CONSTRAINT to support such a join ? If
there was one, the Optimiser would make much better decisions re the
type of join, and possibly avoid a worktable. The execution is also
faster, but detailing that here would take too long.

2. Since junk_id exists in first, and it does not appear in the WHERE
clause (only in the column list), I see no reason for the join to
second.


Quote:
* * * *| * | * | * |SCAN Operator (VA = 2)
* * * *| * | * | * | *FROM TABLE
* * * *| * | * | * | *second
* * * *| * | * | * | *b
* * * *| * | * | * | *Using Clustered Index.
* * * *| * | * | * | *Index : XPKsecone
3. What is the DDL for that index ? If the Optimiser found it and
used it, it must be there.


Quote:
* * * *| * | * | * | *Index contains all needed columns. Base table
will not be read.
4. That is great. Better still, the entire table can be eliminated.

Quote:
Total estimated I/O cost for statement 1 (at line 1): 77052.

The first table is filled with about 300 million records and the
second table is filled with about 2 million records.

According to the show plan, we it is using work tables and what do
work tables mean?

"Using Worktable2 for internal storage." and "Using Worktable1 for
internal storage."

Is there anything I can do to get rid of the work tables?
Yes. The goal is to minimise, if not eliminate, worktables. The
worktables are internal (invisible to you, except they are identified
in the QP) tables created by the Optimiser. If your query is poor, or
the indices are poor, or the data that it has to sort is large, or the
join type is a bad choice, it will create a worktable. Kind of
unavoidable, in the sense that you cannot directly control the
creation/use of worktables. You need to fix the query, or the table
definitions, and consequently the Optimiser will not require them, or
require less or them, or create a much smaller worktable. Worktable
itself is not so bad, ASE 15.0.x and above use much more of them, but
they are much faster than in previous releases. So now the goal is to
ensure they are small, and minimal.

That's the general answer. The answer for this particular DDL & query
is, the tables are so large that it has decided (probably quite
correctly) that it would be better to read the two CIs into
worktables, then do all the work in memory, between the two
worktables.

Getting rid of the superfluous table reference and join to second will
of course completely change the QP.


5. Ensure that you are executing UPDATE INDEX STATISTICS <table> on
all tables, at least weekly.

Regards
Derek

Reply With Quote
  #9  
Old   
Antony
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 11-01-2011 , 09:03 PM



On Tuesday, 1 November 2011 05:36:31 UTC+13, 1dg618 wrote:

Quote:
| | |MERGE JOIN Operator (Join Type: Left Outer Join) (VA = 3)
If merge joins are causing you troubles, I'd ignore them:

1> set merge_join off
2> go

I'm loathe to recommend you disable them by default - there must be some use for them - but I have yet to see them cause anything but trouble

At lease - try the query plan again without merge joins enabled ...

Reply With Quote
  #10  
Old   
1dg618
 
Posts: n/a

Default Re: Total estimated I/O cost for statement - 11-02-2011 , 10:39 AM



Suggestions on a good Sybase book that talks about creating tables,
indexes, constraints, etc., to take advantage of the "mechanics" of
Sybase? I realize, I don't know the mechanics of Sybase. Each vendor's
implementation is different. If you understand it, you can take
advantage of it. Am I wrong?

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.