dbTalk Databases Forums  

[Info-Ingres] not in vs outer join syntax as a table expands

comp.databases.ingres comp.databases.ingres


Discuss [Info-Ingres] not in vs outer join syntax as a table expands in the comp.databases.ingres forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Martin Bowes
 
Posts: n/a

Default [Info-Ingres] not in vs outer join syntax as a table expands - 08-03-2009 , 10:01 AM






Hi All,



We have many programs which use the old style 'not in' queries:.

Eg. select count(*)

from progres

where attended_fup1 is not null

and participant_id not in (

select participant_id from fup where fup_num=1);



In some cases some of these queries have been re-written using the more
modern outer join syntax.

ie. select count(*)

from progres p left join fup f on

f.participant_id=p.participant_id and f.fup_num=1

where attended_fup1 is not null

and f.participant_id is null;



We've just noticed that as the table fup grew from 47513 rows to 56154
rows that the query plan changed on the old style query only and the
execution time blew out from sub-second to nearly a minute. The new
style queries were unaffected by the growth in the table.



ie. we went from the following qep which was generated for the small fup
case on the old style query as well as the new style queries on both
versions of the fup table....

QUERY PLAN 4,2, no timeout, of simple aggregate

aggregate expression -> count(*)





left join

Hash Join(participant_id)

Heap

Pages 33 Tups 21938

D412 C1406

/ \

Proj-rest Proj-rest

Sorted(NU) Sorted(NU)

Pages 105 Tups 34952 Pages 118 Tups 16861

D198 C353 D214 C475

/ /

progres fup

B-Tree(NU) B-Tree(NU)

Pages 790 Tups 35305 Pages 857 Tups 47513



To:

QUERY PLAN 1,3, no timeout, of subselect T2

Proj-rest

Sorted(participant_id)

Pages 61 Tups 17471

D295 C562

/

fup

B-Tree(NU)

Pages 1179 Tups 56154





QUERY PLAN 1,2, no timeout, of simple aggregate

aggregate expression -> count(*)





SE Join

Heap

Pages 6 Tups 3495

D787 C1826

/ \

Proj-rest T2

Sorted(participant_id) Heap

Pages 105 Tups 34952 Pages 61 Tups 17471

D198 C353

/

progres

B-Tree(NU)

Pages 790 Tups 35305



Anyone have any ideas why this would occur?



I've seen it on 9.1.1 and 9.2.0



Martin Bowes.

Reply With Quote
  #2  
Old   
Karl & Betty Schendel
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-03-2009 , 10:58 AM






On Aug 3, 2009, at 11:01 AM, Martin Bowes wrote:

Quote:
We've just noticed that as the table fup grew from 47513 rows to
56154 rows that the query plan changed on the old style query only
and the execution time blew out from sub-second to nearly a minute.
The new style queries were unaffected by the growth in the table.

The notin to outer join transform is done in query rewrite,
before any costing. So I am very skeptical that it had
anything to do with the growth in size, although I
suppose it's barely possible. Was a patch installed?

Is participant_id nullable? a NOT IN subquery with a nullable
has some null-semantics issues, and it may be that the
outer-join transform should not be applied because of
those semantics. A fix to extend the nullable check
for NOT IN was done in main in spring 2008, although I have no
idea how that translates to actual releases or patches.

Karl

Reply With Quote
  #3  
Old   
Ingres Forums
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-03-2009 , 11:20 PM



Marty,

This one is a good question for Doug (who is on vacation until
mid-august).

Ingres does actually make an attempt to flatten "not ion" sub-selects
to outer joins if it can, but it's not 100% effective. Interestingly
enough, the estimated total tuples for the two plans below are almost
identical, I suspect the optimizer thinks both plans cost approximately
the same amount and as the data size and distribution changed it decided
on one plan over the other. Have you run optimizedb on the tables
recently? Maybe a finer granularity on the histogram might produce a
better result.


--
stephenb

Reply With Quote
  #4  
Old   
Martin Bowes
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-04-2009 , 06:50 AM



Quote:
Is participant_id nullable?
Bingo Karl!

What I wasn't told by the programmer (and isn't there always
something)...as the table grew they also made a few columns nullable.

What I've found out since then is the nullable column prevents the
transformation of the old style 'not in' query into an 'outer join'
query. This is very repeatable.

I have found that this problem exists on:
II 9.0.4 (a64.lnx/105)NPTL 12707.

II 9.1.1 (a64.lnx/103)NPTL 13377.

II 9.2.0 (a64.lnx/143)NPTL
II 9.2.0 (a64.lnx/143)NPTL 13468.

I'll raise this as a bug and see if the solution you mention has made it
into the 9.2.0 stream.

Marty

Reply With Quote
  #5  
Old   
Martin Bowes
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-04-2009 , 06:52 AM



Hi Stevo,

Karl nailed it with the nullable column. I tried varying levels of stats
from none to maximum buckets and other than getting better estimates the
query plans didn't change.

Marty

-----Original Message-----
From: info-ingres-bounces (AT) kettleriver...ting (DOT) com
[mailto:info-ingres-bounces (AT) kettleriverconsulting (DOT) com] On Behalf Of
Ingres Forums
Sent: 04 August 2009 05:21
To: info-ingres (AT) kettleriverconsulting (DOT) com
Subject: Re: [Info-Ingres] not in vs outer join syntax as a table
expands


Marty,

This one is a good question for Doug (who is on vacation until
mid-august).

Ingres does actually make an attempt to flatten "not ion" sub-selects
to outer joins if it can, but it's not 100% effective. Interestingly
enough, the estimated total tuples for the two plans below are almost
identical, I suspect the optimizer thinks both plans cost approximately
the same amount and as the data size and distribution changed it decided
on one plan over the other. Have you run optimizedb on the tables
recently? Maybe a finer granularity on the histogram might produce a
better result.


--
stephenb
------------------------------------------------------------------------
stephenb's Profile:
http://community.ingres.com/forum/member.php?userid=175
View this thread:
http://community.ingres.com/forum/sh...ad.php?t=10995

_______________________________________________
Info-Ingres mailing list
Info-Ingres (AT) kettleriverconsulting (DOT) com
http://www.kettleriverconsulting.com...fo/info-ingres

Reply With Quote
  #6  
Old   
Ingres Forums
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-04-2009 , 07:08 AM



Marty,

I guess that might be bug 121409, marked NOTB (Not a Bug). Given what
happens when you compare any value which is null (it always gives no
match) a subselect comparing a nullable field will have problems giving
the correct answer if the compariing column is nullable.

As mentioned in the bug comments: IN evaluates to "fld = value" for
each value. A NULL makes that result UNKNOWN, which does not match any
rows. Taking NOT of UNKNOWN leaves it as UNKNOWN, which is also
UNKNOWN. Once a NULL value is in the subquery, neither IN nor NOT IN
will ever return any rows.

The fix is to revert to making that column not nullable.

John


--
denjo02

Reply With Quote
  #7  
Old   
Martin Bowes
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-04-2009 , 07:37 AM



Hi John,

Yes that makes sense. I've already asked the programmer why the change
was made to nullable and the answer was basically 'was it?'. I suspect
they made an error in a rebuild script.

Marty

-----Original Message-----
From: info-ingres-bounces (AT) kettleriver...ting (DOT) com
[mailto:info-ingres-bounces (AT) kettleriverconsulting (DOT) com] On Behalf Of
Ingres Forums
Sent: 04 August 2009 13:09
To: info-ingres (AT) kettleriverconsulting (DOT) com
Subject: Re: [Info-Ingres] not in vs outer join syntax as a table
expands


Marty,

I guess that might be bug 121409, marked NOTB (Not a Bug). Given what
happens when you compare any value which is null (it always gives no
match) a subselect comparing a nullable field will have problems giving
the correct answer if the compariing column is nullable.

As mentioned in the bug comments: IN evaluates to "fld = value" for
each value. A NULL makes that result UNKNOWN, which does not match any
rows. Taking NOT of UNKNOWN leaves it as UNKNOWN, which is also
UNKNOWN. Once a NULL value is in the subquery, neither IN nor NOT IN
will ever return any rows.

The fix is to revert to making that column not nullable.

John


--
denjo02
------------------------------------------------------------------------
denjo02's Profile:
http://community.ingres.com/forum/member.php?userid=702
View this thread:
http://community.ingres.com/forum/sh...ad.php?t=11000

_______________________________________________
Info-Ingres mailing list
Info-Ingres (AT) kettleriverconsulting (DOT) com
http://www.kettleriverconsulting.com...fo/info-ingres

Reply With Quote
  #8  
Old   
Stephen Ball
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-04-2009 , 05:54 PM



O.K. cool....glad to see Karl got it right (he's pretty good with that
optimizer stuff ).

If it's still causing a problem, you might want to create an issue and
have one of the SE guys look into it.

Steve

-----Original Message-----
From: Martin Bowes [mailto:martin.bowes (AT) ctsu (DOT) ox.ac.uk]
Sent: Tuesday, August 04, 2009 9:52 PM
To: Ingres and related product discussion forum
Cc: Stephen Ball
Subject: RE: [Info-Ingres] not in vs outer join syntax as a table
expands

Hi Stevo,

Karl nailed it with the nullable column. I tried varying levels of stats
from none to maximum buckets and other than getting better estimates the
query plans didn't change.

Marty

-----Original Message-----
From: info-ingres-bounces (AT) kettleriver...ting (DOT) com
[mailto:info-ingres-bounces (AT) kettleriverconsulting (DOT) com] On Behalf Of
Ingres Forums
Sent: 04 August 2009 05:21
To: info-ingres (AT) kettleriverconsulting (DOT) com
Subject: Re: [Info-Ingres] not in vs outer join syntax as a table
expands


Marty,

This one is a good question for Doug (who is on vacation until
mid-august).

Ingres does actually make an attempt to flatten "not ion" sub-selects
to outer joins if it can, but it's not 100% effective. Interestingly
enough, the estimated total tuples for the two plans below are almost
identical, I suspect the optimizer thinks both plans cost approximately
the same amount and as the data size and distribution changed it decided
on one plan over the other. Have you run optimizedb on the tables
recently? Maybe a finer granularity on the histogram might produce a
better result.


--
stephenb
------------------------------------------------------------------------
stephenb's Profile:
http://community.ingres.com/forum/member.php?userid=175
View this thread:
http://community.ingres.com/forum/sh...ad.php?t=10995

_______________________________________________
Info-Ingres mailing list
Info-Ingres (AT) kettleriverconsulting (DOT) com
http://www.kettleriverconsulting.com...fo/info-ingres

Reply With Quote
  #9  
Old   
James K. Lowden
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-04-2009 , 10:58 PM



Martin Bowes wrote:
Quote:
We have many programs which use the old style 'not in' queries
On what basis is "not in" out-moded and "outer join" modern?

Seems to me JOIN is a relational Join to permit a relational Project and
NOT IN is an existence test, a relational Difference. I don't see what's
"modern" about, ahem, misusing OUTER JOIN to do what NOT IN expresses more
clearly.

--jkl

Reply With Quote
  #10  
Old   
Martin Bowes
 
Posts: n/a

Default Re: [Info-Ingres] not in vs outer join syntax as a table expands - 08-05-2009 , 02:26 AM



I didn't say out-moded, I said old-style.

Its 'old' because it was the original way of doing such queries in
Ingres. The modern 'outer join' stuff was added more recently.

My understanding is that the outer-join style query will produce more
efficient query plans...but (personally speaking) often at the cost of
readability.

Marty

-----Original Message-----
From: info-ingres-bounces (AT) kettleriver...ting (DOT) com
[mailto:info-ingres-bounces (AT) kettleriverconsulting (DOT) com] On Behalf Of
James K. Lowden
Sent: 05 August 2009 04:59
To: Ingres and related product discussion forum
Subject: Re: [Info-Ingres] not in vs outer join syntax as a table
expands

Martin Bowes wrote:
Quote:
We have many programs which use the old style 'not in' queries
On what basis is "not in" out-moded and "outer join" modern?

Seems to me JOIN is a relational Join to permit a relational Project and
NOT IN is an existence test, a relational Difference. I don't see
what's
"modern" about, ahem, misusing OUTER JOIN to do what NOT IN expresses
more
clearly.

--jkl

_______________________________________________
Info-Ingres mailing list
Info-Ingres (AT) kettleriverconsulting (DOT) com
http://www.kettleriverconsulting.com...fo/info-ingres

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.