dbTalk Databases Forums  

Pushing restrictions down into subqueries

comp.databases comp.databases


Discuss Pushing restrictions down into subqueries in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Tom Anderson
 
Posts: n/a

Default Pushing restrictions down into subqueries - 05-19-2010 , 07:25 AM






Hi all,

Given tables that look a bit like:

create table product (
product_id integer primary key
-- other fields
)
create table price (
country_code char(2) not null,
product_id integer not null references product,
constraint price_pk primary key (country_code, product_id)
-- other fields
)

Would you expect this query:

select *
from product p
where p.product_id in
(select product_id from price)

To be as fast as this one:

select *
from product p
where p.product_id in
(select product_id from price where product_id = p.product_id)

(in general, or with the DBMS of your choice)?

I am right in thinking the two are functionally equivalent, right? The
idea is to find products which have a defined price in at least one
country.

I would have thought that any query planner worth its salt would execute
those queries the same way. My colleague insists that the subquery in the
first version will force the engine to make a table scan over price. A
quick experiment (with different but roughly corresponding queries) on SQL
Server finds that the second version, with the where clause, is in fact
faster.

Out of interest, how would you expect these two to compare:

select *
from product p
where exists
(select * from price where product_id = p.product_id)

select *
from product p
where exists
(select 1 from price where product_id = p.product_id)

select distinct p.*
from product p, price q
where p.product_id = q.product_id

?

Thanks,
tom

--
Science which is distinguishable from magic is insufficiently advanced

Reply With Quote
  #2  
Old   
Thomas Kellerer
 
Posts: n/a

Default Re: Pushing restrictions down into subqueries - 05-19-2010 , 08:12 AM






Tom Anderson, 19.05.2010 14:25:
Quote:
Would you expect this query:

select *
from product p
where p.product_id in
(select product_id from price)

To be as fast as this one:

select *
from product p
where p.product_id in
(select product_id from price where product_id = p.product_id)

(in general, or with the DBMS of your choice)?

Yes I would expect the optimizer to detect that situation.

As a matter of fact this is a very good example why FK constraints are useful beyond mere data integrity/constency.

Oracle 11g produces identical plans for both version (so it detects the situation)

PostgreSQL 8.4 unfortunately does not detect that situation properly and produces a slower plan for the first statement (although not that much slower...)

Quote:
I am right in thinking the two are functionally equivalent, right?
I would agree, yes.

Quote:
Out of interest, how would you expect these two to compare:

select *
from product p
where exists
(select * from price where product_id = p.product_id)

select *
from product p
where exists
(select 1 from price where product_id = p.product_id)
They are absolutely identical.
I know it is a common myth that the second one would perform better.

But how should it? To find out if a row exists, the corresponding block needs to be read from the harddisk regardless if I use all columns or not. And once the block is read, it does not make any difference if the database uses any column value from it or not.

Both produce exactly the same execution plan and have the same runtime (give or take some ~100 milliseconds) on Oracle 11g and PostgreSQL 8.4


Regards
Thomas

Reply With Quote
  #3  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: Pushing restrictions down into subqueries - 05-19-2010 , 09:55 AM



On 2010-05-19 15:12, Thomas Kellerer wrote:
Quote:
Tom Anderson, 19.05.2010 14:25:
Would you expect this query:

select *
from product p
where p.product_id in
(select product_id from price)

To be as fast as this one:

select *
from product p
where p.product_id in
(select product_id from price where product_id = p.product_id)

(in general, or with the DBMS of your choice)?


Yes I would expect the optimizer to detect that situation.

As a matter of fact this is a very good example why FK constraints are useful beyond mere data integrity/constency.

Oracle 11g produces identical plans for both version (so it detects the situation)
db2 v9.5 fixpak 0 does as well. It rewrites both queries to:

SELECT DISTINCT Q1.PRODUCT_ID AS PRODUCT_ID
FROM PRICE AS Q1

I will fill the tables with some random data and see if it makes a
difference.


/Lennart

Reply With Quote
  #4  
Old   
Thomas Kellerer
 
Posts: n/a

Default Re: Pushing restrictions down into subqueries - 05-19-2010 , 11:01 AM



Lennart Jonsson wrote on 19.05.2010 16:55:
Quote:
Yes I would expect the optimizer to detect that situation.

As a matter of fact this is a very good example why FK constraints are useful beyond mere data integrity/constency.

Oracle 11g produces identical plans for both version (so it detects the situation)

db2 v9.5 fixpak 0 does as well. It rewrites both queries to:

SELECT DISTINCT Q1.PRODUCT_ID AS PRODUCT_ID
FROM PRICE AS Q1

But that is a completely different result!

In your examples you are retrieving the complete row from the product table, but this statement only returns product_ids

Thomas

Reply With Quote
  #5  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: Pushing restrictions down into subqueries - 05-19-2010 , 11:05 AM



On 2010-05-19 16:55, Lennart Jonsson wrote:
[...]
Quote:
I will fill the tables with some random data and see if it makes a
difference.

I populated the tables with:

insert into product (product_id)
with T (n) as (
values 0
union all
select n+1 from T where n<10000
) select n from T;

insert into price (product_id, country_code)
select product_id, 'SE' from product where mod(product_id,2) = 0
union all
select product_id, 'NO' from product where mod(product_id,3) = 0
union all
select product_id, 'DK' from product where mod(product_id,5) = 0
union all
select product_id, 'FI' from product where mod(product_id,7) = 0;

runstats on table lelle.price and indexes all;
runstats on table lelle.product and indexes all;

but the plan still looks the same:

Total Cost: 131.136
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
Quote:
7424
TBSCAN
( 2)
131.136
40.3529
Quote:
7424
SORT
( 3)
131.135
40.3529
Quote:
11765
IXSCAN
( 4)
123.686
40.3529
Quote:
11765
INDEX: LELLE
PRICE_PK

I also tries with mysql 5.1 and even though I don't understand how one
can make anything out of the output of explain, the results for the two
different queries look the same.

+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
Quote:
id | select_type | table | type | possible_keys | key |
key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
Quote:
1 | PRIMARY | p | index | NULL | PRIMARY | 4
NULL | 1 | Using where; Using index |
2 | DEPENDENT SUBQUERY | price | index | NULL | PRIMARY | 6
NULL | 1 | Using where; Using index |
+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
2 rows in set (0.01 sec)



Quote:
/Lennart

Reply With Quote
  #6  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: Pushing restrictions down into subqueries - 05-19-2010 , 11:12 AM



On 2010-05-19 18:01, Thomas Kellerer wrote:
Quote:
Lennart Jonsson wrote on 19.05.2010 16:55:
Yes I would expect the optimizer to detect that situation.

As a matter of fact this is a very good example why FK constraints are useful beyond mere data integrity/constency.

Oracle 11g produces identical plans for both version (so it detects the situation)

db2 v9.5 fixpak 0 does as well. It rewrites both queries to:

SELECT DISTINCT Q1.PRODUCT_ID AS PRODUCT_ID
FROM PRICE AS Q1

But that is a completely different result!

In your examples you are retrieving the complete row from the product table, but this statement only returns product_ids

The definition of the product table might have something to do with it ;-)

create table product (
product_id integer primary key
-- other fields
);

/Lennart

Reply With Quote
  #7  
Old   
Jasen Betts
 
Posts: n/a

Default Re: Pushing restrictions down into subqueries - 05-20-2010 , 03:01 AM



On 2010-05-19, Tom Anderson <twic (AT) urchin (DOT) earth.li> wrote:
Quote:
Hi all,

Given tables that look a bit like:

create table product (
product_id integer primary key
-- other fields
)
create table price (
country_code char(2) not null,
product_id integer not null references product,
constraint price_pk primary key (country_code, product_id)
-- other fields
)

Would you expect this query:

select *
from product p
where p.product_id in
(select product_id from price)

To be as fast as this one:

select *
from product p
where p.product_id in
(select product_id from price where product_id = p.product_id)
no I'd expect a difference.

The subquery in the upper need only be evaluated once, while in the lower it's
a new subquery for every row of product.
performance depends on wether the table scan is faster than the several
aborted (hopefully) index scans that the lower asks for.

Quote:
I would have thought that any query planner worth its salt would execute
those queries the same way. My colleague insists that the subquery in the
first version will force the engine to make a table scan over price. A
quick experiment (with different but roughly corresponding queries) on SQL
Server finds that the second version, with the where clause, is in fact
faster.
how does that compare with a simple join?

select distinct product.* from product join price;

Quote:
Out of interest, how would you expect these two to compare:
depends on row count and how many useful indices you have.


--- news://freenews.netfront.net/ - complaints: news (AT) netfront (DOT) net ---

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.