dbTalk Databases Forums  

Simple Join Taking a Long Time

comp.databases comp.databases


Discuss Simple Join Taking a Long Time in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
battle.chris@gmail.com
 
Posts: n/a

Default Simple Join Taking a Long Time - 02-17-2007 , 04:37 PM






I have a very simple join which is taking nearly a minute to execute.
It uses the following two tables:

CREATE TABLE `candidates`
(
`id` integer (11) NOT NULL AUTO_INCREMENT ,
`first_name` varchar (127),
`surname` varchar (127),
PRIMARY KEY (`id`)
)

CREATE TABLE `candidates_skills`
(
`id` integer (11) NOT NULL AUTO_INCREMENT ,
`candidate_id` integer (11),
`skill_id` integer (11),
PRIMARY KEY (`id`)
)

And I'm trying to perform the following:

SELECT c.id, cs.id FROM candidates AS c
LEFT OUTER JOIN candidates_skills AS cs ON c.id = cs.candidate_id AND
( cs.skill_id = 5 OR cs.skill_id = 15)

The result I'm getting is exactly what I need but I'm obviously doing
something wrong cause it's taking too long to execute.

Thanks.


Reply With Quote
  #2  
Old   
Tonkuma
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-17-2007 , 08:11 PM






The way to tuning performance is something different DBMS BY DBMS.
But, generary it is effective to make appropriate indexes.
For example:
CREATE INDEX cs_cid_sid_id ON candidates_skills
(candidate_id, skill_id, id)

SELECT c.id, cs.id
FROM candidates AS c
LEFT OUTER JOIN
candidates_skills AS cs
ON c.id = cs.candidate_id
AND cs.skill_id IN (5, 15)


Reply With Quote
  #3  
Old   
battle.chris@gmail.com
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-17-2007 , 09:45 PM



On Feb 18, 1:11 pm, "Tonkuma" <tonk... (AT) jp (DOT) ibm.com> wrote:
Quote:
The way to tuning performance is something different DBMS BY DBMS.
But, generary it is effective to make appropriate indexes.
For example:
CREATE INDEX cs_cid_sid_id ON candidates_skills
(candidate_id, skill_id, id)

SELECT c.id, cs.id
FROM candidates AS c
LEFT OUTER JOIN
candidates_skills AS cs
ON c.id = cs.candidate_id
AND cs.skill_id IN (5, 15)
Thanks Tonkuma; this totally resolved the lag issue.

Does this mean the index has to be updated each time the table is
amended?

Thanks again.



Reply With Quote
  #4  
Old   
Ed Prochak
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-18-2007 , 01:27 PM



On Feb 17, 5:37 pm, battle.ch... (AT) gmail (DOT) com wrote:
Quote:
I have a very simple join which is taking nearly a minute to execute.
It uses the following two tables:

CREATE TABLE `candidates`
(
`id` integer (11) NOT NULL AUTO_INCREMENT ,
`first_name` varchar (127),
`surname` varchar (127),
PRIMARY KEY (`id`)
)

CREATE TABLE `candidates_skills`
(
`id` integer (11) NOT NULL AUTO_INCREMENT ,
`candidate_id` integer (11),
`skill_id` integer (11),
PRIMARY KEY (`id`)
)

And I'm trying to perform the following:

SELECT c.id, cs.id FROM candidates AS c
LEFT OUTER JOIN candidates_skills AS cs ON c.id = cs.candidate_id AND
( cs.skill_id = 5 OR cs.skill_id = 15)

The result I'm getting is exactly what I need but I'm obviously doing
something wrong cause it's taking too long to execute.

Thanks.
My crystal ball is broken, What DBMS are you using?

Why are you doing an OUTER JOIN?
On this syntax I get my LEFT and RIGHT mixed up but isn't doing
candidates LEFT OUTER JOIN candidates_skills
getting all candidates, even those without skills?
But then you force the result set to have at least one of two skills.
IOW you included all candidates only to throw some away. The optimizer
is likely to not do well with such logic.

HTH, but PLEASE include which DBMS next time.
ED



Reply With Quote
  #5  
Old   
greg.fenton
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-19-2007 , 08:37 AM



On Feb 18, 2:27 pm, "Ed Prochak" <edproc... (AT) gmail (DOT) com> wrote:
Quote:
IOW you included all candidates only to throw some away. The optimizer
is likely to not do well with such logic.

Yep, using a LEFT OUTER JOIN instead of a simple inner join is a waste
in this case, though a good RDBMS optimizer should recognize that the
criteria exclude NULL rows from the right table and automatically
rewrite the join...

Again, which RDBMS and version?

greg.fenton



Reply With Quote
  #6  
Old   
Walt
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-19-2007 , 01:19 PM




"greg.fenton" <greg.fenton (AT) gmail (DOT) com> wrote

Quote:
On Feb 18, 2:27 pm, "Ed Prochak" <edproc... (AT) gmail (DOT) com> wrote:
IOW you included all candidates only to throw some away. The optimizer
is likely to not do well with such logic.


Yep, using a LEFT OUTER JOIN instead of a simple inner join is a waste
in this case, though a good RDBMS optimizer should recognize that the
criteria exclude NULL rows from the right table and automatically
rewrite the join...
Will you name one RDBMS optimizer that will actually perform the action you
describe above?

I know a few optimizers, and would be very surprised if any of them got this
one.

Quote:
Again, which RDBMS and version?

greg.fenton




Reply With Quote
  #7  
Old   
Lennart
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-19-2007 , 03:04 PM



On Feb 18, 9:27 pm, "Ed Prochak" <edproc... (AT) gmail (DOT) com> wrote:
[...]
Quote:
Why are you doing an OUTER JOIN?
On this syntax I get my LEFT and RIGHT mixed up but isn't doing
candidates LEFT OUTER JOIN candidates_skills
getting all candidates, even those without skills?
But then you force the result set to have at least one of two skills.
IOW you included all candidates only to throw some away. The optimizer
is likely to not do well with such logic.

I might be wrong, but since:

AND cs.skill_id IN (5, 15)

is part of the join condition, isn't all candidates returned, with or
without skills? I.e. the way the query is expressed I don't think it
will produce the same result as an inner join.

/Lennart



Reply With Quote
  #8  
Old   
greg.fenton
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-20-2007 , 09:59 AM



On Feb 19, 4:04 pm, "Lennart" <Erik.Lennart.Jons... (AT) gmail (DOT) com> wrote:
Quote:
I might be wrong, but since:

AND cs.skill_id IN (5, 15)

is part of the join condition, isn't all candidates returned, with or
without skills? I.e. the way the query is expressed I don't think it
will produce the same result as an inner join.

No, that would be the case if you wrote:

AND ( cs.skill_id IN (5, 15) OR cs.skill_id IS NULL )

greg.fenton



Reply With Quote
  #9  
Old   
Lennart
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-20-2007 , 12:21 PM



On Feb 20, 5:59 pm, "greg.fenton" <greg.fen... (AT) gmail (DOT) com> wrote:
Quote:
On Feb 19, 4:04 pm, "Lennart" <Erik.Lennart.Jons... (AT) gmail (DOT) com> wrote:

I might be wrong, but since:

AND cs.skill_id IN (5, 15)

is part of the join condition, isn't all candidates returned, with or
without skills? I.e. the way the query is expressed I don't think it
will produce the same result as an inner join.

No, that would be the case if you wrote:

AND ( cs.skill_id IN (5, 15) OR cs.skill_id IS NULL )

greg.fenton
can you provide an example where:

OR cs.skill_id IS NULL )

makes a difference?

/Lennart





Reply With Quote
  #10  
Old   
Ed Prochak
 
Posts: n/a

Default Re: Simple Join Taking a Long Time - 02-21-2007 , 12:59 PM



On Feb 20, 1:21 pm, "Lennart" <Erik.Lennart.Jons... (AT) gmail (DOT) com> wrote:
Quote:
On Feb 20, 5:59 pm, "greg.fenton" <greg.fen... (AT) gmail (DOT) com> wrote:

On Feb 19, 4:04 pm, "Lennart" <Erik.Lennart.Jons... (AT) gmail (DOT) com> wrote:

I might be wrong, but since:

AND cs.skill_id IN (5, 15)

is part of the join condition, isn't all candidates returned, with or
without skills? I.e. the way the query is expressed I don't think it
will produce the same result as an inner join.

No, that would be the case if you wrote:

AND ( cs.skill_id IN (5, 15) OR cs.skill_id IS NULL )

greg.fenton

can you provide an example where:

OR cs.skill_id IS NULL )

makes a difference?

/Lennart
assume the result set before applying the AND clause has two rws:
JOE 15
BILL NULL (ie Bill has no skills)

then using this:
AND ( cs.skill_id IN (5, 15) OR cs.skill_id IS NULL )
the final result set still has both rows.

BUT the original query had this:
AND ( cs.skill_id = 5 OR cs.skill_id = 15)
BILL fails the condition (NULL does not match either 5 or 15)

IOW the original query in this thread returns the same result as an
inner join.

If you don't believe me, try it.
Ed




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.