dbTalk Databases Forums  

Bad query

comp.databases.mysql comp.databases.mysql


Discuss Bad query in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
pac Man
 
Posts: n/a

Default Bad query - 02-16-2011 , 08:52 AM






Hello

I have table BLOG

id, user_id, creation_date, name, content (around 7000 records)

And then I have table SITE_USERS
id, name (around 20000 records)

I want to select the latest blog entries, but one user should not
appear more than once.
Doing:

select
B.*, S.name
from
blog B, site_users S
where
B.user_id = S.id
group by B.user_id
order by creation_date

is not so good. It seems that its not fetching the latest blog entry
from a user (why should it? I dont think I say it in the query, sigh).

So, I tought this query could work:

select
B.*, S.name
from
blog B, site_users S
where
B.user_id = S.id
and
B.id IN (select MAX(id) from blog group by user_id)
order by creation_date

It actually works but takes 20 (!!!!) seconds to execute.

The problem seems to be when joining the two tables. Causa if I remove
from th query the other table (SITE_USERS), then it executes without
problems.
Both tables have been indexed. And have the auto_increment and
primary_key of course.
I really cannot understand why it takes so much time when there is so
little data. Or, well, I guess the subquery must be very bad itself I
guess?

Reply With Quote
  #2  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: Bad query - 02-16-2011 , 03:25 PM






On 2011-02-16 15:52, pac Man wrote:
Quote:
Hello

I have table BLOG

id, user_id, creation_date, name, content (around 7000 records)

And then I have table SITE_USERS
id, name (around 20000 records)

I want to select the latest blog entries, but one user should not
appear more than once.
Doing:

select
B.*, S.name
from
blog B, site_users S
where
B.user_id = S.id
group by B.user_id
order by creation_date

is not so good. It seems that its not fetching the latest blog entry
from a user (why should it? I dont think I say it in the query, sigh).

So, I tought this query could work:

select
B.*, S.name
from
blog B, site_users S
where
B.user_id = S.id
and
B.id IN (select MAX(id) from blog group by user_id)
order by creation_date

It actually works but takes 20 (!!!!) seconds to execute.

The problem seems to be when joining the two tables. Causa if I remove
from th query the other table (SITE_USERS), then it executes without
problems.
Both tables have been indexed. And have the auto_increment and
primary_key of course.
I really cannot understand why it takes so much time when there is so
little data. Or, well, I guess the subquery must be very bad itself I
guess?
MySQL is not very good at handling sub selects, but there are other ways
to skin a cat. If you trust your auto increment column as a time
ordering attribute, the latest blog entry per user can be defined as:

select user_id, max(id) as max_id
from BLOG
group by user_id

The result is in it self a table so you can join this with your base
tables like (untested):

select B.*, S.name
from blog B
join (
select user_id, max(id) as max_id
from BLOG
group by user_id
) X
on B.id = X.max_id
join site_users S
on S.id = X.user_id

Note that a construction like:

select B.*, S.name
from blog B, site_users S
where B.user_id = S.id
group by B.user_id

is invalid sql (in general) and should not be used. You can prevent
mistakes like that by enabling ONLY_FULL_GROUP_BY in @@sql_mode.

/Lennart

Reply With Quote
  #3  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Bad query - 02-16-2011 , 10:37 PM



On 2/16/2011 9:52 AM, pac Man wrote:
Quote:
Hello

I have table BLOG

id, user_id, creation_date, name, content (around 7000 records)

And then I have table SITE_USERS
id, name (around 20000 records)

I want to select the latest blog entries, but one user should not
appear more than once.
Doing:

select
B.*, S.name
from
blog B, site_users S
where
B.user_id = S.id
group by B.user_id
order by creation_date

is not so good. It seems that its not fetching the latest blog entry
from a user (why should it? I dont think I say it in the query, sigh).

So, I tought this query could work:

select
B.*, S.name
from
blog B, site_users S
where
B.user_id = S.id
and
B.id IN (select MAX(id) from blog group by user_id)
order by creation_date

It actually works but takes 20 (!!!!) seconds to execute.

The problem seems to be when joining the two tables. Causa if I remove
from th query the other table (SITE_USERS), then it executes without
problems.
Both tables have been indexed. And have the auto_increment and
primary_key of course.
I really cannot understand why it takes so much time when there is so
little data. Or, well, I guess the subquery must be very bad itself I
guess?
So, what's the definition of your merchants table?

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #4  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Bad query - 02-17-2011 , 07:09 AM



On 2/16/2011 11:37 PM, Jerry Stuckle wrote:
Quote:
On 2/16/2011 9:52 AM, pac Man wrote:
Hello

I have table BLOG

id, user_id, creation_date, name, content (around 7000 records)

And then I have table SITE_USERS
id, name (around 20000 records)

I want to select the latest blog entries, but one user should not
appear more than once.
Doing:

select
B.*, S.name
from
blog B, site_users S
where
B.user_id = S.id
group by B.user_id
order by creation_date

is not so good. It seems that its not fetching the latest blog entry
from a user (why should it? I dont think I say it in the query, sigh).

So, I tought this query could work:

select
B.*, S.name
from
blog B, site_users S
where
B.user_id = S.id
and
B.id IN (select MAX(id) from blog group by user_id)
order by creation_date

It actually works but takes 20 (!!!!) seconds to execute.

The problem seems to be when joining the two tables. Causa if I remove
from th query the other table (SITE_USERS), then it executes without
problems.
Both tables have been indexed. And have the auto_increment and
primary_key of course.
I really cannot understand why it takes so much time when there is so
little data. Or, well, I guess the subquery must be very bad itself I
guess?

So, what's the definition of your merchants table?

Sorry, got my responses mixed up.

What you need is known around here as the "strawberry query". Google
this group for more information on it.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.