dbTalk Databases Forums  

[MySQL] SELECT DISTINCT gives wrong output

comp.databases comp.databases


Discuss [MySQL] SELECT DISTINCT gives wrong output in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
JD
 
Posts: n/a

Default [MySQL] SELECT DISTINCT gives wrong output - 07-10-2006 , 01:31 PM






I have a problem with a MySQL query, that I have been trapped with for 3
days now. I just don't get it, as I think the result is inconsistent.
I am running MySQL 5.0.23 on a FreeBSD 6.1.

I have these original data:

SELECT id_host FROM child ORDER BY ldate DESC LIMIT 20
273
273
127
214
2795
2101
272
77
271
270
15
269
268
267
266
265
264
2794
263
262

When I only want to have a DISTINCT id_host, the result looks like this,
which is very much wrong:

SELECT DISTINCT id_host FROM child ORDER BY ldate DESC LIMIT 20
273
2795
272
271
270
269
268
267
266
265
264
2794
263
262
261
260
259
258
257
256

As you might see, there are several ID's missing above - eg. 127, 214 and
2101 etc. The result SHOULD have looked like this:

273
127
214
2795
2101
272
77
271
270
15
269
268
267
266
265
264
2794
263
262

I would really apreciate any feedback. Perhaps I am doing something wrong
but I cannot see where?



Reply With Quote
  #2  
Old   
Gints Plivna
 
Posts: n/a

Default Re: SELECT DISTINCT gives wrong output - 07-10-2006 , 03:21 PM







JD wrote:
Quote:
I have a problem with a MySQL query, that I have been trapped with for 3
days now. I just don't get it, as I think the result is inconsistent.
I am running MySQL 5.0.23 on a FreeBSD 6.1.

I have these original data:

SELECT id_host FROM child ORDER BY ldate DESC LIMIT 20
273
273
127
214
2795
2101
272
77
271
270
15
269
268
267
266
265
264
2794
263
262

When I only want to have a DISTINCT id_host, the result looks like this,
which is very much wrong:

SELECT DISTINCT id_host FROM child ORDER BY ldate DESC LIMIT 20
273
2795
272
271
270
269
268
267
266
265
264
2794
263
262
261
260
259
258
257
256

As you might see, there are several ID's missing above - eg. 127, 214 and
2101 etc. The result SHOULD have looked like this:

273
127
214
2795
2101
272
77
271
270
15
269
268
267
266
265
264
2794
263
262

I would really apreciate any feedback. Perhaps I am doing something wrong
but I cannot see where?
Oh my GOD! I'm a really wondering that MySQL allows that. At least I
don't know how can this list be ordered if there are two values of
id_host with different ldate? Which one to take???
Try select distinct id_host, ldate FROM child ORDER BY ldate DESC LIMIT
20
Also you should check what is the order of applying ORDER BY, DISTINCT
and LIMIT clauses.

Gints Plivna
http://www.gplivna.eu/



Reply With Quote
  #3  
Old   
JD
 
Posts: n/a

Default Re: SELECT DISTINCT gives wrong output - 07-10-2006 , 04:18 PM



Quote:
I would really apreciate any feedback. Perhaps I am doing something wrong
but I cannot see where?

Oh my GOD! I'm a really wondering that MySQL allows that. At least I
don't know how can this list be ordered if there are two values of
id_host with different ldate? Which one to take???
Try select distinct id_host, ldate FROM child ORDER BY ldate DESC LIMIT
20
The problem is thar it doesn't output all ID's, so I cannot see which ldate
values it has for these. I cannot see any logic in its way to find which
ID's to show?

I cannot see why this should be any difference than eg. PHP, array_unique.
It keeps the first value and skips the rest identical values?

Quote:
Also you should check what is the order of applying ORDER BY, DISTINCT
and LIMIT clauses.
I have tried GROUP BY id_host, but it is the same weird result.

Do you have any other ideas how this might could work?




Reply With Quote
  #4  
Old   
Bill Karwin
 
Posts: n/a

Default Re: [MySQL] SELECT DISTINCT gives wrong output - 07-10-2006 , 05:28 PM



JD wrote:
Quote:
I have a problem with a MySQL query, that I have been trapped with for 3
days now. I just don't get it, as I think the result is inconsistent.
I am running MySQL 5.0.23 on a FreeBSD 6.1.
Your test case seems to work correctly on my machine, running 5.0.21 on
Windows XP.

I notice in the change logs, there's a bug #18068 related to DISTINCT,
fixed in 5.0.23. It's often a good place to start looking, if there was
a change applied to a similar area of functionality recently. It could
be that the fix broke something else.

See:
http://bugs.mysql.com/bug.php?id=18068

Regards,
Bill K.


Reply With Quote
  #5  
Old   
JD
 
Posts: n/a

Default Re: [MySQL] SELECT DISTINCT gives wrong output - 07-10-2006 , 06:11 PM



Quote:
I have a problem with a MySQL query, that I have been trapped with for 3
days now. I just don't get it, as I think the result is inconsistent.
I am running MySQL 5.0.23 on a FreeBSD 6.1.

Your test case seems to work correctly on my machine, running 5.0.21 on
Windows XP.
Thanks for your test. Perhaps I should try a similar test on another
computer/MySQL,

Quote:
I notice in the change logs, there's a bug #18068 related to DISTINCT,
fixed in 5.0.23. It's often a good place to start looking, if there was a
change applied to a similar area of functionality recently. It could be
that the fix broke something else.
http://bugs.mysql.com/bug.php?id=18068
I actually have seen that bug, but as you also mentioned, it should have
been fixed in 5.0.23.




Reply With Quote
  #6  
Old   
Gints Plivna
 
Posts: n/a

Default Re: SELECT DISTINCT gives wrong output - 07-11-2006 , 03:25 AM



JD wrote:
Quote:
Oh my GOD! I'm a really wondering that MySQL allows that. At least I
don't know how can this list be ordered if there are two values of
id_host with different ldate? Which one to take???
Try select distinct id_host, ldate FROM child ORDER BY ldate DESC LIMIT
20

The problem is thar it doesn't output all ID's, so I cannot see which ldate
values it has for these. I cannot see any logic in its way to find which
ID's to show?
Yea I cannot see any logic why allow such statement at all

Quote:
I cannot see why this should be any difference than eg. PHP, array_unique.
It keeps the first value and skips the rest identical values?
Rows in a heap table normally haven't such an attribute as first row or
last row. Normally there is no guarantee in which order they are
processed and the only guarantee to receive them in particular order
would be adding order by clause. you statement generally could give
inconsistent results i.e. with the same data there can be different
results.

Quote:
Do you have any other ideas how this might could work?
BTW you haven't formulated what you really need, but keeping in mind
you said, just take the first ldate, then you can probably use
something like that (I'v used Oracle, but I think syntax could be the
same in MySQL as well):
SQL> create table test (id_host number, ldate date);

Table created.
SQL> insert into test values (1, sysdate);

1 row created.

SQL> insert into test values (1, sysdate - 10);

1 row created.

SQL> insert into test values (2, sysdate - 5);

1 row created.

SQL> insert into test values (2, sysdate - 3);

1 row created.

SQL> select id_host, min(ldate) from test
2 group by id_host
3 order by min(ldate) desc
4 /

ID_HOST MIN(LDATE
---------- ---------
2 06-JUL-06
1 01-JUL-06

SQL> select id_host, max(ldate) from test
2 group by id_host
3 order by max(ldate) desc
4 /

ID_HOST MAX(LDATE
---------- ---------
1 11-JUL-06
2 08-JUL-06

Gints Plivna
http://www.gplivna.eu/



Reply With Quote
  #7  
Old   
JD
 
Posts: n/a

Default Re: SELECT DISTINCT gives wrong output - 07-11-2006 , 04:18 AM



Quote:
Rows in a heap table normally haven't such an attribute as first row or
last row. Normally there is no guarantee in which order they are
processed and the only guarantee to receive them in particular order
would be adding order by clause. you statement generally could give
inconsistent results i.e. with the same data there can be different
results.

BTW you haven't formulated what you really need, but keeping in mind
you said, just take the first ldate, then you can probably use
something like that (I'v used Oracle, but I think syntax could be the
same in MySQL as well):
[CUT]

Quote:
SQL> select id_host, max(ldate) from test
2 group by id_host
3 order by max(ldate) desc
4 /
This one is actually working perfect in my case also - with the "order by
max(ldate) desc".

I still don't get it - what is the difference on these two statements:

1)
select id_host, max(ldate) from test
group by id_host
order by ldate desc

2)
select id_host, ldate from test
group by id_host
order by max(ldate) desc

Number 2) is working - 1) is not!? In case 1 I would have thought that ldate
is sorted first, before GROUP BY and then it would only take the biggest
value of ldate. But some values are not outputtet? When I use 2) it looks
fine!?




Reply With Quote
  #8  
Old   
David Cressey
 
Posts: n/a

Default Re: [MySQL] SELECT DISTINCT gives wrong output - 07-11-2006 , 06:05 AM




"JD" <jonh (AT) doe (DOT) invalid> wrote

Quote:
I have a problem with a MySQL query, that I have been trapped with for 3
days now. I just don't get it, as I think the result is inconsistent.
I am running MySQL 5.0.23 on a FreeBSD 6.1.

I have these original data:

SELECT id_host FROM child ORDER BY ldate DESC LIMIT 20
273
273
127
214
2795
2101
272
77
271
270
15
269
268
267
266
265
264
2794
263
262

When I only want to have a DISTINCT id_host, the result looks like this,
which is very much wrong:

SELECT DISTINCT id_host FROM child ORDER BY ldate DESC LIMIT 20
273
2795
272
271
270
269
268
267
266
265
264
2794
263
262
261
260
259
258
257
256

As you might see, there are several ID's missing above - eg. 127, 214 and
2101 etc. The result SHOULD have looked like this:

273
127
214
2795
2101
272
77
271
270
15
269
268
267
266
265
264
2794
263
262

I would really apreciate any feedback. Perhaps I am doing something wrong
but I cannot see where?


Your presentation of the original data is incomplete. You should have
showed the ldate. That way, we could understand what the query is doing.

In the meantime, try this:

SELECT id_host
max (ldate) as ldate
FROM child
GROUP by id_host
ORDER BY ldate DESC LIMIT 20

I can't try this on MySql. Good luck.




Reply With Quote
  #9  
Old   
JD
 
Posts: n/a

Default Re: [MySQL] SELECT DISTINCT gives wrong output - 07-11-2006 , 06:17 AM



Quote:
Your presentation of the original data is incomplete. You should have
showed the ldate. That way, we could understand what the query is doing.
Well, I will admit that but ldate is just dates ordered by DESC.

Quote:
In the meantime, try this:
SELECT id_host
max (ldate) as ldate
FROM child
GROUP by id_host
ORDER BY ldate DESC LIMIT 20
I have tried that and it didn't work eighter. As written earlier, this works
fine:

SELECT id_host
FROM child
GROUP BY id_host
ORDER BY MAX(ldate) DESC LIMIT 20




Reply With Quote
  #10  
Old   
Gene Wirchenko
 
Posts: n/a

Default Re: SELECT DISTINCT gives wrong output - 07-11-2006 , 01:28 PM



On Tue, 11 Jul 2006 11:18:14 +0200, "JD" <jonh (AT) doe (DOT) xyz> wrote:

[snip]

Quote:
2)
select id_host, ldate from test
group by id_host
order by max(ldate) desc

Number 2) is working - 1) is not!? In case 1 I would have thought that ldate
The second query relies on undefined behaviour. If there is a
group by clause, it must use all non-aggregated columns. For the
second query, this is both id_host AND ldate.

Quote:
is sorted first, before GROUP BY and then it would only take the biggest
value of ldate. But some values are not outputtet? When I use 2) it looks
fine!?
Sincerely,

Gene Wirchenko



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.