![]() | |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
I'm hoping that Axel can answer this one for me. I have a need to add row numbers to the result set of a query, to give ranking within a particular sort order. A sample query (in practice there are a lot more columns extracted and calculated and which may be used as a sort column) is: SELECT @downloads_per_month := total_downloads/2 downloads_per_month, FORMAT(@downloads_per_month,GREATEST(PERIOD_DIFF(E XTRACT(YEAR_MONTH FROM NOW()),EXTRACT(YEAR_MONTH FROM member_since)),1)) downloads_per_month_display, @start := @start + 1 position FROM contributors2 JOIN (SELECT @start:=0) as start ORDER BY downloads_per_month LIMIT 20 Now the above query correctly numbers the rows from 1 to 20. However, if instead I sort on downloads_per_month_display, the row numbering goes haywire. An EXPLAIN shows that the "good" query uses filesort, whereas the "bad" one uses temporary and then filesort. Now what I don't get is why the "good" query actually works. Since the value of downloads_per_month isn't known until it is extracted, the query cannot use an index to pull the rows in the correct order. I would have expected that MySQL would pull the data for a row, calculate all the calculated columns (including the position one) and then once it had them all, sort them and take the first 20. This should mean that, unless I nested the query and did the numbering in the outer query, the row numbers would always be wrong. So how does it get the row numbers correct in this case? |
#3
| |||
| |||
|
|
Captain Paralytic <paul_lautman (AT) yahoo (DOT) com> wrote: SELECT @downloads_per_month := total_downloads/2 downloads_per_month, FORMAT(@downloads_per_month,GREATEST(PERIOD_DIFF(E XTRACT(YEAR_MONTH FROM NOW()),EXTRACT(YEAR_MONTH FROM member_since)),1)) downloads_per_month_display, @start := @start + 1 position FROM contributors2 JOIN (SELECT @start:=0) as start ORDER BY downloads_per_month LIMIT 20 Now the above query correctly numbers the rows from 1 to 20. However, if instead I sort on downloads_per_month_display, the row numbering goes haywire. I'm not Axel |

|
but my guess would be as follows: |
#4
| |||
| |||
|
|
tony (AT) mountifield (DOT) org (Tony Mountifield) wrote: Captain Paralytic <paul_lautman (AT) yahoo (DOT) com> wrote: SELECT @downloads_per_month := total_downloads/2 downloads_per_month, FORMAT(@downloads_per_month,GREATEST(PERIOD_DIFF(E XTRACT(YEAR_MONTH FROM NOW()),EXTRACT(YEAR_MONTH FROM member_since)),1)) downloads_per_month_display, @start := @start + 1 position FROM contributors2 JOIN (SELECT @start:=0) as start ORDER BY downloads_per_month LIMIT 20 Now the above query correctly numbers the rows from 1 to 20. However, if instead I sort on downloads_per_month_display, the row numbering goes haywire. I'm not Axel But I am! ![]() but my guess would be as follows: No need to guess. Quoting from http://dev.mysql.com/doc/refman/5.1/...variables.html "As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed." So indeed Paul relies on undefined behavior. Also "In a SELECT statement, each select expression is evaluated only when sent to the client. This means that in a HAVING, GROUP BY, or ORDER BY clause, referring to a variable that is assigned a value in the select expression list does not work as expected" This second statement has to be taken with a pinch of salt, because expressions are also evaluated if the result goes to a temporary table. But we can learn that expressions mith be evaluated late in query execution, after filtering and maybe even sorting rows. So if a query requires a temporary table but one still wants to have line numbers, one should use a "derived table" subquery like so: SELECT @start:=@start+1 AS position, inner.* FROM (/* the original query, including SORT, LIMIT */) AS inner JOIN (SELECT @start:=0) as start BTW, that bogus JOIN to initialize @start is a nice trick. Haven't seen that before! Of course this also relies on undefined behavior - the JOIN order. Better use STRAIGHT_JOIN and put the initializer part first. XL |
![]() |
| Thread Tools | |
| Display Modes | |
| |