![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
I'm coming from Oracle world and trying to find something similar to rownum in Oracle. I know there exists TOP which normally if used in the same select woth order by firstly sorts data and then only gets top n. So the question is what actually happens when top is used in inner query and order by in outer query. The problem is that it seems to be somehow inconsistent at least for the first sight. Using SQL Server 2005 So I have following test case: create table t3 (id integer, data varchar(4000)); insert into t3 values (1, replicate('a', 4000)); insert into t3 values (2, replicate('b', 4000)); insert into t3 values (3, replicate('c', 4000)); insert into t3 values (4, replicate('d', 4000)); insert into t3 values (5, replicate('e', 4000)); insert into t3 values (6, replicate('f', 4000)); insert into t3 values (7, replicate('g', 4000)); insert into t3 values (8, replicate('h', 4000)); insert into t3 values (9, replicate('i', 4000)); SET STATISTICS IO ON firstly just select all rows to know how many logical reads are needed for all table. select * from t3 1 aaa... ... 9 iiii.... logical reads 5 Now get first two rows without any where clause: select top 2 * from t3 1 aaa... 2 bbb... logical reads 1 Now the same first two rows just with outer select without any order by: select * from ( select top 2 * from t3 ) as q 1 aaa... 2 bbb... logical reads 1 OK till now it's as expected, just one logical read get first 2 rows and end query. However look at next query's logical reads 5. This somehow is very interestingly equal to logical reads for select all rows from t3. select * from ( select top 2 * from t3 ) as q order by data asc 1 aaa... 2 bbb... logical reads 5 So the next one shows that order by clause has affected the result set and actually semms to be pushed into inner query. Also logical reads are 5 meaning that actually we have scanned all the table. select * from ( select top 2 * from t3 ) as q order by data desc 9 iii.. 8 hhh... logical reads 5 However for TOP 1 everything works in a different way i.e. there is always the same one row and the same one logical read in spite of diffferent order by clauses: select * from ( select top 1 * from t3 ) as q order by data asc 1 aaa.... logical reads 1 select * from ( select top 1 * from t3 ) as q order by data desc 1 aaa.... logical reads 1 So where is the truth? Why the functionality is different? The business case is that we have search with potentially weak user criteria resulting in BIG potential result sets, but we want to show the user just ANY N rows satisfying criteria. But these N rows should be ordered. So what I'd like to achieve is: 1) get ANY no more than N rows according to my criteria 2) sort these N rows according to my order by clause. I DEFINITELY don't want: 1) get ALL rows 2) sort them and throw away all but first N. TIA, Gints |
#3
| |||
| |||
|
|
I'm coming from Oracle world and trying to find something similar to rownum in Oracle. I know there exists TOP which normally if used in the same select woth order by firstly sorts data and then only gets top n. So the question is what actually happens when top is used in inner query and order by in outer query. The problem is that it seems to be somehow inconsistent at least for the first sight. Using SQL Server 2005 So I have following test case: create table t3 (id integer, data varchar(4000)); insert into t3 values (1, replicate('a', 4000)); insert into t3 values (2, replicate('b', 4000)); insert into t3 values (3, replicate('c', 4000)); insert into t3 values (4, replicate('d', 4000)); insert into t3 values (5, replicate('e', 4000)); insert into t3 values (6, replicate('f', 4000)); insert into t3 values (7, replicate('g', 4000)); insert into t3 values (8, replicate('h', 4000)); insert into t3 values (9, replicate('i', 4000)); SET STATISTICS IO ON firstly just select all rows to know how many logical reads are needed for all table. select * from t3 1 aaa... ... 9 iiii.... logical reads 5 Now get first two rows without any where clause: select top 2 * from t3 1 aaa... 2 bbb... logical reads 1 Now the same first two rows just with outer select without any order by: select * from ( select top 2 * from t3 ) as q 1 aaa... 2 bbb... logical reads 1 OK till now it's as expected, just one logical read get first 2 rows and end query. However look at next query's logical reads 5. This somehow is very interestingly equal to logical reads for select all rows from t3. select * from ( select top 2 * from t3 ) as q order by data asc 1 aaa... 2 bbb... logical reads 5 So the next one shows that order by clause has affected the result set and actually semms to be pushed into inner query. Also logical reads are 5 meaning that actually we have scanned all the table. select * from ( select top 2 * from t3 ) as q order by data desc 9 iii.. 8 hhh... logical reads 5 However for TOP 1 everything works in a different way i.e. there is always the same one row and the same one logical read in spite of diffferent order by clauses: select * from ( select top 1 * from t3 ) as q order by data asc 1 aaa.... logical reads 1 select * from ( select top 1 * from t3 ) as q order by data desc 1 aaa.... logical reads 1 So where is the truth? Why the functionality is different? The business case is that we have search with potentially weak user criteria resulting in BIG potential result sets, but we want to show the user just ANY N rows satisfying criteria. But these N rows should be ordered. So what I'd like to achieve is: 1) get ANY no more than N rows according to my criteria 2) sort these N rows according to my order by clause. I DEFINITELY don't want: 1) get ALL rows 2) sort them and throw away all but first N. TIA, Gints |
#4
| |||
| |||
|
|
Your table is so small, that even the simplest query plan is so cheap that the optimizer seems to consider it useless to search for anything better. |

#5
| |||
| |||
|
|
See also the ROW_NUMBER() function, which is standard SQL and supported by ... SQL Server. |
#6
| |||
| |||
|
|
OK I've also tried it with much bigger table and got the same results as you. Let's hope optimizer will be smart enough to distinguish big work from small work and in case of big work won't do order before top ![]() |
#7
| |||
| |||
|
|
Gints Plivna (gints.pli... (AT) gmail (DOT) com) writes: OK I've also tried it with much bigger table and got the same results as you. Let's hope optimizer will be smart enough to distinguish big work from small work and in case of big work won't do order before top ![]() Since you are on SQL 2005, any reason to not use row_number()? Then you would have code that would run both on SQL Server and on Oracle? |
.
#8
| |||
| |||
|
|
I'm quite sure row_number won't help me in this case, because I'd like to limit found number of rows. row_number actually must have order by clause and ordering before limiting returned number of rows is the thing I'd like to avoid. |
|
And also this code will run only on SQL Server, so no need for "portable SQL" (I'm BTW quite sceptical about such "portable SQLs" generally, because usually it means code will be slow on all databases . |
#9
| |||
| |||
|
|
Gints Plivna (gints.pli... (AT) gmail (DOT) com) writes: I'm quite sure row_number won't help me in this case, because I'd like to limit found number of rows. row_number actually must have order by clause and ordering before limiting returned number of rows is the thing I'd like to avoid. That can be achieved with row_number with some trickery. Consider: * *with numbered AS ( * *select *, rn = row_number() OVER (order by x) * *from * (SELECT *, x = 'x' FROM Orders) as s * *) * *SELECT * FROM numbered WHERE rn < 20 * *ORDER BY CustomerID |

![]() |
| Thread Tools | |
| Display Modes | |
| |