dbTalk Databases Forums  

finding aggregate values in parent - child scheme

comp.databases.ms-sqlserver comp.databases.ms-sqlserver


Discuss finding aggregate values in parent - child scheme in the comp.databases.ms-sqlserver forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
migurus
 
Posts: n/a

Default finding aggregate values in parent - child scheme - 01-19-2010 , 06:23 PM






I have a task of finding records that have biggest values in a given
attributes,
where data is split between 'parent' table that holds my main
categories
and a few 'children' tables that keep temporal data.
To illustrate I decided to use simple scenario of a customer and a
list his purchases.

USE tempdb

CREATE TABLE CUSTOMER (
ID int
, NAME varchar(32)
)
CREATE TABLE PURCHASE (
CID int -- customer id
, PID int -- product id
, AMT int -- amount
, PDATE datetime
, ID int IDENTITY
)
INSERT INTO CUSTOMER SELECT 1, N'BOB'
INSERT INTO CUSTOMER SELECT 2, N'MARY'
INSERT INTO CUSTOMER SELECT 3, N'JERRY'

INSERT INTO PURCHASE SELECT 1, 10, 500, '20091201'
INSERT INTO PURCHASE SELECT 1, 20, 200, '20100101'
INSERT INTO PURCHASE SELECT 1, 10, 200, '20100105'
INSERT INTO PURCHASE SELECT 1, 10, 300, '20100105'

INSERT INTO PURCHASE SELECT 2, 10, 600, '20091215'
INSERT INTO PURCHASE SELECT 2, 20, 200, '20100101'
INSERT INTO PURCHASE SELECT 2, 20, 200, '20100105'

INSERT INTO PURCHASE SELECT 3, 10, 100, '20091225'
INSERT INTO PURCHASE SELECT 3, 10, 120, '20100101'
INSERT INTO PURCHASE SELECT 3, 10, 250, '20100105'


-- I need to identify customers and their biggest
-- purchase of a given product within a given date range
--
SELECT C.ID, C.NAME, P.AMT, P.PDATE
FROM
CUSTOMER C
, PURCHASE P
, (
SELECT CID, MAX(AMT) as MAXAMT
FROM PURCHASE
WHERE PID = 10
AND PDATE BETWEEN '20100101' and '20101231'
GROUP BY CID
) M

WHERE
P.PID = 10
AND P.CID = C.ID
AND P.PDATE BETWEEN '20100101' and '20101231'
AND P.AMT = M.MAXAMT
AND P.CID = M.CID

DROP TABLE CUSTOMER
DROP TABLE PURCHASE


The result shows
ID NAME AMT PDATE
1 BOB 300 2010-01-05 00:00:00.000
3 JERRY 250 2010-01-05 00:00:00.000

I am not sure if this query is optimal. If I understand correctly,
I am making it read data twice: first, when group by is done
into alias M, then when it joins C, P, and M.

Is this concern valid? If yes, what could be changed?

Any suggestions / comments appreciated

Reply With Quote
  #2  
Old   
Plamen Ratchev
 
Posts: n/a

Default Re: finding aggregate values in parent - child scheme - 01-19-2010 , 08:27 PM






The derived table gets expanded in the query plan so it doesn't really read the data twice. Here are two more solutions,
the first one being ANSI compliant and should perform good, assuming you have appropriate indexes:

SELECT C.ID, C.NAME, P.AMT, P.PDATE
FROM CUSTOMER AS C
JOIN (SELECT AMT, PDATE, PID, CID,
ROW_NUMBER() OVER(PARTITION BY CID ORDER BY AMT DESC) AS rk
FROM PURCHASE
WHERE PID = 10
AND PDATE BETWEEN '20100101' and '20101231') AS P
ON P.CID = C.ID
WHERE rk = 1;

SELECT C.ID, C.NAME, A.AMT, A.PDATE
FROM CUSTOMER AS C
CROSS APPLY (SELECT TOP (1) AMT, PDATE, PID, CID
FROM PURCHASE AS P
WHERE P.PID = 10
AND P.CID = C.ID
AND PDATE BETWEEN '20100101' and '20101231'
ORDER BY AMT DESC) AS A;

--
Plamen Ratchev
http://www.SQLStudio.com

Reply With Quote
  #3  
Old   
migurus
 
Posts: n/a

Default Re: finding aggregate values in parent - child scheme - 01-19-2010 , 08:59 PM



On Jan 19, 6:27*pm, Plamen Ratchev <Pla... (AT) SQLStudio (DOT) com> wrote:
Quote:
The derived table gets expanded in the query plan so it doesn't really read the data twice. Here are two more solutions,
the first one being ANSI compliant and should perform good, assuming you have appropriate indexes:

SELECT C.ID, C.NAME, P.AMT, P.PDATE
FROM CUSTOMER AS C
JOIN (SELECT AMT, PDATE, PID, CID,
* * * * * * * ROW_NUMBER() OVER(PARTITION BY CID ORDER BY AMT DESC) AS rk
* * * *FROM PURCHASE
* * * *WHERE PID = 10
* * * * *AND PDATE BETWEEN '20100101' and '20101231') AS P
* *ON P.CID = C.ID
WHERE rk = 1;

SELECT C.ID, C.NAME, A.AMT, A.PDATE
FROM CUSTOMER AS C
CROSS APPLY (SELECT TOP (1) AMT, PDATE, PID, CID
* * * * * * * FROM PURCHASE AS P
* * * * * * * WHERE P.PID = 10
* * * * * * * * AND P.CID = C.ID
* * * * * * * * AND PDATE BETWEEN '20100101' and '20101231'
* * * * * * * ORDER BY AMT DESC) AS A;

--
Plamen Ratchevhttp://www.SQLStudio.com
Thank you very much, your feedback is truly valuable.
I forgot to mention, that some of our clients still run SQL 2000.
Please let me know if there is some tricks applicable to 2000.
Again, your answer is appreciated very mich.

Reply With Quote
  #4  
Old   
Plamen Ratchev
 
Posts: n/a

Default Re: finding aggregate values in parent - child scheme - 01-19-2010 , 09:08 PM



Here is another SQL Server 2000 solution, but the solution you already have may be better:

SELECT C.ID, C.NAME, P.AMT, P.PDATE
FROM CUSTOMER AS C
JOIN PURCHASE AS P
ON C.ID = P.CID
WHERE P.PID = 10
AND P.PDATE BETWEEN '20100101' and '20101231'
AND P.AMT = (SELECT MAX(P2.AMT)
FROM PURCHASE AS P2
WHERE P2.CID = P.CID
AND P2.PDATE BETWEEN '20100101' and '20101231');

--
Plamen Ratchev
http://www.SQLStudio.com

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.