dbTalk Databases Forums  

cumulative sum with threshold

comp.databases.ibm-db2 comp.databases.ibm-db2


Discuss cumulative sum with threshold in the comp.databases.ibm-db2 forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Lennart Jonsson
 
Posts: n/a

Default cumulative sum with threshold - 10-31-2011 , 03:14 PM






Simplified scenario:

with T (prio, status, credit, limit) as (
values (1,0,10,15), (2,1,2,15), (3,0,10,15),
(4,0,1,15), (5,1,10,15)
)
select prio, status, credit, limit,
sum( case when status = 0 then credit else 0 end )
over ( order by prio ) as acc
from T

PRIO STATUS CREDIT LIMIT ACC
----------- ----------- ----------- ----------- -----------
1 0 10 15 10
2 1 2 15 10
3 0 10 15 20
4 0 1 15 21
5 1 10 15 21


What I would like to calculate is

PRIO ACC2
1 10
2 10
3 10 -- because ACC + CREDIT > LIMIT
4 11
5 11

I.e. I would like to calculate the cumulative sum of credit (where
status = 0) as long as it does not exceed limit, in that case it
contributes with 0 and the cumulative sum continues from there. Any
suggestions?


/Lennart

Reply With Quote
  #2  
Old   
Tonkuma
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-02-2011 , 03:44 AM






An easy way may be using recursive query.

Reply With Quote
  #3  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-02-2011 , 04:59 AM



On 2011-11-02 10:44, Tonkuma wrote:
Quote:
An easy way may be using recursive query.
Yes, but the problem is that the query will be part of a merge
statement, and if I recall correctly that is not allowed in a merge.

In some sence this is a variation of the thread "Help in writing an SQL
query to allocate parts to purchase orders" with an additional
constraint that we are not allowed to divide an order. No matter how I
try I seem to end up with a nasty self-reference to the cumulative sum.

I imagine that the problem stated can't be that unusal, say that we have
a number of gods that we should stuff in a container, the gods have to
be picked in a certain order and the container must not exceed a certain
weight. I figure that an extension to the
window-aggregation-group-clause could help with such queries.
Hypothetical example:

with T (prio, status, credit, limit) as (
values (1,0,10,15), (2,1,2,15), (3,0,10,15),
(4,0,1,15), (5,1,10,15)
)
select prio, status, credit, limit,
sum( case when status = 0 then credit else 0 end )
over ( order by prio RESULT BETWEEN 0 AND limit ) as acc
from T



/Lelle

Reply With Quote
  #4  
Old   
Tonkuma
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-02-2011 , 06:20 AM



I tried to make the query without using recursive query.
I made it partly(too complex and there are some repeating subqueries),
but also used (non-recursive) common-table-expressions.
So, it must not be a solution for your problem.

My idea is to use data-change-table-references in common-table-
expressions, insted of using merge statement.

Please see "Example 1" in this thread...
dBforums > Database Server Software > DB2 > Update query with
conditional joins?
http://www.dbforums.com/db2/1671168-...nal-joins.html

Reply With Quote
  #5  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-02-2011 , 07:19 AM



On 2011-11-02 13:20, Tonkuma wrote:
Quote:
I tried to make the query without using recursive query.
I made it partly(too complex and there are some repeating subqueries),
but also used (non-recursive) common-table-expressions.
So, it must not be a solution for your problem.

My idea is to use data-change-table-references in common-table-
expressions, insted of using merge statement.

Please see "Example 1" in this thread...
dBforums > Database Server Software > DB2 > Update query with
conditional joins?
http://www.dbforums.com/db2/1671168-...nal-joins.html

Thanks Tonkuma, I'll try to rephrase the problem using
data-change-table-references. Given the constraints I left out it
becomes quite challenging, but hey that's the fun part of it ;-)



/Lennart

Reply With Quote
  #6  
Old   
Tonkuma
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-05-2011 , 08:09 PM



I tried to write the query without using common-table-
expression(including recursive cte).
Here is the example.

Apart from its lengthy and complex code,
the most problem(I thought) was repeating series of nested-table-
expressions, like...
g1 ---> g ---> g2 g ---> g3 ---> g --->g4 ---> g
(g0 is necessary)
I used 4 level of nesting in the example,
but the nesting level must be incresed depending on the number of
rows in the table.

Note: A common-table-expression T was used to make test data.
As you know, it can be replaced by a CREATE TABLE statement and an
INSERT statement.

Example 1:
------------------------- Commands Entered
------------------------------
WITH
T (prio, status, credit, limit) AS (
VALUES
( 1 , 0, 10 , 15 ) -- 10 1 - 1
, ( 2 , 1, 2 , 15 ) --
, ( 3 , 0, 10 , 15 ) -- 10
, ( 4 , 0, 1 , 15 ) -- 11 3 - 4
, ( 5 , 1, 10 , 15 ) --

, ( 6 , 0, 8 , 15 ) -- 5
, ( 7 , 0, 5 , 15 ) -- 13
, ( 8 , 0, 2 , 15 ) -- 15 6 - 8
, ( 9 , 3, 10 , 15 ) --
, ( 10 , 0, 2 , 15 ) -- 2
, ( 11 , 0, 3 , 15 ) -- 5
, ( 12 , 0, 4 , 15 ) -- 9
, ( 13 , 0, 5 , 15 ) -- 14 10 - 13
, ( 14 , 0, 6 , 15 ) -- 6 14 - 14
, ( 15 , 0, 10 , 15 ) -- 19 15 - 15
, ( 16 , 2, 7 , 15 ) --
)
SELECT t.*
, SUM( CASE t.status WHEN 0 THEN t.credit END )
OVER( PARTITION BY low_prio
ORDER BY t.prio ) AS acc2
FROM t
LEFT OUTER JOIN
(SELECT low_prio
, LEAD( low_prio , 1 , 0 )
OVER(ORDER BY low_prio) - 1 AS high_prio
/* level 4 */
FROM (
SELECT low_prio , high_prio
, low_rnum , high_rnum
FROM (-- g3
SELECT g.*
, '*'
Quote:
| LISTAGG( CHAR(high_rnum) || '*' )
OVER(ORDER BY low_rnum) AS
high_rnum_list
/* level 3 */
FROM (
SELECT low_prio , high_prio
, low_rnum , high_rnum
FROM (-- g3
SELECT g.*
, '*'
Quote:
| LISTAGG( CHAR(high_rnum) || '*' )
OVER(ORDER BY low_rnum) AS
high_rnum_list
/* level 2 */
FROM (
SELECT low_prio , high_prio
, low_rnum , high_rnum
FROM (-- g2
SELECT g.*
, '*' || LISTAGG( CHAR(high_rnum) ||
'*' )
OVER(ORDER BY low_rnum)
high_rnum_list
/* level 1 */
FROM (
SELECT low_prio , high_prio
, low_rnum , high_rnum
FROM (-- g1
SELECT g.*
, '*' || LISTAGG( CHAR(high_rnum) ||
'*' )
OVER(ORDER BY low_rnum)
high_rnum_list
/* level 0 */
FROM (
SELECT low.prio AS
low_prio
, MAX( NULLIF(acc.prio , high.prio) ) AS
high_prio
, MIN(low.rnum) AS
low_rnum
, MAX( high.rnum ) - 1 AS
high_rnum
FROM (SELECT t.*
, INT( ROW_NUMBER() OVER(ORDER BY prio) )
AS rnum
FROM T
WHERE status = 0
) AS low
INNER JOIN
(SELECT COALESCE( prio
, (SELECT MAX(prio) + 1
FROM T)
) AS
prio
, MAX(limit) AS
limit
, INT( ROW_NUMBER()
OVER(ORDER BY prio) ) AS
rnum
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio) )
) AS high
ON high.prio >= low.prio
INNER JOIN
(SELECT COALESCE( prio
, (SELECT MAX(prio) + 1
FROM T)
) AS
prio
, COALESCE( credit , MAX(limit) + 1 ) AS
credit
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio , credit) )
) AS acc
ON acc.prio BETWEEN low.prio AND high.prio
GROUP BY
low.prio , high.prio , high.limit
HAVING
SUM( acc.credit ) > high.limit
AND SUM( acc.credit
* SIGN(high.prio - acc.prio)
) <= high.limit
) g
/* End of level 0 */
/* level 1 */
) g1
WHERE low_rnum = 1
OR LOCATE( '*' || CHAR(low_rnum - 1) || '*' ,
high_rnum_list ) > 0
) g
/* End of level 1 */
/* level 2 */
) g2
WHERE low_rnum = 1
OR LOCATE( '*' || CHAR(low_rnum - 1) || '*' ,
high_rnum_list ) > 0
) g
/* End of level 2 */
/* level 3 */
) g3
WHERE low_rnum = 1
OR LOCATE( '*' || CHAR(low_rnum - 1) || '*' ,
high_rnum_list ) > 0
) g
/* End of level 3 */
/* level 4 */
) g4
WHERE low_rnum = 1
OR LOCATE( '*' || CHAR(low_rnum - 1) || '*' ,
high_rnum_list ) > 0
) g
/* End of level 4 */
) g
ON t.prio BETWEEN g.low_prio AND g.high_prio
ORDER BY
prio
;
------------------------------------------------------------------------------

PRIO STATUS CREDIT LIMIT ACC2
----------- ----------- ----------- ----------- -----------
1 0 10 15 10
2 1 2 15 10
3 0 10 15 10
4 0 1 15 11
5 1 10 15 11
6 0 8 15 8
7 0 5 15 13
8 0 2 15 15
9 3 10 15 15
10 0 2 15 2
11 0 3 15 5
12 0 4 15 9
13 0 5 15 14
14 0 6 15 6
15 0 10 15 10
16 2 7 15 10

16 record(s) selected.

Reply With Quote
  #7  
Old   
Tonkuma
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-05-2011 , 08:23 PM



I tried to write the query without using common-table-
expression(including recursive cte).
Here is the example.

Apart from its lengthy and complex code,
the most problem(I thought) was repeating series of nested-table-
expressions,
like...
g1 ---> g ---> g2 ---> g3 ---> g --->g4 ---> g
(g0 is necessary)
I used 4 level of nesting in the example,
but the nesting level must be incresed depending on the number of rows
in the table.

Example 1:
------------------------- Commands Entered
------------------------------
WITH
T (prio, status, credit, limit) AS (
VALUES
( 1 , 0, 10 , 15 ) -- 10 1 - 1
, ( 2 , 1, 2 , 15 ) --
, ( 3 , 0, 10 , 15 ) -- 10
, ( 4 , 0, 1 , 15 ) -- 11 3 - 4
, ( 5 , 1, 10 , 15 ) --

, ( 6 , 0, 8 , 15 ) -- 5
, ( 7 , 0, 5 , 15 ) -- 13
, ( 8 , 0, 2 , 15 ) -- 15 6 - 8
, ( 9 , 3, 10 , 15 ) --
, ( 10 , 0, 2 , 15 ) -- 2
, ( 11 , 0, 3 , 15 ) -- 5
, ( 12 , 0, 4 , 15 ) -- 9
, ( 13 , 0, 5 , 15 ) -- 14 10 - 13
, ( 14 , 0, 6 , 15 ) -- 6 14 - 14
, ( 15 , 0, 10 , 15 ) -- 19 15 - 15
, ( 16 , 2, 7 , 15 ) --
)
SELECT t.*
, SUM( CASE t.status WHEN 0 THEN t.credit END )
OVER( PARTITION BY low_prio
ORDER BY t.prio ) AS acc2
FROM t
LEFT OUTER JOIN
(SELECT low_prio
, LEAD( low_prio , 1 , 0 )
OVER(ORDER BY low_prio) - 1 AS high_prio
/* level 4 */
FROM (
SELECT low_prio , high_prio
, low_rnum , high_rnum
FROM (-- g3
SELECT g.*
, '*' ||
LISTAGG( CHAR(high_rnum) || '*' )
OVER(ORDER BY low_rnum) AS
high_rnum_list
/* level 3 */
FROM (
SELECT low_prio , high_prio
, low_rnum , high_rnum
FROM (-- g3
SELECT g.*
, '*' ||
LISTAGG( CHAR(high_rnum) || '*' )
OVER(ORDER BY low_rnum) AS
high_rnum_list
/* level 2 */
FROM (
SELECT low_prio , high_prio
, low_rnum , high_rnum
FROM (-- g2
SELECT g.*
, '*' ||
LISTAGG( CHAR(high_rnum) || '*' )
OVER(ORDER BY low_rnum) AS
high_rnum_list
/* level 1 */
FROM (
SELECT low_prio , high_prio
, low_rnum , high_rnum
FROM (-- g1
SELECT g.*
, '*' ||
LISTAGG( CHAR(high_rnum) || '*' )
OVER(ORDER BY low_rnum) AS
high_rnum_list
/* level 0 */
FROM (
SELECT low.prio AS low_prio
, MAX( NULLIF( acc.prio , high.prio)
) AS high_prio
, MIN(low.rnum) AS low_rnum
, MAX( high.rnum ) - 1 AS high_rnum
FROM (SELECT t.*
, INT( ROW_NUMBER()
OVER(ORDER BY prio) ) AS rnum
FROM T
WHERE status = 0
) AS low
INNER JOIN
(SELECT COALESCE( prio
, (SELECT MAX(prio) + 1
FROM T)
) AS prio
, MAX(limit) AS limit
, INT( ROW_NUMBER()
OVER(ORDER BY prio) ) AS rnum
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio) )
) AS high
ON high.prio >= low.prio
INNER JOIN
(SELECT COALESCE( prio
, (SELECT MAX(prio)
FROM T) + 1
) AS prio
, COALESCE( credit
, MAX(limit) + 1
) AS credit
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio , credit) )
) AS acc
ON acc.prio BETWEEN low.prio AND high.prio
GROUP BY
low.prio , high.prio , high.limit
HAVING
SUM( acc.credit ) > high.limit
AND SUM( acc.credit
* SIGN(high.prio - acc.prio)
) <= high.limit
) g
/* End of level 0 */
/* level 1 */
) g1
WHERE low_rnum = 1
OR LOCATE( '*' || CHAR(low_rnum - 1) || '*' ,
high_rnum_list ) > 0
) g
/* End of level 1 */
/* level 2 */
) g2
WHERE low_rnum = 1
OR LOCATE( '*' || CHAR(low_rnum - 1) || '*' ,
high_rnum_list ) > 0
) g
/* End of level 2 */
/* level 3 */
) g3
WHERE low_rnum = 1
OR LOCATE( '*' || CHAR(low_rnum - 1) || '*' ,
high_rnum_list ) > 0
) g
/* End of level 3 */
/* level 4 */
) g4
WHERE low_rnum = 1
OR LOCATE( '*' || CHAR(low_rnum - 1) || '*' ,
high_rnum_list ) > 0
) g
/* End of level 4 */
) g
ON t.prio BETWEEN g.low_prio AND g.high_prio
ORDER BY
prio
;
------------------------------------------------------------------------------

PRIO STATUS CREDIT LIMIT ACC2
----------- ----------- ----------- ----------- -----------
1 0 10 15 10
2 1 2 15 10
3 0 10 15 10
4 0 1 15 11
5 1 10 15 11
6 0 8 15 8
7 0 5 15 13
8 0 2 15 15
9 3 10 15 15
10 0 2 15 2
11 0 3 15 5
12 0 4 15 9
13 0 5 15 14
14 0 6 15 6
15 0 10 15 10
16 2 7 15 10

16 record(s) selected.

Reply With Quote
  #8  
Old   
Tonkuma
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-05-2011 , 09:36 PM



Quote:
the most problem(I thought) was repeating series of nested-table-
expressions,
*like...
g1 ---> g ---> g2 ---> g3 ---> g --->g4 ---> g
(g0 is necessary)
Correction:
g1 ---> g ---> g2 ---> g ---> g3 ---> g --->g4 ---> g

Reply With Quote
  #9  
Old   
Tonkuma
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-06-2011 , 01:27 PM



I tried to remove repeating series of nested-table-expressions in
Example 1, by using an UDF.
The repetition was replaced by a recursion in the UDF.

Example 2:
2-a: UDF
------------------------------ Commands Entered
------------------------------
CREATE FUNCTION find_sequence( in_list VARCHAR(4000) )
RETURNS VARCHAR(4000)
DETERMINISTIC
NO EXTERNAL ACTION
RETURN
WITH
r ( p , k , max_p ) AS (
VALUES ( 1 , 0 , LENGTH(in_list) / 11 )
UNION ALL
SELECT SUBSTR(in_list , p * 11 - 10 , 11)
, k + 1
, max_p
FROM r
WHERE p < max_p
AND k < 1000
)
SELECT LISTAGG( CHAR(p) )
FROM r
WHERE p < max_p
;
------------------------------------------------------------------------------
DB20000I The SQL command completed successfully.


2-2: Query
------------------------------ Commands Entered
------------------------------
WITH T (prio, status, credit, limit) AS (
VALUES
( 1 , 0, 10 , 15 ) -- 10 1 - 1
, ( 2 , 1, 2 , 15 ) --
, ( 3 , 0, 10 , 15 ) -- 10
, ( 4 , 0, 1 , 15 ) -- 11 3 - 4
, ( 5 , 1, 10 , 15 ) --
, ( 6 , 0, 8 , 15 ) -- 5
, ( 7 , 0, 5 , 15 ) -- 13
, ( 8 , 0, 2 , 15 ) -- 15 6 - 8
, ( 9 , 1, 10 , 15 ) --
, ( 10 , 0, 2 , 15 ) -- 2
, ( 11 , 0, 3 , 15 ) -- 5
, ( 12 , 0, 4 , 15 ) -- 9
, ( 13 , 0, 5 , 15 ) -- 14 10 - 13
, ( 14 , 0, 6 , 15 ) -- 6 14 - 14
, ( 15 , 0, 10 , 15 ) -- 19 15 - 15
, ( 16 , 2, 7 , 15 ) --
)
SELECT t.*
, SUM( CASE status WHEN 0 THEN credit END )
OVER( PARTITION BY low_prio
ORDER BY prio ) AS acc2
FROM t
LEFT OUTER JOIN
(SELECT low_prio
, LEAD(low_prio)
OVER(ORDER BY low_prio) - 1 AS high_prio
/* valid_group: Part 1 */
FROM (SELECT low_prio
FROM (SELECT low_prio
, low_rnum
, LISTAGG( CHAR(high_rnum + 1) )
OVER( ORDER BY low_rnum
ROWS BETWEEN UNBOUNDED
PRECEDING
AND UNBOUNDED
FOLLOWING
) AS high_rnum_list
/* find_group */
FROM (SELECT low.prio AS low_prio
, MAX( NULLIF( acc.prio
, high.prio ) ) AS high_prio
, MIN( low.rnum) AS low_rnum
, MAX(high.rnum) - 1 AS high_rnum
FROM (SELECT COALESCE( prio
, (SELECT MAX(prio)
FROM T) + 1
) AS prio
, INT( ROW_NUMBER()
OVER(ORDER BY prio) ) AS rnum
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio) )
) AS low
INNER JOIN
(SELECT COALESCE( prio
, (SELECT MAX(prio)
FROM T) + 1
) AS prio
, MAX(limit) AS limit
, INT( ROW_NUMBER()
OVER(ORDER BY prio) ) AS rnum
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio) )
) AS high
ON high.prio >= low.prio
INNER JOIN
(SELECT COALESCE( prio
, (SELECT MAX(prio)
FROM T) + 1
) AS prio
, COALESCE( credit
, MAX(limit) + 1
) AS credit
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio , credit) )
) AS acc
ON acc.prio BETWEEN low.prio
AND high.prio
GROUP BY
low.prio , high.prio , high.limit
HAVING
SUM( acc.credit ) > high.limit
AND SUM( acc.credit
* SIGN(high.prio - acc.prio)
) <= high.limit
)
/* End of find_group */
/* valid_group: Part 2 */
)
WHERE LOCATE( CHAR(low_rnum)
, find_sequence(high_rnum_list) ) > 0
)
/* End of valid_group */
) g
ON prio BETWEEN low_prio AND high_prio
ORDER BY
prio
;
------------------------------------------------------------------------------

PRIO STATUS CREDIT LIMIT ACC2
----------- ----------- ----------- ----------- -----------
1 0 10 15 10
2 1 2 15 10
3 0 10 15 10
4 0 1 15 11
5 1 10 15 11
6 0 8 15 8
7 0 5 15 13
8 0 2 15 15
9 1 10 15 15
10 0 2 15 2
11 0 3 15 5
12 0 4 15 9
13 0 5 15 14
14 0 6 15 6
15 0 10 15 10
16 2 7 15 10

16 record(s) selected.

Reply With Quote
  #10  
Old   
Tonkuma
 
Posts: n/a

Default Re: cumulative sum with threshold - 11-06-2011 , 06:48 PM



Replace three scalar-subselects by a subquery "f".

Example 2-2a:
....
/* find_group */
FROM (SELECT low.prio AS
low_prio
, MAX( NULLIF(acc.prio , high.prio) ) AS
high_prio
, MIN( low.rnum) AS
low_rnum
, MAX(high.rnum) - 1 AS
high_rnum
FROM (SELECT MAX(prio) + 1 AS max_prio
, MAX(limit) + 1 AS max_limit
FROM T
) AS f
CROSS JOIN
LATERAL
(SELECT COALESCE(prio , max_prio) AS prio
, INT( ROW_NUMBER()
OVER(ORDER BY prio) ) AS rnum
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio) )
) AS low
INNER JOIN
LATERAL
(SELECT COALESCE(prio , max_prio) AS prio
, MAX(limit) AS limit
, INT( ROW_NUMBER()
OVER(ORDER BY prio) ) AS rnum
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio) )
) AS high
ON high.prio >= low.prio
INNER JOIN
LATERAL
(SELECT COALESCE(prio , max_prio) AS prio
, COALESCE(credit , max_limit) AS credit
FROM T
WHERE status = 0
GROUP BY
ROLLUP ( (prio , credit) )
) AS acc
ON acc.prio BETWEEN low.prio AND high.prio
GROUP BY
low.prio , high.prio , high.limit
HAVING
SUM( acc.credit ) > high.limit
AND SUM( acc.credit
* SIGN(high.prio - acc.prio)
) <= high.limit
)
/* End of find_group */
....

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.