dbTalk Databases Forums  

Another CTE Question

comp.databases.ibm-db2 comp.databases.ibm-db2


Discuss Another CTE Question in the comp.databases.ibm-db2 forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Michel Esber
 
Posts: n/a

Default Another CTE Question - 08-26-2010 , 09:46 AM






Hi Guys,

DB2 LUW v9.5.

Can anyone shred some light here?

Sample DDL Code:

create table DEPARTMENT
(
DEP_ID integer not null primary key,
NAME VARCHAR(32) not null,
PARENT integer not null
) ;

insert into DEPARTMENT VALUES (1,'A',-2),(2,'B',1),(3,'C',2),
(4,'D',-2),(5,'E',4),(6,'F',5),(7,'G',6);

create table PERSON
(
PERSON_ID integer not null primary key,
NAME VARCHAR(128) not null,
DEP_ID integer,
constraint FK_01 foreign key (DEP_ID) references DEPARTMENT(DEP_ID)
);

insert into PERSON values (1,'John',2),(2,'Mary',2),(3,'Suzan',3),
(4,'Joe',3),(5,'Lewis',3),(6,'Elvis',4),(7,'Presle y',5),(8,'Preston',
5),(9,'Pee',7),(10,'Wee',7),(11,'Steve-o',7),(12,'Demi Moore',7);

with t_recurs (dep_id, parent, iter) as

(
select dep_id, parent, 0 from DEPARTMENT

union all

select r.dep_id, d.parent, iter + 1
from t_recurs r, DEPARTMENT d
where r.parent = d.dep_id and iter < 1000
)

select R.DEP_ID, COUNT(P.PERSON_ID)
from T_RECURS R inner join DEPARTMENT D on (R.DEP_ID, R.PARENT) =
(D.DEP_ID, D.PARENT)
left outer join PERSON P on P.DEP_ID = D.DEP_ID
group by R.DEP_ID
;


The output looks good:

DEP_ID DEP_COUNT
----------- -----------
1 0
2 2
3 3
4 1
5 2
6 0
7 4

What am I actually looking for is also the count of Persons that are
in all sub-departments in that hierarchy, including the departement a
person is currently in. Somehow I can't get this to work.

Expected output:

DEP_ID DEP_COUNT HIERARCHY_COUNT
----------- ----------- ---------------
1 0 5
2 2 5
3 3 3
4 1 7
5 2 6
6 0 6
7 4 4

Any ideas?

Thanks

Reply With Quote
  #2  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: Another CTE Question - 08-26-2010 , 12:09 PM






On 2010-08-26 16:46, Michel Esber wrote:
Quote:
Hi Guys,

DB2 LUW v9.5.

Can anyone shred some light here?

Sample DDL Code:

create table DEPARTMENT
(
DEP_ID integer not null primary key,
NAME VARCHAR(32) not null,
PARENT integer not null
) ;

insert into DEPARTMENT VALUES (1,'A',-2),(2,'B',1),(3,'C',2),
(4,'D',-2),(5,'E',4),(6,'F',5),(7,'G',6);

create table PERSON
(
PERSON_ID integer not null primary key,
NAME VARCHAR(128) not null,
DEP_ID integer,
constraint FK_01 foreign key (DEP_ID) references DEPARTMENT(DEP_ID)
);

insert into PERSON values (1,'John',2),(2,'Mary',2),(3,'Suzan',3),
(4,'Joe',3),(5,'Lewis',3),(6,'Elvis',4),(7,'Presle y',5),(8,'Preston',
5),(9,'Pee',7),(10,'Wee',7),(11,'Steve-o',7),(12,'Demi Moore',7);

with t_recurs (dep_id, parent, iter) as

(
select dep_id, parent, 0 from DEPARTMENT

union all

select r.dep_id, d.parent, iter + 1
from t_recurs r, DEPARTMENT d
where r.parent = d.dep_id and iter < 1000
)

select R.DEP_ID, COUNT(P.PERSON_ID)
from T_RECURS R inner join DEPARTMENT D on (R.DEP_ID, R.PARENT) =
(D.DEP_ID, D.PARENT)
left outer join PERSON P on P.DEP_ID = D.DEP_ID
group by R.DEP_ID
;


The output looks good:

DEP_ID DEP_COUNT
----------- -----------
1 0
2 2
3 3
4 1
5 2
6 0
7 4

What am I actually looking for is also the count of Persons that are
in all sub-departments in that hierarchy, including the departement a
person is currently in. Somehow I can't get this to work.

Expected output:

DEP_ID DEP_COUNT HIERARCHY_COUNT
----------- ----------- ---------------
1 0 5
2 2 5
3 3 3
4 1 7
5 2 6
6 0 6
7 4 4

Any ideas?

Here's a start ( I assumed parent -2 means that it is a root):

with tc (dep_id, ancestor_id, iter) as (
select dep_id, dep_id, 0 from department
union all
select tc.dep_id, d.parent, iter + 1
from tc, department d
where iter < 1000
and tc.ancestor_id = d.dep_id
and d.parent >= 0
)
select dep_id, ancestor_id from tc order by dep_id

Reply With Quote
  #3  
Old   
Michel Esber
 
Posts: n/a

Default Re: Another CTE Question - 08-26-2010 , 12:58 PM



Lennart, thanks a lot. I was able to create exactly what I want.

Since I had to reference my PERSON table twice (see below), I was
wondering if there are better ways to write this query. Performance of
this query is critical on production servers.

Here's the DDL:

with tc (dep_id, ancestor_id, iter) as (
select dep_id, dep_id, 0 from department
union all
select tc.dep_id, d.parent, iter + 1
from tc, department d
where iter < 1000
and tc.ancestor_id = d.dep_id
and d.parent >= 0
)


select HIERARCHY.ANCESTOR_ID, LEVEL.TOTAL, HIERARCHY.TOTAL from

(
select ancestor_id, count(NAME) as TOTAL
from tc t left outer join person p on (t.DEP_ID = p.DEP_ID)
group by ancestor_id
) as HIERARCHY

left outer join

(
select T.ancestor_id, count(NAME) as TOTAL
from (select distinct ANCESTOR_ID from tc ) t inner join person p on
(t.ANCESTOR_ID = p.DEP_ID)
group by T.ancestor_id
) as LEVEL

on (HIERARCHY.ANCESTOR_ID = LEVEL.ANCESTOR_ID)

order by HIERARCHY.ANCESTOR_ID
;

Reply With Quote
  #4  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: Another CTE Question - 08-26-2010 , 01:18 PM



On 2010-08-26 19:58, Michel Esber wrote:
Quote:
Lennart, thanks a lot. I was able to create exactly what I want.

Since I had to reference my PERSON table twice (see below), I was
wondering if there are better ways to write this query. Performance of
this query is critical on production servers.
I don't think that we can avoid the double reference of person.

If performance is critical an alternative approach to using cte's is to
materialize the transitive closure in a separate table and maintain that
table when the tree changes. I have some notes on such implementation
for mysql, but it is relatively easy to port to for example db2. Here's
a link in case you are interested:

http://www.dustbite.se/tree/

/Lennart

[...]

Reply With Quote
  #5  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: Another CTE Question - 08-26-2010 , 02:33 PM



On 2010-08-26 19:58, Michel Esber wrote:
Quote:
Lennart, thanks a lot. I was able to create exactly what I want.

Since I had to reference my PERSON table twice (see below), I was
wondering if there are better ways to write this query. Performance of
this query is critical on production servers.

Here's the DDL:

with tc (dep_id, ancestor_id, iter) as (
select dep_id, dep_id, 0 from department
union all
select tc.dep_id, d.parent, iter + 1
from tc, department d
where iter < 1000
and tc.ancestor_id = d.dep_id
and d.parent >= 0
)


select HIERARCHY.ANCESTOR_ID, LEVEL.TOTAL, HIERARCHY.TOTAL from

(
select ancestor_id, count(NAME) as TOTAL
from tc t left outer join person p on (t.DEP_ID = p.DEP_ID)
group by ancestor_id
) as HIERARCHY

left outer join

(
select T.ancestor_id, count(NAME) as TOTAL
from (select distinct ANCESTOR_ID from tc ) t inner join person p on
(t.ANCESTOR_ID = p.DEP_ID)
group by T.ancestor_id
) as LEVEL

on (HIERARCHY.ANCESTOR_ID = LEVEL.ANCESTOR_ID)

order by HIERARCHY.ANCESTOR_ID
;
Here's another variant:

db2 "with tc (dep_id, ancestor_id, iter) as (
select dep_id, dep_id, 0 from department
union all
select tc.dep_id, d.parent, iter + 1
from tc, department d
where iter < 1000
and tc.ancestor_id = d.dep_id
and d.parent >= 0
)
select tc.ancestor_id,
(select count(1) from person
where dep_id = tc.ancestor_id),
count(p.name)
from tc
join person p
on tc.dep_id = p.dep_id
group by tc.ancestor_id
order by tc.ancestor_id"

ANCESTOR_ID 2 3
----------- ----------- -----------
1 0 5
2 2 5
3 3 3
4 1 7
5 2 6
6 0 4
7 4 4

7 record(s) selected.

/Lennart

Reply With Quote
  #6  
Old   
Tonkuma
 
Posts: n/a

Default Re: Another CTE Question - 08-27-2010 , 03:52 AM



1) I think that HIERARCHY_COUNT for DEP_ID = 6 should be 4 like in
Lennart's result.

2) Recursion is not neccesary to get DEP_COUNT only.
Simple LEFT OUTER JOIN will be enough.

3) You don't need to reference PERSON table twice, like in the
following example.

4) I like to use upper case characters for keywords and lower case
characters for object names.


Here is an example:

WITH
tc(dep_id , descendant , iter) AS (
SELECT dep_id , dep_id , 0
FROM department
UNION ALL
SELECT tc.dep_id
, d. dep_id
, iter + 1
FROM tc
, department d
WHERE iter < 1000
AND d.parent = tc.descendant
)
SELECT tc.dep_id
, COUNT(CASE
WHEN tc.dep_id = tc.descendant THEN
/* or tc.dep_id = p. dep_id */
p.person_id
END
) AS dep_count
, COUNT(person_id) AS hierarchy_count
FROM tc
LEFT OUTER JOIN
person p
ON p.dep_id = tc.descendant
GROUP BY
tc.dep_id
;
------------------------------------------------------------------------------

DEP_ID DEP_COUNT HIERARCHY_COUNT
----------- ----------- ---------------
1 0 5
2 2 5
3 3 3
4 1 7
5 2 6
6 0 4
7 4 4

7 record(s) selected.

Reply With Quote
  #7  
Old   
Erland Sommarskog
 
Posts: n/a

Default Re: Another CTE Question - 08-28-2010 , 04:47 AM



For some reason, Michel also posted his question to
comp.databases.ms-sqlserver and comp.databases.oracle.server, without
telling which product his really was on. Needless to say, in both
groups questions were raised because of unsupported syntax.

Anyway, I repost my answer here, slightly edited, since this appears
to be the newsgroup that Michel follows, in case the answer is helpful
to him.

............................

Although it looks like a standard problem it is not exactly trivial, if
you have not done it before. I know. I had to give up my first attempt
the other day, and I did not come back to the problem until now.

I think the query you have is a non-starter. You start from the bottom
and work upwards. This means that you start on every node, so you get
lots of extra rows. This can easily be dealt with a NOT EXISTS in the
anchor, but as you accumulate when you traverse upwards there is another
issue which your sample data would not reveal: If department 4 has one guy,
and two subdepartments 5 and 6 with one two guys each, you will count the
guy in dept 4 twice.

Eventually I came to the conclusion that you cannot do this without a
materialised path (unless you switch to a representation with nested
sets). Traverse the tree from the top, to get the path for all nodes,
then you can join the tree with itself with a LIKE expression to get
all the subdepartments:

WITH depcnt (DEP_ID, cnt) AS (
SELECT DEP_ID, COUNT(*)
FROM PERSON
GROUP BY DEP_ID
), recurs (DEP_ID, iter, path) AS (
SELECT DEP_ID, 0, cast(str(DEP_ID, 4) as varchar(8000))
FROM DEPARTMENT
WHERE PARENT = -2
UNION ALL
SELECT D.DEP_ID, r.iter+1, r.path + str(D.DEP_ID, 4)
FROM recurs r
JOIN DEPARTMENT D ON r.DEP_ID = D.PARENT
)
SELECT R1.DEP_ID, SUM(dc.cnt)
FROM recurs R1
JOIN recurs R2 ON R2.path LIKE R1.path + '%'
LEFT JOIN depcnt dc ON R2.DEP_ID = dc.DEP_ID
GROUP BY R1.DEP_ID
ORDER BY R1.DEP_ID

Notes:

* The str() function is likely be SQL Server specific.

* The + in R1.path + '%' is string concateneation. The ANSI operator
is ||. What DB2 uses, I have no idea.

* If you are on SQL Server, performance is likely to be better if you
save the result of the CTE in a temp table, since SQL Server else
will compute the CTE better. Whether this also applies to DB2,
I don't know.


Here is an extended version of the sample data:



create table DEPARTMENT
(
DEP_ID integer not null primary key,
NAME VARCHAR(32) not null,
PARENT integer not null
) ;


insert into DEPARTMENT
VALUES (1,'A',-2),
(2,'B',1),
(3,'C',2),
(4,'D',-2),
(5,'E',4),
(6,'F',5),
(7,'G',6),
(8,'D1', 4),
(9,'D2', 4),
(10,'E2', 5),
(11,'E3', 5)


create table PERSON (
PERSON_ID integer not null primary key,
NAME VARCHAR(128) not null,
DEP_ID integer,
constraint FK_01 foreign key (DEP_ID) references DEPARTMENT(DEP_ID)
);


insert into PERSON values
(1,'John',2),(2,'Mary',2),(3,'Suzan',3),
(4,'Joe',3), (5,'Lewis',3),(6,'Elvis',4),
(7,'Presley',5),(8,'Preston',5),(9,'Pee',7),
(10,'Wee',7),(11,'Steve-o',7),(12,'Demi Moore',7),
(13,'Pelle',8),(14,'Nisse',8),(15,'Petra',9),
(16,'Jöns',10),(17,'Pekka',11),(18,'Lalle',11),
(19,'Putte',11);



--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinf...ons/books.mspx

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.