dbTalk Databases Forums  

Generating a tree structure from data

comp.databases.ms-sqlserver comp.databases.ms-sqlserver


Discuss Generating a tree structure from data in the comp.databases.ms-sqlserver forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Steve
 
Posts: n/a

Default Generating a tree structure from data - 06-22-2011 , 04:40 AM






Hi,

I've been banging my head against a brick wall trying to sort this and
hoping somebody can help!!!

i have a table that has 4 main fields;
icode
idcode
parent
iddesc

parent is either blank, Y (meaning there are child links) and H
(meaning it's the top level of a structure)

icode is effectively the parent id and idcode is the child id

iddesc is the description of the link

I want to be able to create a procedure that recusively loops through
the table (intradtl) and outputs an indented structure of the file

Level description
..0 <<iddesc>>
...1 <<iddesc>>
.....2 <<iddesc>>
...1 <<iddesc>>
etc

can anyone offer any advice on how to do it? i want to do it in as
simpler way as possible......

thanks in advance

Reply With Quote
  #2  
Old   
Erland Sommarskog
 
Posts: n/a

Default Re: Generating a tree structure from data - 06-22-2011 , 03:18 PM






Steve (steve.simpson (AT) aircelle (DOT) com) writes:
Quote:
parent is either blank, Y (meaning there are child links) and H
(meaning it's the top level of a structure)
That's an unusual column for that type of structure. I don't see what it
adds. Usually this relation is encoded in the ids. And what does blank mean?

Quote:
I want to be able to create a procedure that recusively loops through
the table (intradtl) and outputs an indented structure of the file

Level description
..0 <<iddesc
...1 <<iddesc
.....2 <<iddesc
...1 <<iddesc
etc
So ignoring Parent, and assuming that you are on SQL 2005 or later, you
can do:

WITH rekurs AS (
SELECT icode, iddesc, lvl = 0
FROM intradt1
WHERE idcode IS NULL
UNION ALL
SELECT i. icode, i.iddesc, r.lvl + 1
FROM intradt1 i
JOIN rekurs r ON i.idcode = i.code
)
SELECT replicate('.', lvl + 2) + ltrim(str(lvl)), idddesc
FROM rekurs


--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx

Reply With Quote
  #3  
Old   
--CELKO--
 
Posts: n/a

Default Re: Generating a tree structure from data - 06-23-2011 , 09:36 AM



Quote:
I've been banging my head against a brick wall trying to sort this and hoping somebody can help!!!
People cannot read your mind, so post your code and clear specs if you
really want help. Please post real DDL and not narrative or your own
personal programming language. Learn to use ISO-11179 rules for the
data element names, avoid needless dialect and use ISO-8601 temporal
formats, codes and so forth. Please tell us what SQL product and
release you are using. Tell us if you can change the DDL or if you are
stuck with it.

The vague narrative you did post is completely wrong. Columns are
nothing like fields. A data elemetn can be a “<something>_code” or a
“<something>_id” but never an absurd hybrid “id_code”. Do you also say
“blood_type” or “blood_type_code_value_id”?

The terms “parent”: and “link” belong to the pre-relational network
databases. They do not exist in RDBMS or SQL. We have referenced and
referencing columns. My guess at your DDL is that you have a
incorrectly done adjacency list model. The nodes and they tree
st5rucutre are crammed itno one table in violation for the basic data
modeling rule never to mix relationship and data in a single table.

Quote:
parent is either blank, Y (meaning there are child links [sic]) and H (meaning it's the top level of a structure)
First of all, we use a NULL, not a blank for missing data in SQL.
Blank fields were part of 1950's COBOL.

Secondly, those silly flags are carrying structural information at
different levels of aggregation.

Quote:
i_code is effectively the parent id and id_code is the child id
Same data element with two names? Again, another fundamental error.
The VIN on your automobile does not change from your insurance policy,
inspection sticker, or DMV records. Data elements have one and only
one name, but they might have a role; “child_node_id” and
“parent_node_id” would be valid.

Quote:
id_desc is the description of the link [sic]
How does an identifier have a description? Entities have
descriptions.


Quote:
I want to be able to create a procedure that recursively loops through the table (intradtl) and outputs an indented structure of the file
Procedural code instead of declarative code. Yep, back in the 1950's
again.

Another way of representing trees is to show them as nested sets.
Since SQL is a set oriented language, this is a better model than the
usual adjacency list approach you see in most text books. Let us
define a simple OrgChart table like this.

CREATE TABLE OrgChart
(emp_name CHAR(10) NOT NULL PRIMARY KEY,
lft INTEGER NOT NULL UNIQUE CHECK (lft > 0),
rgt INTEGER NOT NULL UNIQUE CHECK (rgt > 1),
CONSTRAINT order_okay CHECK (lft < rgt));


OrgChart
emp_name lft rgt
======================
'Albert' 1 12
'Bert' 2 3
'Chuck' 4 11
'Donna' 5 6
'Eddie' 7 8
'Fred' 9 10

The (lft, rgt) pairs are like tags in a mark-up language, or parens
in algebra, BEGIN-END blocks in Algol-family programming languages,
etc. -- they bracket a sub-set. This is a set-oriented approach to
trees in a set-oriented language.

The organizational chart would look like this as a directed graph:

Albert (1, 12)
/ \
/ \
Bert (2, 3) Chuck (4, 11)
/ | \
/ | \
/ | \
/ | \

Donna (5, 6) Eddie (7, 8) Fred (9, 10)

The adjacency list table is denormalized in several ways. We are
modeling both the Personnel and the Organizational chart in one table.
But for the sake of saving space, pretend that the names are job
titles and that we have another table which describes the Personnel
that hold those positions.

Another problem with the adjacency list model is that the
boss_emp_name and employee columns are the same kind of thing (i.e.
identifiers of personnel), and therefore should be shown in only one
column in a normalized table. To prove that this is not normalized,
assume that "Chuck" changes his name to "Charles"; you have to change
his name in both columns and several places. The defining
characteristic of a normalized table is that you have one fact, one
place, one time.

The final problem is that the adjacency list model does not model
subordination. Authority flows downhill in a hierarchy, but If I fire
Chuck, I disconnect all of his subordinates from Albert. There are
situations (i.e. water pipes) where this is true, but that is not the
expected situation in this case.


To show a tree as nested sets, replace the nodes with ovals, and
then nest subordinate ovals inside each other. The root will be the
largest oval and will contain every other node. The leaf nodes will be
the innermost ovals with nothing else inside them and the nesting will
show the hierarchical relationship. The (lft, rgt) columns (I cannot
use the reserved words LEFT and RIGHT in SQL) are what show the
nesting. This is like XML, HTML or parentheses.

At this point, the boss_emp_name column is both redundant and
denormalized, so it can be dropped. Also, note that the tree structure
can be kept in one table and all the information about a node can be
put in a second table and they can be joined on employee number for
queries.

To convert the graph into a nested sets model think of a little worm
crawling along the tree. The worm starts at the top, the root, makes a
complete trip around the tree. When he comes to a node, he puts a
number in the cell on the side that he is visiting and increments his
counter. Each node will get two numbers, one of the right side and one
for the left. Computer Science majors will recognize this as a
modified preorder tree traversal algorithm. Finally, drop the unneeded
OrgChart.boss_emp_name column which used to represent the edges of a
graph.

This has some predictable results that we can use for building
queries. The root is always (left = 1, right = 2 * (SELECT COUNT(*)
FROM TreeTable)); leaf nodes always have (left + 1 = right); subtrees
are defined by the BETWEEN predicate; etc. Here are two common queries
which can be used to build others:

1. An employee and all their Supervisors, no matter how deep the
tree.

SELECT O2.*
FROM OrgChart AS O1, OrgChart AS O2
WHERE O1.lft BETWEEN O2.lft AND O2.rgt
AND O1.emp_name = :myemployee;

2. The employee and all their subordinates. There is a nice
symmetry here.

SELECT O1.*
FROM OrgChart AS O1, OrgChart AS O2
WHERE O1.lft BETWEEN O2.lft AND O2.rgt
AND O2.emp_name = :myemployee;

3. Add a GROUP BY and aggregate functions to these basic queries and
you have hierarchical reports. For example, the total salaries which
each employee controls:

SELECT O2.emp_name, SUM(S1.salary_amt)
FROM OrgChart AS O1, OrgChart AS O2,
Salaries AS S1
WHERE O1.lft BETWEEN O2.lft AND O2.rgt
AND O1.emp_name = S1.emp_name
GROUP BY O2.emp_name;

4. To find the level of each emp_name, so you can print the tree as
an indented listing.

SELECT T1.node,
SUM(CASE WHEN T2.lft <= T1.lft THEN 1 ELSE 0 END
+ CASE WHEN T2.rgt < T1.lft THEN -1 ELSE 0 END) AS lvl
FROM Tree AS T1, Tree AS T2
WHERE T2.lft <= T1.lft
GROUP BY T1.node;

An untested version of this using OLAP functions might be better
able to use the ordering.

SELECT T1.node,
SUM(CASE WHEN T2.lft <= T1.lft THEN 1 ELSE 0 END
+ CASE WHEN T2.rgt < T1.lft THEN -1 ELSE 0 END)
OVER (ORDER BY T1.lft
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS lvl
FROM Tree AS T1, Tree AS T2
WHERE T2.lft <= T1.lft;

5. The nested set model has an implied ordering of siblings which
the adjacency list model does not. To insert a new node, G1, under
part G. We can insert one node at a time like this:

BEGIN ATOMIC
DECLARE rightmost_spread INTEGER;
SET rightmost_spread
= (SELECT rgt
FROM Frammis
WHERE part = 'G');
UPDATE Frammis
SET lft = CASE WHEN lft > rightmost_spread
THEN lft + 2
ELSE lft END,
rgt = CASE WHEN rgt >= rightmost_spread
THEN rgt + 2
ELSE rgt END
WHERE rgt >= rightmost_spread;

INSERT INTO Frammis (part, lft, rgt)
VALUES ('G1', rightmost_spread, (rightmost_spread + 1));

COMMIT WORK;
END;

The idea is to spread the (lft, rgt) numbers after the youngest
child of the parent, G in this case, over by two to make room for the
new addition, G1. This procedure will add the new node to the
rightmost child position, which helps to preserve the idea of an age
order among the siblings.

6. To convert a nested sets model into an adjacency list model:

SELECT B.emp_name AS boss_emp_name, E.emp_name
FROM OrgChart AS E
LEFT OUTER JOIN
OrgChart AS B
ON B.lft
= (SELECT MAX(lft)
FROM OrgChart AS S
WHERE E.lft > S.lft
AND E.lft < S.rgt);

7. To find the immediate parent of a node:

SELECT MAX(P2.lft), MIN(P2.rgt)
FROM Personnel AS P1, Personnel AS P2
WHERE P1.lft BETWEEN P2.lft AND P2.rgt
AND P1.emp_name = :my_emp_name;

I have a book on TREES & HIERARCHIES IN SQL which you can get at
Amazon.com right now.

Reply With Quote
  #4  
Old   
rembrandt
 
Posts: n/a

Default Re: Generating a tree structure from data - 06-29-2011 , 06:47 AM



On Jun 22, 5:40*am, Steve <steve.simp... (AT) aircelle (DOT) com> wrote:
Quote:
Hi,

I've been banging my head against a brick wall trying to sort this and
hoping somebody can help!!!

i have a table that has 4 *main fields;
icode
idcode
parent
iddesc

parent is either blank, Y (meaning there are child links) and H
(meaning it's the top level of a structure)

icode is effectively the parent id and idcode is the child id

iddesc is the description of the link

I want to be able to create a procedure that recusively loops through
the table (intradtl) and outputs an indented structure of the file

Level * *description
.0 * * * * <<iddesc
..1 * * * *<<iddesc
....2 * * *<<iddesc
..1 * * * *<<iddesc
etc

can anyone offer any advice on how to do it? *i want to do it in as
simpler way as possible......

thanks in advance
export your data out to a more logical table (temp) and run your query
there.

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.