dbTalk Databases Forums  

Newbie normal form question

comp.databases comp.databases


Discuss Newbie normal form question in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Patrick L. Nolan
 
Posts: n/a

Default Newbie normal form question - 07-06-2007 , 01:45 AM






I maintain a small database which contains information about
a group of people. Somebody suggested an addition. I proposed
an alternate method. My method feels better to me intuitively.
I'm trying to figure out if the difference is related to
normalization, perhaps the 5th normal form.

Every person in the database is uniquely defined by a single key,
their userid. Everyone is assigned a category, call them A, B and X.
Everyone in category X has a supervisor, who must be in either
category A or category B. The proposal is to divide category X
into two, call them XA and XB. All the XA people would have A
supervisors, and all the XB people would have B supervisors.

I immediately noticed that there's redundancy and the possibility
of inconsistency. Suppose somebody in category XA somehow gets
assigned to a supervisor in category B, contrary to the definition
of XA. Murphy's law would require this to happen after a while.
When I pointed this out, the response was the the XA people
are really different from the XB people, with different rights
and responsibilities. This difference ought to be reflected in
their own table entries. In my scheme, it would require an extra
table lookup to find whether someone's supervisor is an A or a B.

My argument would be more impressive if I could say, "Why, that
violates the N'th Normal Form rule. That's very bad." But I
find the language of the normal forms quite obtuse. This sort of
sounds like the 5th form, as described in the Wikipedia, but I'm
not sure. All the examples seem to require breaking up a table into
multiple smaller ones, and I don't see the need for that here.
Am I on the right track?

--
* Patrick L. Nolan *
* W. W. Hansen Experimental Physics Laboratory (HEPL) *
* Stanford University *

Reply With Quote
  #2  
Old   
Roy Hann
 
Posts: n/a

Default Re: Newbie normal form question - 07-06-2007 , 02:12 AM







"Patrick L. Nolan" <pln (AT) razzle (DOT) Stanford.EDU> wrote

Quote:
I maintain a small database which contains information about
a group of people. Somebody suggested an addition. I proposed
an alternate method. My method feels better to me intuitively.
I'm trying to figure out if the difference is related to
normalization, perhaps the 5th normal form.

Every person in the database is uniquely defined by a single key,
their userid. Everyone is assigned a category, call them A, B and X.
Everyone in category X has a supervisor, who must be in either
category A or category B. The proposal is to divide category X
into two, call them XA and XB. All the XA people would have A
supervisors, and all the XB people would have B supervisors.

I immediately noticed that there's redundancy and the possibility
of inconsistency. Suppose somebody in category XA somehow gets
assigned to a supervisor in category B, contrary to the definition
of XA. Murphy's law would require this to happen after a while.
When I pointed this out, the response was the the XA people
are really different from the XB people, with different rights
and responsibilities. This difference ought to be reflected in
their own table entries. In my scheme, it would require an extra
table lookup to find whether someone's supervisor is an A or a B.

My argument would be more impressive if I could say, "Why, that
violates the N'th Normal Form rule. That's very bad." But I
find the language of the normal forms quite obtuse. This sort of
sounds like the 5th form, as described in the Wikipedia, but I'm
not sure. All the examples seem to require breaking up a table into
multiple smaller ones, and I don't see the need for that here.
Am I on the right track?
We know that rows in a table should be presumed to be true statements of
fact about the world. If there is a row that says a thing is so, it is so.
To understand 5NF you first have to know that databases are expected to
conform to the Closed World Hypothesis which more or less says that the
database has to represent the entire enterprise of interest, so that the
*absence* of a hypothetical row can allow you to correctly conclude that the
hypothetical fact is *not* true. 5NF is all about decomposing tables so
that you cannot put a row in a table that inadvertently asserts one fact is
true but not another, when in fact the second one *must inescapably* also be
true, because its absence would make the database logically inconsistent.
This problem can occur only when there is a join dependency, and the only
way there can be a join dependency is if there is some "business rule" (or
law of nature) that forces you to expect that second row.

Roy




Reply With Quote
  #3  
Old   
Ed Prochak
 
Posts: n/a

Default Re: Newbie normal form question - 07-06-2007 , 01:00 PM



On Jul 6, 2:45 am, "Patrick L. Nolan" <p... (AT) razzle (DOT) Stanford.EDU>
wrote:
Quote:
I maintain a small database which contains information about
a group of people. Somebody suggested an addition. I proposed
an alternate method. My method feels better to me intuitively.
I'm trying to figure out if the difference is related to
normalization, perhaps the 5th normal form.

Every person in the database is uniquely defined by a single key,
their userid. Everyone is assigned a category, call them A, B and X.
Everyone in category X has a supervisor, who must be in either
category A or category B. The proposal is to divide category X
into two, call them XA and XB. All the XA people would have A
supervisors, and all the XB people would have B supervisors.

I immediately noticed that there's redundancy and the possibility
of inconsistency. Suppose somebody in category XA somehow gets
assigned to a supervisor in category B, contrary to the definition
of XA. Murphy's law would require this to happen after a while.
When I pointed this out, the response was the the XA people
are really different from the XB people, with different rights
and responsibilities. This difference ought to be reflected in
their own table entries. In my scheme, it would require an extra
table lookup to find whether someone's supervisor is an A or a B.

My argument would be more impressive if I could say, "Why, that
violates the N'th Normal Form rule. That's very bad." But I
find the language of the normal forms quite obtuse. This sort of
sounds like the 5th form, as described in the Wikipedia, but I'm
not sure. All the examples seem to require breaking up a table into
multiple smaller ones, and I don't see the need for that here.
Am I on the right track?

--
* Patrick L. Nolan *
* W. W. Hansen Experimental Physics Laboratory (HEPL) *
* Stanford University *

I am not quite sure what your scheme is.
let's see, GUESSING at the table design since you failed to provide
one:

PERSON
userid
cat_code (this employee's category)
super_code (the boss's level)

CAT
code
description

currently CAT contains
A QUEEN BEE
B BUSY BEE
X WORKER BEE

and PERSON has FK constraints on
cat_code references CAT (code)
super_code references CAT (code)
currently there is no enforcement of supervisor relation (who can be
assigned to a A or B boss)

Proposal: new CAT contains
A QUEEN BEE
B BUSY BEE
X WORKER BEE (XB)
W WORKER DRONE (XA)

I think your intuition is off. In fact you can now add a check
constraint so that you cannot assign a worker drone to a busy bee
unless you demote him to worker bee. So this is seems to fit the
requirements and doesn't violate normal forms that I can see. I assume
your design involved a third table?

HTH
ed



Reply With Quote
  #4  
Old   
--CELKO--
 
Posts: n/a

Default Re: Newbie normal form question - 07-11-2007 , 06:03 PM



Here are a bunch of tricks for you. Ussers and job assignments are
different. Users are entiies and job assignmnts are relationsm thus
we need two tables. The job categories are so short you could put
them in a CHECK() constraint, but what the heck!

The firtst trick is to have a super key in the Users table that can be
referenced by the job assignments:

CREATE TABLE Users
(user_id INTEGER NOT NULL PRIMARY KEY,
job_cat CHAR(1) NOT NULL
REFERENCES JobCategories(job_cat),
UNIQUE (user_id, job_cat), -- super key!
etc.);

-- this could be a CHECK() in Users
CREATE TABLE JobCategories
(job_cat CHAR(1) NOT NULL PRIMARY KEY, -- {'A', 'B', 'X'}
job_cat_description VARCHAR(50) NOT NULL);

The job assignments use the supper key as their foreign key. Notice
the use of a role prefix on the data element names.

CREATE TABLE JobAssignments
(sub_user_id INTEGER NOT NULL,
sub_job_cat CHAR(1) NOT NULL,
FOREIGN KEY (sub_user_id, sub_job_cat)
REFERENCES Users(user_id, job_cat),

super_user_id INTEGER NOT NULL,
super_job_cat CHAR(1) NOT NULL,
FOREIGN KEY (super_user_id, super_job_cat)
REFERENCES Users(user_id, job_cat),

-- the tricky part!!
CHECK (sub_user_id <> super_user_id), -- assumed

CHECK (CASE WHEN sub_job_cat = 'X' AND super_job_cat IN ('A', 'B')
THEN 'T' ELSE 'F' END = 'T').

PRIMARY KEY (sub_user_id, super_user_id),
etc.);

This assures nobody is his own supervisor and that everyone in
category X has a supervisor, who must be in either category A or
category B. You can easily expand the CASE expression to as
complicated a set of rules as you wish.


Reply With Quote
  #5  
Old   
--CELKO--
 
Posts: n/a

Default Re: Newbie normal form question - 07-11-2007 , 06:08 PM



I might not have been clear about expanding the CASE experession:

CHECK (CASE WHEN sub_job_cat = 'X' AND super_job_cat IN ('A', 'B')
THEN 'T'
WHEN sub_job_cat = 'A' AND <<other rules??>
THEN 'T'
WHEN sub_job_cat = 'B' AND <<other rules??>
THEN 'T'
ELSE 'F' END = 'T').


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.