dbTalk Databases Forums  

Re: ADR's Normalization question

comp.databases.theory comp.databases.theory


Discuss Re: ADR's Normalization question in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
paul c
 
Posts: n/a

Default Re: ADR's Normalization question - 11-14-2009 , 02:36 AM






Mr. Scott wrote:
Quote:
...
If I'm not mistaken Armstrong's axioms can be applied to show that

{ CITY, CLASS } -> { STATUS },

so the CTS relation isn't needed if the CLS relation is replaced by

CLS { CITY, CLASS, STATUS }
KEY { CITY, CLASS }

Have I got this right? If so, I think ADR's suggested rule does apply to
Date's example.


Sorry, you've got it wrong. {CITY,CLASS,STATUS} satisfies the multi-valued
dependency,

STATUS ->-:> CITY | CLASS

which is not implied by the key

{CITY,CLASS}

{CITY,CLASS,STATUS} is therefore not in 4NF let alone 5NF.
The projections, {CITY,STATUS} and {CLASS,STAUTS}, are in 5NF, though.

While AB -> C can be inferred from A -> C and B -> C, that is,
a table that satisfies A -> C and B -> C also satisfies AB -> C, but
A -> C and B -> C cannot be inferred from AB -> C.
...
Thanks, I shouldn't have dropped those two dependencies and I take back
the strawman crack.

But I don't get why { CITY, CLASS, STATUS} isn't 4NF, a similar MVD
could be claimed for any relation that has at least one key attribute
and one non-key attribute.

Reply With Quote
  #2  
Old   
Mr. Scott
 
Posts: n/a

Default Re: ADR's Normalization question - 11-14-2009 , 09:08 AM






"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Mr. Scott wrote:
...
If I'm not mistaken Armstrong's axioms can be applied to show that

{ CITY, CLASS } -> { STATUS },

so the CTS relation isn't needed if the CLS relation is replaced by

CLS { CITY, CLASS, STATUS }
KEY { CITY, CLASS }

Have I got this right? If so, I think ADR's suggested rule does apply
to Date's example.


Sorry, you've got it wrong. {CITY,CLASS,STATUS} satisfies the
multi-valued dependency,

STATUS ->-:> CITY | CLASS

which is not implied by the key

{CITY,CLASS}

{CITY,CLASS,STATUS} is therefore not in 4NF let alone 5NF.
The projections, {CITY,STATUS} and {CLASS,STAUTS}, are in 5NF, though.

While AB -> C can be inferred from A -> C and B -> C, that is,
a table that satisfies A -> C and B -> C also satisfies AB -> C, but
A -> C and B -> C cannot be inferred from AB -> C.
...

Thanks, I shouldn't have dropped those two dependencies and I take back
the strawman crack.

But I don't get why { CITY, CLASS, STATUS} isn't 4NF, a similar MVD could
be claimed for any relation that has at least one key attribute and one
non-key attribute.
The claim is due to Date's "otherwise quite independent" criterion.
Whenever there is a functional dependency from A to B, for each B value
there is a disjoint subset of A values. The set of A values is effectively
"partitioned" by the set of B values. In {CITY,CLASS,STATUS}, there are two
distinct functional dependencies,

CITY -> STATUS and CLASS -> STATUS;

consequently, for each STATUS value there is a disjoint subset of CITY
values and a disjoint subset of CLASS values. In order for those subsets to
be "otherwise quite independent," the multtivalued dependency,

STATUS ->-> CITY | CLASS

must hold. In the case of a table that satisfies a functional dependency
like

K -> A,

the degenerate multivalued dependency,

A ->-> K | nil,

is due to the functional dependency K -> A. It is therefore safe to say
that it is implied by the key.

It is also safe to say that {CITY,CLASS,STATUS} isn't even in 3NF, since
there are functional dependencies that are not implied by the key.

{CITY,CLASS} -> STATUS

does not imply

CITY -> STATUS or CLASS -> STATUS.

Reply With Quote
  #3  
Old   
Mr. Scott
 
Posts: n/a

Default Re: ADR's Normalization question - 11-15-2009 , 07:16 AM



"Mr. Scott" <do_not_reply (AT) noone (DOT) com> wrote

Quote:
"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote in message
news_tLm.52824$PH1.5089 (AT) edtnps82 (DOT) ..
Mr. Scott wrote:
...
If I'm not mistaken Armstrong's axioms can be applied to show that

{ CITY, CLASS } -> { STATUS },

so the CTS relation isn't needed if the CLS relation is replaced by

CLS { CITY, CLASS, STATUS }
KEY { CITY, CLASS }

Have I got this right? If so, I think ADR's suggested rule does apply
to Date's example.


Sorry, you've got it wrong. {CITY,CLASS,STATUS} satisfies the
multi-valued dependency,

STATUS ->-:> CITY | CLASS

which is not implied by the key

{CITY,CLASS}

{CITY,CLASS,STATUS} is therefore not in 4NF let alone 5NF.
The projections, {CITY,STATUS} and {CLASS,STAUTS}, are in 5NF, though.

While AB -> C can be inferred from A -> C and B -> C, that is,
a table that satisfies A -> C and B -> C also satisfies AB -> C, but
A -> C and B -> C cannot be inferred from AB -> C.
...

Thanks, I shouldn't have dropped those two dependencies and I take back
the strawman crack.

But I don't get why { CITY, CLASS, STATUS} isn't 4NF, a similar MVD could
be claimed for any relation that has at least one key attribute and one
non-key attribute.

The claim is due to Date's "otherwise quite independent" criterion.
Whenever there is a functional dependency from A to B, for each B value
there is a disjoint subset of A values. The set of A values is
effectively "partitioned" by the set of B values. In {CITY,CLASS,STATUS},
there are two distinct functional dependencies,

CITY -> STATUS and CLASS -> STATUS;

consequently, for each STATUS value there is a disjoint subset of CITY
values and a disjoint subset of CLASS values. In order for those subsets
to be "otherwise quite independent," the multtivalued dependency,

STATUS ->-> CITY | CLASS

must hold. In the case of a table that satisfies a functional dependency
like

K -> A,

the degenerate multivalued dependency,

A ->-> K | nil,

is due to the functional dependency K -> A. It is therefore safe to say
that it is implied by the key.

It is also safe to say that {CITY,CLASS,STATUS} isn't even in 3NF, since
Sorry, BCNF.

Quote:
there are functional dependencies that are not implied by the key.

{CITY,CLASS} -> STATUS

does not imply

CITY -> STATUS or CLASS -> STATUS.


Reply With Quote
  #4  
Old   
Sampo Syreeni
 
Posts: n/a

Default Re: ADR's Normalization question - 11-16-2009 , 03:23 PM



Quote:
The claim is due to Date's "otherwise quite independent" criterion. [...]
Okay, this is new to me, and given your further exposition, would
appear highly nonstandard to me. Can you point me towards a freely
available paper in which Date nails himself to the cross with this
interpretation?
--
Sampo

Reply With Quote
  #5  
Old   
paul c
 
Posts: n/a

Default Re: ADR's Normalization question - 11-16-2009 , 10:24 PM



Sampo Syreeni wrote:
Quote:
The claim is due to Date's "otherwise quite independent" criterion. [...]

Okay, this is new to me, and given your further exposition, would
appear highly nonstandard to me. Can you point me towards a freely
available paper in which Date nails himself to the cross with this
interpretation?
--
Sampo
Since I started it let me jump in and say that while the second
objection about 'quite independent' was a bit pedantic, given the
context it wasn't wrong. The first objection was more to the point,
ie., I had dropped the two fd's that Date used as a basis. The second
was just a side-effect of the basic error.

Reply With Quote
  #6  
Old   
Mr. Scott
 
Posts: n/a

Default Re: ADR's Normalization question - 11-16-2009 , 10:31 PM



"Sampo Syreeni" <decoy (AT) iki (DOT) fi> wrote

Quote:
The claim is due to Date's "otherwise quite independent" criterion. [...]

Okay, this is new to me, and given your further exposition, would
appear highly nonstandard to me. Can you point me towards a freely
available paper in which Date nails himself to the cross with this
interpretation?
I suggest you read Ronald Fagin's paper "Multivalued Dependencies and a New
Normal Form for Relational Databases." Date's "otherwise quite
independent" criterion exemplifies the kind of relationship defined in the
paper. For example, an employee's salary is quite independent from (Fagin
uses "orthogonal" to) an employee's set of children. In the same way the
set of cities associated with a particular status is orthogonal to the set
of classes that are associated with it. The appearance of being nonstandard
is probably due to the fact that multivalued dependencies are not ordinarily
considered in the normalization procedure unless the database scheme is
already in BCNF.

Reply With Quote
  #7  
Old   
Sampo Syreeni
 
Posts: n/a

Default Re: ADR's Normalization question - 11-17-2009 , 11:40 AM



Quote:
I suggest you read Ronald Fagin's paper "Multivalued Dependencies and a New
Normal Form for Relational Databases." Date's "otherwise quite
independent" criterion exemplifies the kind of relationship defined in the
paper.
I must have been braindead at the moment: I know and love my MVD's,
but despite your having mentioned them by name, I just kept on
thinking about how Date goes around it in the quote.

I actually consider MVD's more fundamental to normalization than FD's,
because they give a condition for two-way nonloss decomposition that
is not only sufficient but also necessary. Of course join dependencies
would be better still, but in practice they're nowhere near as easy to
spot or wrap your head around than FD's (for each x we have precisely
one y) or MVD's (for each x we have a unique set Y). Just about the
only nasty thing about full MVD's is that you always have to consider
the context in which the attributes appear, so that the concept really
only works well when you're designing the schema from the top down,
and not from the bottom up. In the other direction you'd probably want
to use embedded MVD's, but then IIRC their implication problem is
open.

As a funky note about the higher normal forms and more exotic kinds of
dependencies... Just about all course notes and practical expositions
of normalization I've seen sooner or later tell you 3NF (or perhaps
BCNF) is all you need in practice; "most relations in the wild are
then in 4NF as well". For the load of crap that is, it's repeated
amazingly often. You don't need to have anything in your model beyond
firms with sets of subsidiaries, people with families or orders with
line items, and you're already well into (E)MVD land. As such it's not
a big surprise that one well-known study found about one in five
deployed databases to have MVD's which hadn't been factored out (and
that was before star schemas became all the craze).

But certainly you never have to consider irreducible join
dependencies, right? Wrong again: the very first schema I got to
design from the ground up at work in fact had a genuine, irreducible,
three-pronged join dependency that had to be explicitly broken down.
In that case we had projects with milestones and measurements taken at
those milestones. Each project defined which measurements it was
using, and the milestones were subsetted as well, from a formal
waterfall template shared across the organization. So far we could of
course make do with MVD's, but additionally each metric was only
defined for a subset of the milestones; you couldn't for example have
cost measurements for preproject planning checkpoints where the
project hadn't been given a go, because the appropriate cost accounts
hadn't in that case been assigned yet. Bang: suddenly you no longer
have a two way nonloss decomposition, although a three way one exists.
And of course there was lots of assorted wackiness going on besides
that, like measurements being optional within the above constraints
(the eventual solution had to resort to triggers to check the
inclusion dependency against {project, milestone, measure} since
Oracle doesn't let you point a foreign key to a view), temporal
normalization (measurements could change values after first being
entered, and so had to be versioned), partial temporal constraints on
the order the milestone data was filled out, measurement targets with
a separate life of their own except that they couldn't be set after
actuals started to pour in, and so on, and so forth.

I ended up constraining the hell out of the thing even after the basic
design was in picture perfect PJ/NF (perhaps even 6NF, I didn't
formally check that one), with the result that there hasn't been a
single integrity issue with said DB to date. Had I just normalized
away the final nulls (in validity end time columns) and done away with
surrogates, that baby would probably qualify as my masterpiece.
--
Sampo

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2010, Jelsoft Enterprises Ltd.