dbTalk Databases Forums  

Re: foreign key constraint versus referential integrity constraint

comp.databases.theory comp.databases.theory


Discuss Re: foreign key constraint versus referential integrity constraint in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #41  
Old   
paul c
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 04:00 AM






Mr. Scott wrote:
Quote:
"Marshall" <marshall.spight (AT) gmail (DOT) com> wrote in message
news:386975f9-472c-4184-8661-5c3d1e2f7621 (AT) r24g2000prf (DOT) googlegroups.com...
On Oct 24, 10:53 am, Keith H Duggar <dug... (AT) alum (DOT) mit.edu> wrote:
Anyhow, the question here is not one of our imagination but rather
simply this: if it makes sense for the RM to support constraints
on relational /values/ (taken on by variables) why does it not
make sense to support constraints on relational /expressions/?
That is a question of general principle not specific design.
This question, it seems to me, is clear and to the point.
And I would answer it by saying that we shouldn't really
even make the distinction! (At least not formally.)

I think we should make the distinction, and formally.

(p /\ q) -> r is not the same as (p -> r) /\ (q -> r)
but (p \/ q) -> r is the same as (p -> r) \/ (q -> r)

A view consisting of a natural join, for example, represents a set of
conjunctions. Each row of the join represents a conjunction of
propositions, one for each operand. A constraint defined on a join would be
of the form (p /\ q) -> r. That is definitely not the same as constraints
defined on one or more tables, which would take the form (p \/ q) -> r.

....

I guess the attitude, interpretation if you like, that relational ops
implement logic leads to that but another attitude is that they merely
apply logic to obtain relations that consist of simple propositions. I
believe most people happily accept the latter interpretation when
looking at a relation value that has been obtained by a language devices
such as insert or assignment where the definition is based on union.
The 'OR' disappears. I think there is a big difference between the
implementation and the application of logic. Another question is what
happens to the join's conjunction when we project, does it survive or
not depending on which attributes we choose?

Reply With Quote
  #42  
Old   
paul c
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 04:25 AM






Cimode wrote:
Quote:
On 25 oct, 04:56, Bob Badour <bbad... (AT) pei (DOT) sympatico.ca> wrote:
paul c wrote:
Bob Badour wrote:
...
You asked about semantic joinability. Semantically, the join works if
like-named attributes have an equality comparison defined and not if
they don't.
...
In RT, how could they not define equality?
What reason would one have to define an equality comparison to compare
employee ids with department ids or with names? Or charge with current
or potential? Or age with price?
Hi paul,

A valid set of questions (pun intended). My conclusion were that Codd
was focused onto defining a side of the relational model that would be
part of information system as opposed as being a part of mathematics
since relational domain analysis is more loosely coupled with set
theory than relational general model. While relational domain analysis
is not mandatory in the context of the general relational model, it is
mandatory in the context of implementation of RM. It even makes
header based definitions somehow moot...IMHO...

Regards...
Cimode, I tend to look at it the same way, at least the part about the
information system motive, not sure about headers. If I recall it
wasn't until his second paper that he introduced the pair of attribute
and domain name, he was adjusting his theory to meet practice.

Reply With Quote
  #43  
Old   
compdb@hotmail.com
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 04:51 AM



Keith H Duggar <dug... (AT) alum (DOT) mit.edu> wrote:

Quote:
what is the common name for the semantic constraint
that "attribute RV1:X and RV2:Y are sematincally
Concerns about "semantic joinability" are confused.

Relation operators, attributes and expressions
correspond to logic operators, (bindable) variables
and expressions. So predicates of relation variables
plus predicates corresponding to functions determine
the semantics of relation expressions. (Note that you
need an EQUALS function with two appropriately
typed parameters whenever a relation operator
evaluation uses an attribute's values to remove
duplicate tuples or to compare tuples from multiple
arguments.)

The predicates entirely determine the possible
queries and their meanings. If you don't make
mistakes in understanding predicates then you will
never want to write a query that can't be expressed
in terms of the predicates and answered in terms of
relations.

The predicates entirely determine valid database
states. Attribute types and constraint expressions are
logically redundant. What they do is constrain
updates to valid database states. If you don't make
mistakes in observing the modeled world and
evaluating the predicates in it then you will never
want to make an update that violates a constraint.

Your problem is you don't know that you should be
given the predicates. Your difficulties arise from this.

Quote:
why shouldn't the RM allow us to further constrain what domains
permit for both values AND [relational] operations?
I hope it is now clear that "semantic" limiting of
relation expressions is confused.

There is actually a limit on logic expressions, namely
those that correspond to relation expressions. You can
only write those that involve AND, AND NOT, OR of
expressions with the same free variables, EXISTS,
RENAME and functions (which must include implicitly
needed EQUALSs). Otherwise evaluation cost is a
function of type cardinalities instead of just the
number of tuples in the database.

philip

Reply With Quote
  #44  
Old   
vldm10
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 05:20 AM



On Oct 21, 11:36*pm, Sampo Syreeni <de... (AT) iki (DOT) fi> wrote:
Quote:
(Since I first saw Codd's 1970 paper I've never been able to understand
why on the one hand he talked about "symmetrical" access but on the
other he used the very restrictive term "foreign key".) *For me, a
"reference" is a reference even if most dbms' have similar restrictions
to Codd's term.

I think the later talk about domains and e-relations in RM/T (and to a
lesser extent in RM/V2) makes that a bit easier to understand. As I
interpret it, Codd started with a very clean and purely syntactic
model in his seminal article, which will definitely lead to pure
symmetry in access. But from the very start he also worried deeply
about the semantics and making them explicit as metadata, which leads
to the second influential discovery of his: the integral, reflective
data dictionary. Later on his development of the relational model not
only added surrounding functionality (which weighs in so heavily in RM/
V2, what with all of'em T-joins and the lot) but semantics as well.

I think what was driving Codd was the need to make his model fully
self-contained. If you really think about it, that tends to exclude
any and all free-form input by the user. I mean, let's say you let an
end user input a free form name. It's pretty much guaranteed that lots
of nonsense is going to be input then. After that the name field
doesn't really constitute a well-defined domain at all; to a
mathematician like Codd its proper type should probably be more or
less 'general-ape-shit;no-real-type;use-at-own-risk'. I.e. the field
could be straightforwardly likened to a blob of text as we call the
more sustained kind of drivel today; certainly not something one would
make into a domain, and not something which would fit Codd's neat idea
of formal, repetitive business data.

So, what he did was to go the way we'd now call Master Data
Management. He went on to separate all of the data we'd intuitively
call entities, and more formally would define as a) relatively stable
data b) centrally and well managed as a fully controlled vocabulary,
so that c) applying the closed world assumption to it is actually well
founded because there is a formal process to keep the real world and
the minimal model of the database trying to approximate it in close
correspondence, d) attaching to things that have real, societally
established permanence, uniqueness and omnipresence which can be
likened to "a candidate key on an established class of real world
entities which is at most slowly varying", and most of all e)
discrete, enumerated data that is mostly used to classify/qualify
other data, the latter usually being composed of abstract measurements
on a continuous scale (originally account balances, given that we're
after all talking about developments which happened within the
International *Business* Machines Corporation). What we'd nowadays
call "complex", "semistructured", "unstructured", "unconventional" or
even "nasty" data was explicitly excluded by Codd from the very
beginning.

That then lead to a sort of revival of the earlier intuitive idea of
entities as opposed to relationships. Only this time there was a clear
distinction in the criteria: relations were always relations, and they
represented a theory of the outlying reality that was perfectly
symmetric. Nevertheless, the connection to the real world was also
taken into account in that some relations had additional semantics
bolted on in addition to just the logical constraints. They referred
to real world entities with the kinds of properties that would make it
especially easy to keep the correspondence between the minimal model
of the DB and the real world the DB was trying to represent in synch.

The difference to today's MDM and OLAP-dimension frenzy was then that
the logical level design criteria were fully unaffected by such
semantic, real world concerns. They remained completely syntactic, as
they should have. The only thing added was the idea that some
relations which happened to have a certain commonly reoccurring
structure, and in addition were meant to semantically correspond to
real life in what was also a commonplace correspondence, could have
those facts formally notarized in the data dictionary, using a well-
known syntax.

That, and only that, was what made a relation an e-relation. You
neither could nor had to reference the e-relation, but still its
semantics necessarily made it so that a) it made no sense to refer to
any real life modelable thing that wasn't already modeled by the e-
relation, b) thus semantically speaking we had an inclusion dependency
towards the e-relation already, c) of course all such semantic
constraints should be enforced if possible, d) a foreign key
constraint is pretty much The Mechanism to do that in the databases of
the past couple of decades, and finally e) foreign keys seem an awful
lot like unidirectional links, eventhough they could and should be a
lot more besides.

So, in addition to the semantics, we also have the direct implication
that an e-relation fully defines a domain (though not all domains need
to be defined via such enumeration). Then it's quite clear that all
all relations referring in any way to the kind of real life object the
e-relation represents, should be limited to utilizing the
corresponding domain/type determined by it as the type of the
attribute. (In SQL databases that means declaring foreign keys
referring to primary keys. In my ideal database it would amount to
declaring domains, and in addition to those also full inclusion
dependencies which could span any combination of attributes on both
sides, perhaps even using views in the middle to cater for the more
complex cases; I take constraints rather seriously.)

At the logical level all that was done was to declare some
dependencies, syntactic constraints, and telling a little bit about
how to read/update the data to a live person, using a controlled
vocabulary to do so (e.g. "social security number is the primary key
and is supposed to uniquely identify persons; thus if you have a
duplicate number, it's semantically fucked, leaving you only the
option of gathering even more data, until you can again make the
minimal model of the database correspond to reality").

But of course the all-round inclusion dependency also closely mimics
earlier referential semantics in its effects, and the usual way we
tend to use data typically reinforces the notion. You rarely see the
symmetry ("there is no link, there are just equal values"). Still, I
can tell you that when you do ad hoc OLAP queries long enough, you'll
eventually find yourself starting the query "against the tide". That
is then when you finally get it -- you simply couldn't have done it in
the presence of a directional link without "going through the whole
fucking thing using a for-loop" -- and off you are, into the
relational brotherhood.

It seems arbitrary and unnecessarily doctrinaire to
restrict references to only primary keys and unique indexes because I
think if one wants a particular constraint, one should not have specify
two constraints, eg., a 'reference' as well as a key or index. *This
doesn't prevent a language from offering shortcuts that are understood
to involve both.)

That is arbitrary indeed, and one of the few things where genuine
progress has been made in the theory of dependency and normalization
on the issues Codd originated. I also think that it is one of the very
few points where Codd as well fell victim to the earlier record
centric thought -- "referential integrity" often implies the mentality
that we're "linking" from some dependent data to stuff that "is
defined and primarily lives at another place". That is precisely the
record and index centric way of doing things, whereas the central
relational innovations were to a) place such references squarely at
the logical level, which makes them fully symmetric (i.e. in the
current case we might as well say that rvOrders refers to
rvRedemptions as we could say the opposite; it's not about directional
linking, but about referring to the same logical entity in two
different places at once, and just defining the propositional
semantics correctly), b) focus attention on the formal properties,
modelling and constraints (e.g. inclusion dependencies) within the
data instead of playing by real-world intuition alone, and c)
abolishing even the idea that base entities and relationships should
somehow be separated (i.e. what matters is only the formally
verifiable, syntactic dependencies that will necessarily result from
faithfully modeling the data, which then reveals that relationships
and base entities really behave much the same way).
--
Sampo

E-relations in RM/T have been introduced in order for the properties
of an entity to be represented as binary relations. As Codd wrote,
“split into as many binary relations as there are properties to be
recorded.” In these binary relations, surrogates have a key role. E.
Codd writes in RM/T, “Each surrogate appearing in this e-attribute
uniquely identifies the entity being described.”
E. Codd, however, does not show or solve the only important thing in
this paper : how to decompose any relation into binary relations. In
fact, in the RM/T he only expressed the desire for any relation to be
represented through binary relations.

E. Codd is undoubtedly among the most significant people in the
history of computer science and his work is of great and lasting
value. However, the e-relations and surrogates introduced in his paper
RM/T don’t have theoretical significance.

Vladimir Odrljin

Reply With Quote
  #45  
Old   
Mr. Scott
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 08:15 AM



"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Mr. Scott wrote:
"Marshall" <marshall.spight (AT) gmail (DOT) com> wrote in message
news:386975f9-472c-4184-8661-5c3d1e2f7621 (AT) r24g2000prf (DOT) googlegroups.com...
On Oct 24, 10:53 am, Keith H Duggar <dug... (AT) alum (DOT) mit.edu> wrote:
Anyhow, the question here is not one of our imagination but rather
simply this: if it makes sense for the RM to support constraints
on relational /values/ (taken on by variables) why does it not
make sense to support constraints on relational /expressions/?
That is a question of general principle not specific design.
This question, it seems to me, is clear and to the point.
And I would answer it by saying that we shouldn't really
even make the distinction! (At least not formally.)

I think we should make the distinction, and formally.

(p /\ q) -> r is not the same as (p -> r) /\ (q -> r)
but (p \/ q) -> r is the same as (p -> r) \/ (q -> r)

A view consisting of a natural join, for example, represents a set of
conjunctions. Each row of the join represents a conjunction of
propositions, one for each operand. A constraint defined on a join would
be of the form (p /\ q) -> r. That is definitely not the same as
constraints defined on one or more tables, which would take the form (p
\/ q) -> r.

...

I guess the attitude, interpretation if you like, that relational ops
implement logic leads to that but another attitude is that they merely
apply logic to obtain relations that consist of simple propositions. I
believe most people happily accept the latter interpretation when looking
at a relation value that has been obtained by a language devices such as
insert or assignment where the definition is based on union. The 'OR'
disappears. I think there is a big difference between the implementation
and the application of logic. Another question is what happens to the
join's conjunction when we project, does it survive or not depending on
which attributes we choose?
I'm not really sure what you mean by "The 'OR' disappears." An insert into
a union view is inherently ambiguous.

The propositions represented in the rows of a projection imply the
propositions represented in the operand of the projection. That's why when
you insert through a projection, the columns on the operand that are not
represented must either allow nulls or have a default constraint defined.

Reply With Quote
  #46  
Old   
Bob Badour
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 08:26 AM



compdb (AT) hotmail (DOT) com wrote:

Quote:
On Oct 25, 7:44 am, Bob Badour <bbad... (AT) pei (DOT) sympatico.ca> wrote:

com... (AT) hotmail (DOT) com wrote:

On Oct 24, 8:56 pm, Bob Badour <bbad... (AT) pei (DOT) sympatico.ca> wrote:

What reason would one have to define an equality comparison to compare
employee ids with department ids or with names? Or charge with current
or potential? Or age with price?

shoot your age and you are worth your weight in gold.

My age in what measure? Years? Integral or rational? If integral,
cardinal or ordinal? According to what calendar?

I was just (humorously seriously) giving examples of such peculiar
comparisons that I'd say have reasons.
But surely you understand that's what type casts are for. Right?


Quote:
Otherwise I agree with you.
Certainly the corresponding expressions depend on
the variable and function predicates.

philip

Reply With Quote
  #47  
Old   
paul c
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 10:16 AM



Mr. Scott wrote:
....
Quote:
I'm not really sure what you mean by "The 'OR' disappears." An insert into
a union view is inherently ambiguous.
...
I wasn't thinking of views. The person who wishes to assert all the
propositions in a base relation AND some other proposition uses some
language verb that is defined on UNION. Eg., if A is base, INSERT I
INTO A 'loses' the 'OR'. He has used a disjunction to form a conjunction.

(By the way, I thought the regulars here knew better than to get me
going on view updates! But since the door has opened if briefly, I'll
comment that I've never understood why insert to base is treated as an
instruction that produces a conjunction but the definition of a union
view is not treated as such an instruction. I gather this is generally
considered either heretical or cranky, possibly because it might play
hob with de Morgan etc.)

Reply With Quote
  #48  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 11:27 AM



On Oct 25, 11:06*pm, "Mr. Scott" <do_not_re... (AT) noone (DOT) com> wrote:
Quote:
I think we should make the distinction, and formally.

(p /\ q) -> r * is not the same as * (p -> r) /\ (q -> r)
but *(p \/ q) -> r * is the same as * (p -> r) \/ (q -> r)
I don't follow. If these are BA expressions with the "->" as material
implication, then

(p v q) -> r = ~(p v q) v r = (~p ^ ~q) v r
(p -> r) v (q -> r) = (~p v r) v (~q v r) = (~p v ~q) v r

If the "->" is interpreted as deduction symbol (that is partial
boolean lattice order), then

(p v q) < r is not the same as (p < r) or (q < r)

Reply With Quote
  #49  
Old   
Mr. Scott
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 02:01 PM



"Tegiri Nenashi" <tegirinenashi (AT) gmail (DOT) com> wrote

Quote:
On Oct 25, 11:06 pm, "Mr. Scott" <do_not_re... (AT) noone (DOT) com> wrote:
I think we should make the distinction, and formally.

(p /\ q) -> r is not the same as (p -> r) /\ (q -> r)
but (p \/ q) -> r is the same as (p -> r) \/ (q -> r)

I don't follow. If these are BA expressions with the "->" as material
implication, then

(p v q) -> r = ~(p v q) v r = (~p ^ ~q) v r
(p -> r) v (q -> r) = (~p v r) v (~q v r) = (~p v ~q) v r

If the "->" is interpreted as deduction symbol (that is partial
boolean lattice order), then

(p v q) < r is not the same as (p < r) or (q < r)
You're right.

(p /\ q) -> r is not the same as (p -> r) /\ (q -> r)
but (p \/ q) -> r is.

Reply With Quote
  #50  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: foreign key constraint versus referential integrity constraint - 10-26-2009 , 02:51 PM



On Oct 26, 11:01*am, "Mr. Scott" <do_not_re... (AT) noone (DOT) com> wrote:
Quote:
"Tegiri Nenashi" <tegirinena... (AT) gmail (DOT) com> wrote in message

news:84d21c7e-c6df-48a3-b5c9-f012caeded08 (AT) 12g2000pri (DOT) googlegroups.com...

On Oct 25, 11:06 pm, "Mr. Scott" <do_not_re... (AT) noone (DOT) com> wrote:
I think we should make the distinction, and formally.

(p /\ q) -> r is not the same as (p -> r) /\ (q -> r)
but (p \/ q) -> r is the same as (p -> r) \/ (q -> r)
I don't follow. If these are BA expressions with the "->" as material
implication, then

(p v q) -> r = ~(p v q) v r = (~p ^ ~q) v r
(p -> r) v (q -> r) = (~p v r) v (~q v r) = (~p v ~q) v r

If the "->" is interpreted as deduction symbol (that is partial
boolean lattice order), then

*(p v q) < r * is not the same as * (p < r) or (q < r)

You're right.

(p /\ q) -> r is not the same as (p -> r) /\ (q -> r)
but (p \/ q) -> r is.
(p ^ q) -> r = (p -> r) v (q -> r)

and, dually

(p v q) -> r = (p -> r) ^ (q -> r)

(Everything is perfectly symmetric in boolean algebra).

Now, before getting into constraint classification, what language do
you suggest to express them in? Is it RC, RA, or something else?

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.