dbTalk Databases Forums  

more on delete from join

comp.databases.theory comp.databases.theory


Discuss more on delete from join in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
paul c
 
Posts: n/a

Default more on delete from join - 08-22-2009 , 01:31 PM






Here's another kick at the old can. The language 'delete D from J' is
usually taken to mean the facts represented by D are retracted to
produce a new value for J. Expressed algebraically, the resulting value
is usually taken to equal J MINUS D, but this expression is not very
instructive when it comes to deleting from a view.

Given a view J = A JOIN B where J, A and B are relvars, and the headings
hA and hB of A and B respectively, the equallity J = J{hA} JOIN J{hB} is
always true. We can use this 'invariant' to recast a 'delete
definition' in a way that helps instruct an implementation. Whatever
else results, this equation must always be true.

The equalities A = J{hA} and B = J{hB} don't necessarily hold because A
MINUS J{hA} and B MINUS J{hB} may contain tuples that don't match any
projection of the join. Since A MINUS J{hA} and B MINUS J{hB} cannot
represent any facts that could be retracted from J, we need to take care
to include, ie., 'assert', their values in any result that replaces the
values of A and B. (One might choose to view the tuples that represent
those relations to be 'spurious', but such a characterization isn't
necessary from a logical viewpoint, in fact it's as irrelevant as the
colour of a n integer is to arithmetic.)

If we desire to define deletion from a join, the equation J = J{hA} JOIN
J{hB} must be true both before and after any replacements of the values
of A and B. We can take advantage of this to see, fairly easily, what
changes to A and B MUST BE implied by a language's delete statement that
is defined with the algebraic expresion J MINUS D.

Assuming that we wish to use the same language to retract a fact from a
J that is a view as we would from a J that is base, we might express the
retraction by J' = J MINUS D, where the primed J' stands for the
resulting relation and D stands for the relation representing the
fact(s) we wish to retract.

If A' and B' stand for the resulting values of the base relvars,

A' = (J MINUS D} {hA} UNION (A MINUS J{hA})
B' = {J MINUS D} {hB} UNION (B MINUS J{hB}).

The equations for A' and B' don't necessarily produce the same result
when J is a join view as they do when J is base, for example, if A and B
have distinct headings and D contains a single tuple, J' = J. Here are
sample values for that 'cartesian product':

A:
a
1
2

B
b
3
4

J = A JOIN B:
a b
1 3
1 4
2 3
2 4

D:
a b
1 3

Assume bJ has the same value as J but is base:

bJ' = bJ MINUS D:
a b
1 4
2 3
2 4

However, using the above definition of delete,

A' = (J MINUS D} {hA} UNION (A MINUS J{hA}):
a
1
2

and

B' = {J MINUS D} {hB} UNION (B MINUS J{hB}):
b
3
4

so J' = J. It appears nothing has changed, which is not the case when J
is base. This may surprise a user, but there is nothing illogical about
the result, which is obtained simply by following a TTM-like algebra
without appeal to any kind of intuition. One outcome of this is that bJ
and J aren't "interchangeable". But the basic reason for the result is
that the relation that D represents has quite a different predicate from
J's predicate because it is constrained to a single tuple, whereas J is
not constrained that way. Just as the typical relational algebras don't
have any notion of delete, they don't have a notion of exception
either. That is, the algebra operations being closed over relations and
constraints being defineable only with the algebra, it is left up to
the viewer to interpret certain relation values as 'exceptions' (if they
wish to).

When the headings of A and B overlap, which is much more common, such a
definition gives results that have more practical use. Somebody who
believes that delete from join is not always logically possible might
object on the grounds that the logical complement of a join involves
three possible combinations of the A and B result relations whereas
relational closure requires us to represent all resulting relations as
single relvars or single extensions or single 'tables'. This is a red
herring because we don't need to record complements.

Of course, anybody, eg., the implementors of various SQL products, could
claim this result isn't allowed by their product's definitions. But
those are piecemeal definitions that are based on relational logic in
some places but not in others.

Reply With Quote
  #2  
Old   
paul c
 
Posts: n/a

Default Re: more on delete from join - 08-22-2009 , 07:27 PM






paul c wrote:
Quote:
...
A' = (J MINUS D} {hA} UNION (A MINUS J{hA})
(because J MINUS D = J' MINUS D.)

Reply With Quote
  #3  
Old   
paul c
 
Posts: n/a

Default Re: more on delete from join - 08-23-2009 , 04:35 PM



paul c wrote:
Quote:
paul c wrote:
...
A' = (J MINUS D} {hA} UNION (A MINUS J{hA})

(because J MINUS D = J' MINUS D.)


The reason I mentioned this is because if the algebraic result,
according to the equations I gave, is

J:
a b
1 3
1 4
2 3
2 4

then J MINUS D = J' MINUS D becomes a contradiction. There might be
other ways for a language to treat that assertion, but I would say the
more clear-cut one is to treat it as what it is, a constraint. Since
the algebraic result breaks the constraint, the result is more likely an
empty J value.

Reply With Quote
  #4  
Old   
Mr. Scott
 
Posts: n/a

Default Re: more on delete from join - 08-25-2009 , 03:59 AM



"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Here's another kick at the old can. The language 'delete D from J' is
usually taken to mean the facts represented by D are retracted to produce
a new value for J. Expressed algebraically, the resulting value is
usually taken to equal J MINUS D, but this expression is not very
instructive when it comes to deleting from a view.

Given a view J = A JOIN B where J, A and B are relvars, and the headings
hA and hB of A and B respectively, the equallity J = J{hA} JOIN J{hB} is
always true. We can use this 'invariant' to recast a 'delete definition'
in a way that helps instruct an implementation. Whatever else results,
this equation must always be true.

The equalities A = J{hA} and B = J{hB} don't necessarily hold because A
MINUS J{hA} and B MINUS J{hB} may contain tuples that don't match any
projection of the join. Since A MINUS J{hA} and B MINUS J{hB} cannot
represent any facts that could be retracted from J, we need to take care
to include, ie., 'assert', their values in any result that replaces the
values of A and B. (One might choose to view the tuples that represent
those relations to be 'spurious', but such a characterization isn't
necessary from a logical viewpoint, in fact it's as irrelevant as the
colour of a n integer is to arithmetic.)

If we desire to define deletion from a join, the equation J = J{hA} JOIN
J{hB} must be true both before and after any replacements of the values of
A and B. We can take advantage of this to see, fairly easily, what
changes to A and B MUST BE implied by a language's delete statement that
is defined with the algebraic expresion J MINUS D.

Assuming that we wish to use the same language to retract a fact from a J
that is a view as we would from a J that is base, we might express the
retraction by J' = J MINUS D, where the primed J' stands for the resulting
relation and D stands for the relation representing the fact(s) we wish to
retract.

If A' and B' stand for the resulting values of the base relvars,

A' = (J MINUS D} {hA} UNION (A MINUS J{hA})
B' = {J MINUS D} {hB} UNION (B MINUS J{hB}).
I don't think this is right. It assumes that a delete from J translates
into a delete from both A and B, when a delete from either A or B would
suffice. The predicate of a join view is the conjunction of the predicates
of the tables being joined. For there to be a row in J requires that there
be a row in both A and B, so the fact represented by a row in J is the
conjunction of the facts represented by the corresponding rows in A and B.
Denial of at least one of the facts represented in A or B denies the fact
represented in J; conversely, denial of the fact represented in J denies at
least one of the facts represented in A or B, but not necessarily both.

Assuming that your interpretation is the right one, trying to delete from a
join view would be like trying to delete from a denormalized table. One of
the reasons to normalize is to eliminate so-called update anomalies. For
example, a table that contains both customer information and order
information exhibits a delete anomaly. When the last order for a customer
is deleted, that customer's information is deleted. Decomposing the table
into a customers table and an orders table with a foreign key from orders to
customers eliminates that anomaly. If we accept your interpretation, then
trying to delete from a view that joins orders and customers reintroduces
the anomaly.

Quote:
The equations for A' and B' don't necessarily produce the same result when
J is a join view as they do when J is base, for example, if A and B have
distinct headings and D contains a single tuple, J' = J. Here are sample
values for that 'cartesian product':

A:
a
1
2

B
b
3
4

J = A JOIN B:
a b
1 3
1 4
2 3
2 4

D:
a b
1 3

Assume bJ has the same value as J but is base:

bJ' = bJ MINUS D:
a b
1 4
2 3
2 4

However, using the above definition of delete,

A' = (J MINUS D} {hA} UNION (A MINUS J{hA}):
a
1
2

and

B' = {J MINUS D} {hB} UNION (B MINUS J{hB}):
b
3
4

so J' = J. It appears nothing has changed, which is not the case when J
is base. This may surprise a user, but there is nothing illogical about
the result, which is obtained simply by following a TTM-like algebra
without appeal to any kind of intuition. One outcome of this is that bJ
and J aren't "interchangeable". But the basic reason for the result is
that the relation that D represents has quite a different predicate from
J's predicate because it is constrained to a single tuple, whereas J is
not constrained that way. Just as the typical relational algebras don't
have any notion of delete, they don't have a notion of exception either.
That is, the algebra operations being closed over relations and
constraints being defineable only with the algebra, it is left up to the
viewer to interpret certain relation values as 'exceptions' (if they wish
to).
When the headings of A and B overlap, which is much more common, such a
definition gives results that have more practical use. Somebody who
believes that delete from join is not always logically possible might
object on the grounds that the logical complement of a join involves three
possible combinations of the A and B result relations whereas relational
closure requires us to represent all resulting relations as single relvars
or single extensions or single 'tables'. This is a red herring because we
don't need to record complements.

Of course, anybody, eg., the implementors of various SQL products, could
claim this result isn't allowed by their product's definitions. But those
are piecemeal definitions that are based on relational logic in some
places but not in others.




Reply With Quote
  #5  
Old   
Walter Mitty
 
Posts: n/a

Default Re: more on delete from join - 08-25-2009 , 08:02 AM



"Mr. Scott" <do_not_reply (AT) noone (DOT) com> wrote


Quote:
Assuming that your interpretation is the right one, trying to delete from
a join view would be like trying to delete from a denormalized table. One
of the reasons to normalize is to eliminate so-called update anomalies.
For example, a table that contains both customer information and order
information exhibits a delete anomaly. When the last order for a customer
is deleted, that customer's information is deleted. Decomposing the table
into a customers table and an orders table with a foreign key from orders
to customers eliminates that anomaly. If we accept your interpretation,
then trying to delete from a view that joins orders and customers
reintroduces the anomaly.
The delete anomaly for unnormalized tables arises from the fact that delete
operates on entire rows. If you could delete only part of a row, without
deleteing
the other part, it woild be possible to avoid the particular delete anomaly
you mentioned.

When the last order pertaining to a customer is deleted, you only delete the
data pertaining to the order, but you keep the row in the table with the
customer data intact.

So how the heck to do you delete part of a row without deleting the entire
row? Easy. You set the data you are removing to NULL.

I'm not seriously suggesting this as a practice. I'm just showing how the
updateable view question, the NULL question, and normalization are
interconnected.

Reply With Quote
  #6  
Old   
paul c
 
Posts: n/a

Default Re: more on delete from join - 08-25-2009 , 10:01 AM



Mr. Scott wrote:
Quote:
"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote in message
news:IPWjm.40275$Db2.30224 (AT) edtnps83 (DOT) ..
....
If A' and B' stand for the resulting values of the base relvars,

A' = (J MINUS D} {hA} UNION (A MINUS J{hA})
B' = {J MINUS D} {hB} UNION (B MINUS J{hB}).

I don't think this is right. It assumes that a delete from J translates
into a delete from both A and B, when a delete from either A or B would
suffice. The predicate of a join view is the conjunction of the predicates
of the tables being joined. For there to be a row in J requires that there
be a row in both A and B, so the fact represented by a row in J is the
conjunction of the facts represented by the corresponding rows in A and B.
Denial of at least one of the facts represented in A or B denies the fact
represented in J; conversely, denial of the fact represented in J denies at
least one of the facts represented in A or B, but not necessarily both.
...
Yes, it does make that assumption, consequence of logical independence I
think.

Reply With Quote
  #7  
Old   
Bob Badour
 
Posts: n/a

Default Re: more on delete from join - 08-25-2009 , 11:46 AM



paul c wrote:

Quote:
Mr. Scott wrote:

"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote in message
news:IPWjm.40275$Db2.30224 (AT) edtnps83 (DOT) ..

...

If A' and B' stand for the resulting values of the base relvars,

A' = (J MINUS D} {hA} UNION (A MINUS J{hA})
B' = {J MINUS D} {hB} UNION (B MINUS J{hB}).


I don't think this is right. It assumes that a delete from J
translates into a delete from both A and B, when a delete from either
A or B would suffice. The predicate of a join view is the conjunction
of the predicates of the tables being joined. For there to be a row
in J requires that there be a row in both A and B, so the fact
represented by a row in J is the conjunction of the facts represented
by the corresponding rows in A and B. Denial of at least one of the
facts represented in A or B denies the fact represented in J;
conversely, denial of the fact represented in J denies at least one of
the facts represented in A or B, but not necessarily both.
...

Yes, it does make that assumption, consequence of logical independence I
think.
Preservation of symmetry as well. However, I see no theory to support
either choice, which is why I prefer the pragmatism of allowing an
expert user to specify how the delete should operate in this situation.

Reply With Quote
  #8  
Old   
paul c
 
Posts: n/a

Default Re: more on delete from join - 08-25-2009 , 09:47 PM



Bob Badour wrote:
Quote:
paul c wrote:

Mr. Scott wrote:

....
I don't think this is right. It assumes that a delete from J
translates into a delete from both A and B, when a delete from either
A or B would suffice. ...
Yes, it does make that assumption, consequence of logical independence
I think.

Preservation of symmetry as well. However, I see no theory to support
either choice, which is why I prefer the pragmatism of allowing an
expert user to specify how the delete should operate in this situation.

No argument that people who design the predicates in the first place
should be able to trump any dbms conventions they want to, although I
doubt if more than a few at most of today's dbms'es can express
constraints flexibly enough to always allow that. Maybe I'm talking
about a convention rather than theory, but my basic point seems logical
to me and might go like this: While a relation represents the extension
of a predicate, there is no way for a dbms designer (as opposed to a db
designer) to choose whether a particular relation value formed by <OR>
or UNION has a predicate that involves disjunction, at least in the
absence of explicit constraints to prevent or void results that are not
wanted. Since all propositions represented by the tuples of a relation
(at least tuples in named relations) are taken to be true as far as that
relation is concerned (maybe false in the face of other relations, but
if false, they're not individually false, rather their logical
conjunction is false), the fact that a named relation might have been
formed by means of disjunction isn't of any consistent use to a
mechanical engine. Most base relations are formed with disjunction and
I don't see why views should be treated differently. In the absence of
contrary constraints, it seems most reasonable to me for a dbms, say
within the scope of a named view, to 'assume' identical predicates for
relations with equal headers.

My view of RT is extremely narrow, partly to avoid mysticism and partly
to see the minimum machinery a dbms really needs. I think a relation's
'definition' consists of nothing more than a header and constraints and
the constraints must all be expressible in the chosen algebra (this
doesn't mean I think the dbms needs to directly execute the algebra).

Reply With Quote
  #9  
Old   
Mr. Scott
 
Posts: n/a

Default Re: more on delete from join - 08-26-2009 , 12:43 AM



"Walter Mitty" <wamitty (AT) verizon (DOT) net> wrote

Quote:
"Mr. Scott" <do_not_reply (AT) noone (DOT) com> wrote in message
news:VqydnQmqmaKYNQ7XnZ2dnUVZ_uudnZ2d (AT) giganews (DOT) com...

Assuming that your interpretation is the right one, trying to delete from
a join view would be like trying to delete from a denormalized table.
One of the reasons to normalize is to eliminate so-called update
anomalies. For example, a table that contains both customer information
and order information exhibits a delete anomaly. When the last order for
a customer is deleted, that customer's information is deleted.
Decomposing the table into a customers table and an orders table with a
foreign key from orders to customers eliminates that anomaly. If we
accept your interpretation, then trying to delete from a view that joins
orders and customers reintroduces the anomaly.

The delete anomaly for unnormalized tables arises from the fact that
delete operates on entire rows. If you could delete only part of a row,
without deleteing
the other part, it woild be possible to avoid the particular delete
anomaly you mentioned.

When the last order pertaining to a customer is deleted, you only delete
the data pertaining to the order, but you keep the row in the table with
the customer data intact.

So how the heck to do you delete part of a row without deleting the entire
row? Easy. You set the data you are removing to NULL.

I'm not seriously suggesting this as a practice. I'm just showing how the
updateable view question, the NULL question, and normalization are
interconnected.
They are related. It has do with predicates.

Pkx A table in 6NF

Pkx IFF Qky A table in 5NF but not 6NF
or two 6NF tables with mutual
foreign keys
Pkx IFF Qky IFF Rkz A table in 5NF but not 6NF
or three 6NF tables in a referential
cycle
Pkx AND Qky A join view
or a denormalized table
Pkx IMPLIES Qky A join view over FK P to Q
or a denormalized table
Qky IMPLIES Pky A join view over FK Q to P
or a denormalized table
Pk XOR Qkx A table that permits inapplicable nulls

Pk OR Qkx A table that permits I-don't-have-a-clue nulls

Pkx OR Qkx A union view
or just two tables
Pkx XOR Qkx A disjoint union view
or two tables with a distributed key

Note that none of these predicates preclude applicable nulls. If every
applicable null is in logic a distinct individual variable, then some of the
problems with null can disappear. For example, SUM(A) = SUM(A) would be
true, as would SUM(A) + SUM(B) = SUM(A+B), because v = v and u = u, so v + u
= v + u when u and v are arbitrary but distinct instances of applicable
nulls.

Reply With Quote
  #10  
Old   
Kevin Kirkpatrick
 
Posts: n/a

Default Re: more on delete from join - 08-26-2009 , 10:16 AM



On Aug 25, 9:47*pm, paul c <toledobythe... (AT) oohay (DOT) ac> wrote:
Quote:
Bob Badour wrote:
paul c wrote:

Mr. Scott wrote:

...
I don't think this is right. *It assumes that a delete from J
translates into a delete from both A and B, when a delete from either
A or B would suffice. *...
Yes, it does make that assumption, consequence of logical independence
I think.

Preservation of symmetry as well. However, I see no theory to support
either choice, which is why I prefer the pragmatism of allowing an
expert user to specify how the delete should operate in this situation.

No argument that people who design the predicates in the first place
should be able to trump any dbms conventions they want to, although I
doubt if more than a few at most of today's dbms'es can express
constraints flexibly enough to always allow that. *Maybe I'm talking
about a convention rather than theory, but my basic point seems logical
to me and might go like this: *While a relation represents the extension
of a predicate, there is no way for a dbms designer (as opposed to a db
designer) to choose whether a particular relation value formed by <OR
or UNION has a predicate that involves disjunction, at least in the
absence of explicit constraints to prevent or void results that are not
wanted. *Since all propositions represented by the tuples of a relation
(at least tuples in named relations) are taken to be true as far as that
relation is concerned (maybe false in the face of other relations, but
if false, they're not individually false, rather their logical
conjunction is false), the fact that a named relation might have been
formed by means of disjunction isn't of any consistent use to a
mechanical engine. *Most base relations are formed with disjunction and
I don't see why views should be treated differently. *In the absence of
contrary constraints, it seems most reasonable to me for a dbms, say
within the scope of a named view, to 'assume' identical predicates for
relations with equal headers.

My view of RT is extremely narrow, partly to avoid mysticism and partly
to see the minimum machinery a dbms really needs. *I think a relation's
'definition' consists of nothing more than a header and constraints and
the constraints must all be expressible in the chosen algebra (this
doesn't mean I think the dbms needs to directly execute the algebra).
By the narrowest interpretation of RT, shouldn't the whole "view
update" problem just be tossed out once and for all? Tuples in base
relvars represent business meaningful propositions, and tuples in
virtural relvars represent business meaningful conclusions drawn from
base relvar propositions. As such, view updates amount to letting
end-users assert conclusions and having the DBMS make "educated
guesses" at the appropriate modifications to business meaningful
propositions such that those conclusions will be reached. Seems
inherently mystical to me.

In short, before delving into, "*how* should the DBMS handle view
updates?", I'd like to see a discussion about the question, "*should*
the DBMS handle view updates?".

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.