dbTalk Databases Forums  

A different definition of MINUS, part 4

comp.databases.theory comp.databases.theory


Discuss A different definition of MINUS, part 4 in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
paul c
 
Posts: n/a

Default A different definition of MINUS, part 4 - 12-26-2008 , 01:37 PM






Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such
as the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR>
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
<NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.






















Reply With Quote
  #2  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM







"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
  #3  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM




"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
  #4  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM




"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
  #5  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM




"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
  #6  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM




"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
  #7  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM




"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
  #8  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM




"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
  #9  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM




"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
  #10  
Old   
Brian Selzer
 
Posts: n/a

Default Re: A different definition of MINUS, part 4 - 12-27-2008 , 11:09 PM




"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Parts one, two and three have various mis-steps and confusions. Let me
start again emphasizing what I think was McGoveran's paramount point -
the desire for logical data independence. I interpret the meaning of
this in an algebra to be that substituting some non-logical aspect of an
implementation for another will not require any logical aspects to be
removed, ie., any true statements to be falsified. Base and virtual
relvars are non-logical aspects in that there is no notion of them in
the algebra or calculus. In Codd's logic, ie. his algebra and calculus,
they are no different than colours, for example it does not matter in
the D&D A-algebra whether a relation is 'red' or 'green'. This does not
mean that an implementation cannot be logical, only that if its results
are to be validated logically, it must define all the concepts involved
in those results in logical, eg., algebraic, terms.

Can you be more specific in what you mean by logical data independence? I
understood it to be a property of equivalent database schemes--characterized
with respect to the Relational Model by the ability to house exactly the
same information using differing sets of relations. The concept is
orthogonal to whether relations are virtual or not. What is important is
that being able to transform the set of relations that conforms to one
database scheme into the set of relations that conforms to another database
scheme is a prerequisite for the schemes to be equivalent. What that means
is that for a view to be uncontingently updatable, there has to be an
equivalent database scheme in which the heading of a base relation matches
the heading of the view. For example, suppose you have two database schemes

scheme 1:

EMPLOYEE {EMPID, LAST, FIRST, MIDDLE, SSAN} KEY{EMPID}

scheme 2:

EMP1 {EMPID, LAST, FIRST, MIDDLE} KEY{EMPID} and
EMP2 {EMPID, SSAN} KEY{EMPID} such that
EMP1[EMPID] = EMP2[EMPID]
EMPLOYEE = EMP1 JOIN EMP2

Here the circular inclusion dependency guarantees that any update through
the EMPLOYEE view can be transformed into a legal set of updates against
EMP1 and EMP2.

I don't see how anything short of schema equivalence could be permitted--for
instance, relaxing the circular inclusion dependency--without changing the
information content of the database.

Quote:
Suppose R is virtual, A and B are base and R is defined as A JOIN B:

R = A <AND> B is true.

Now, make R base and A and B virtual (logical independence should allow
this):

A = R{HA} is true.
B = R{HB} is true.

(HA and HB are the respective headings of A and B.) To have logical
data independence in any implementation, the first equation must remain
true. If we need to falsify any other true equation as a consequence of
making R base, we don't have logical data independence. Logical
independence implies that A cannot have a tuple that is not in R{HA}.

To say that a 'delete' can have indeterminate base results is to permit
A to have a tuple that is not in R{HA} which makes one of the equations
false. Saying this denies logical independence as I see it. I think
one implication of what McGoveran is saying is that certain database
values are not logically possible when logical independence is assumed
and possibly this has to do with the motivation of those who want to
disallow delete through join. Maybe they also are saying that we can
have logical independence sometimes but not always. If so, I think this
brings into question whether dbms results can ever be validated logically.

Somebody might object that making R base as above is not always possible
because a user might have inserted a tuple into an 'A' relvar but not
into a 'B' relvar. My answer to that is that it would always be
logically possible if all three equations were always obeyed, ie., by
logical independence, the values for R, A and B must always be such as
they would be if all inserted tuples had been inserted into R.

Part of Codd's early reason for wanting what he called data independence
was surely the hierarchical db's of the time, where programs changed
frequently in very adhoc ways as a consequence of physical data
re-arrangements. But even if today there were no more hierarchical
db's, logical independence would still be needed at the very least for
optimizations.

Given the above example, let me assume logical independence so that A =
R{HA} and B = R{HB} will always be true and a result R' = A' JOIN B'
where R', A', B' are result relvars, or 'after' values if you like, is
also true. After a successful delete by an implementation, A' must have
no tuple that is not in R'{HA}, similarly for B'. However, this does
not imply that all deletes must 'delete from both sides'. It only means
that when HA = HB, A = B. Much depends on language design choices, such as
the definition of 'MINUS'. Also, I have no doubt that the Tutorial D
language is indeterminate for delete through join and I think that is a
result of the MINUS definition TD uses.

If we want these three equations to be always true as far as MINUS is
concerned, it might help if we somehow reflect them in a definition that
gives some clue as to how an implementation can proceed. One clue was
spelled out by McGoveran in the exchange, treat the view definition as
the general case. Another is implied by the second two equations,
namely the equality of projections. Equality of relations or
projections can be specified by negating the result of logical
exclusive-or, eg., in A-algebra terms: <NOT> (((<NOT> A) <AND> B) <OR
(A <AND> (<NOT> B))). Let me just write this as A <NXOR> B, so R = A
NXOR> B. Now I can try to express those clues in a different MINUS
definition:

R MINUS D is semantically equivalent to:

((R{HR1} <NXOR> R{HR2}) <AND> (<NOT> D)){HR}

where (ie., 'it is required that') HR is the heading of R
and
HR1 union HR2 = HR,

From the definition of <NXOR> it is clear that an implementation's
choices do not include deleting a tuple from R{HR1} if it does not also
delete that tuple from R{HR2} when HR1=HR2. However, if HR1 <> HR2, an
implementation is not prevented by this particular definition from
deleting a tuple only from R{HR1}, similarly only from R{HR2}.

I think this definition needs some more work, but I'll take a breath for
now.
























Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.