dbTalk Databases Forums  

a union is always a join!

comp.databases.theory comp.databases.theory


Discuss a union is always a join! in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
paul c
 
Posts: n/a

Default a union is always a join! - 01-05-2009 , 03:56 PM






Criticisms, please (preferably ones based on some formal logic or other).


It is inescapable that every relation is a join (eg., Heath's theorem).
So every relvar points to a join. If we can't 'delete' through a join,
we can't delete from any relvar (my father, who thought a disk buffer
was a polishing material could have concluded this - also, I surmise, we
can't logically express some number of otherwise possible relations
when we have some purpose besides defining a 'view').


The reason for every relation being a join has nothing to do with the
expression that forms a join 'view'. I think it's a consequence of
Codd's algebra that every relation that has more than one subset of
attributes is a join of two or more relations. By Heath, there will
always be two heading subsets (at least one of them having the same
heading as the relation) that will determine the other possible pairs of
heading subsets in a join. Because the join on all of a relation's
attributes with any subset of them is algebraically equal to the
relation, it looks to me that the number of possible and irreducible
join expressions is at least equal to the number of subsets of
attributes in a relation heading. (This is implicit. Explicit
constraints may have a number of expressions that are fewer and still
sufficient to express all the possible expressions.)


For the same reason, every 'insert' is through a join, even if the
'insert' is also to a 'view' formed by union. One can equally say that
some relations involve union, but this doesn't change the fact that they
are also joins, in other words, not all joins are algebraically equal to
some union or other, ie., if a relation is formed by join, it is not
always also a union.


If so, it should be far more productive to design a language based on
axioms that are universal, the theorems that are always true, rather
than on statements that are true only sometimes.



Reply With Quote
  #2  
Old   
Brian Selzer
 
Posts: n/a

Default Re: a union is always a join! - 01-05-2009 , 11:14 PM







"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Criticisms, please (preferably ones based on some formal logic or other).

I think your reasoning is circular: A relation is a join because it can be
joined to a projection of itself over a subset of its own heading?

Quote:
It is inescapable that every relation is a join (eg., Heath's theorem).
So every relvar points to a join. If we can't 'delete' through a join, we
can't delete from any relvar (my father, who thought a disk buffer was a
polishing material could have concluded this - also, I surmise, we can't
logically express some number of otherwise possible relations when we
have some purpose besides defining a 'view').


The reason for every relation being a join has nothing to do with the
expression that forms a join 'view'. I think it's a consequence of
Codd's algebra that every relation that has more than one subset of
attributes is a join of two or more relations. By Heath, there will
always be two heading subsets (at least one of them having the same
heading as the relation) that will determine the other possible pairs of
heading subsets in a join. Because the join on all of a relation's
attributes with any subset of them is algebraically equal to the relation,
it looks to me that the number of possible and irreducible join
expressions is at least equal to the number of subsets of attributes in a
relation heading. (This is implicit. Explicit constraints may have a
number of expressions that are fewer and still sufficient to express all
the possible expressions.)


For the same reason, every 'insert' is through a join, even if the
'insert' is also to a 'view' formed by union. One can equally say that
some relations involve union, but this doesn't change the fact that they
are also joins, in other words, not all joins are algebraically equal to
some union or other, ie., if a relation is formed by join, it is not
always also a union.

If so, it should be far more productive to design a language based on
axioms that are universal, the theorems that are always true, rather than
on statements that are true only sometimes.





Reply With Quote
  #3  
Old   
rpost
 
Posts: n/a

Default Re: a union is always a join! - 01-07-2009 , 01:40 PM



paul c wrote:

Quote:
Criticisms, please (preferably ones based on some formal logic or other).

It is inescapable that every relation is a join (eg., Heath's theorem).
As Brian writes, this is one step too far. Every relation can be
decomposed into smaller ones based on its nontrivial functional
dependencies, but if it hasn't any - and the resulting relations don't -
it is only a join of itself.

Perhaps by join you mean something like the AND in Darwen's 'A New
Relational Algebra' (thanks for giving its URL earlier), which has the
join and and union of relational algebra as special cases. We can also
decompose relations by writing them as the unions of other relations
(someone I know wrote a PhD thesis about it), but Heath isn't about that.

Quote:
So every relvar points to a join. If we can't 'delete' through a join,
we can't delete from any relvar (my father, who thought a disk buffer
was a polishing material could have concluded this - also, I surmise, we
can't logically express some number of otherwise possible relations
when we have some purpose besides defining a 'view').
I don't follow. If we know the dependencies,
some deletions are possible, while others aren't.
How is this relevant if you don't make a distinction between 'base'
and 'virtual' relations? (Or 'time varying' relations and views,
whatever.)

Quote:
The reason for every relation being a join has nothing to do with the
expression that forms a join 'view'. I think it's a consequence of
Codd's algebra that every relation that has more than one subset of
attributes is a join of two or more relations.
If by 'subset of attributes' you mean 'key', you're right.

Quote:
By Heath, there will
always be two heading subsets (at least one of them having the same
heading as the relation) that will determine the other possible pairs of
heading subsets in a join.
Heath has an 'if' clause; aren't you overlooking it?

Quote:
Because the join on all of a relation's
attributes with any subset of them is algebraically equal to the
relation, it looks to me that the number of possible and irreducible
join expressions is at least equal to the number of subsets of
attributes in a relation heading. (This is implicit. Explicit
constraints may have a number of expressions that are fewer and still
sufficient to express all the possible expressions.)
Yes, but what are trivial joins good for?

Quote:
For the same reason, every 'insert' is through a join, even if the
'insert' is also to a 'view' formed by union.
An insert extends a relation vertically (it adds tuples); a relational
algebra join horizontally (it extends tuples with attributes and values),
but when projecting these attributes away again, hasn't added anything).
An insert can be expressed as a (relational algebra) union of the existing
relation with the inserted tuples, but no nontrivial r.a. union is a
r.a. join (and vice versa). Using the term 'join' for Darwen's AND
operator, which combines r.a. union and join, doesn't change this.

What are you trying to achieve?

Quote:
One can equally say that
some relations involve union, but this doesn't change the fact that they
are also joins, in other words, not all joins are algebraically equal to
some union or other, ie., if a relation is formed by join, it is not
always also a union.
In the relational algebra, a nontrivial union is never a join or vice versa.
What does combining them into "AND" buy you?

Quote:
If so, it should be far more productive to design a language based on
axioms that are universal, the theorems that are always true, rather
than on statements that are true only sometimes.
I don't see what this has to do with anything you wrote before.

--
Reinier


Reply With Quote
  #4  
Old   
paul c
 
Posts: n/a

Default Re: a union is always a join! - 01-07-2009 , 04:45 PM



rpost wrote:
Quote:
paul c wrote:

Criticisms, please (preferably ones based on some formal logic or other).

It is inescapable that every relation is a join (eg., Heath's theorem).

As Brian writes, this is one step too far. Every relation can be
decomposed into smaller ones based on its nontrivial functional
dependencies, but if it hasn't any - and the resulting relations don't -
it is only a join of itself.
...
All relations have at least n+1 fd's, even if trivial, where n is the number of attributes. All relations with at least one attribute (all but 'Dee') are a join of at least two (non-empty) relations.

Quote:
Perhaps by join you mean something like the AND in Darwen's 'A New
Relational Algebra' (thanks for giving its URL earlier), which has the
join and and union of relational algebra as special cases. We can also
decompose relations by writing them as the unions of other relations
(someone I know wrote a PhD thesis about it), but Heath isn't about that.
...
<AND> or the equivalent in another algebra is the only way to understand what join and union really mean, ie., are actually capable of being interpreted as. Some 'unions' can only be expressed as the union of themselves and an empty relation. I think it is helpful to understand union from some formal definition that acknowledges the logical possibility that two relations may not be 'union' compatible, eg., the '<OR>' operation or equivalent. Understanding relation structure based on empty relations seems as pointless/useless to me as understanding relational complement as being 'simple' complement. In fact, I'd rather call simple complement the vague complement and relative complement the exact complement. Why eschew a more precise interpretation when it's staring us in the face?

Quote:
algebra join horizontally (it extends tuples with attributes and values),
but when projecting these attributes away again, hasn't added anything).
An insert can be expressed as a (relational algebra) union of the existing
relation with the inserted tuples, but no nontrivial r.a. union is a
r.a. join (and vice versa). Using the term 'join' for Darwen's AND
operator, which combines r.a. union and join, doesn't change this.

What are you trying to achieve?
...
Originally, the most potent axioms for an implementation language that follows McGoveran's suggestions, ie., ones that recognize relation structure as opposed to how a relation value may have been arbitrarily formed. The way most people talk of Codd's algebra is simplistic because they have typically not tried to interpret it, probably have never even read it, for example, in the above, what do horizontal and vertical coordinates have to do with anything relational?. This is faux technocracy, usually involving what de Bono called porridge words which in this case are additional lingo to disguise a lack of basic understand (and which is how I believe Codd's algebra is usually 'taught', or thought to be 'taught'. But more to the point, I continue to be amazed that people claim they can 'insert' a tuple to a relvar but cannot 'delete' that very tuple! (vice-versa too.) I think when they say they can't delete, what they are really saying is that they can't delete a tuple that
is not in the join! The same goes for asserting and retracting a proposition in general. I've no doubt that it might be difficult to retract a proposition that one has never asserted! In a way, this is no surprise to me and I still don't know why people think it is a problem, when all human affairs contain much mumbo-jumbo that is logically make-believe. Now that you mention it, another goal would be to prevent geometric interpretations of the basic algebraic structures! Sounds like tables or arrays are getting mixed up with relations.



Reply With Quote
  #5  
Old   
Brian Selzer
 
Posts: n/a

Default Re: a union is always a join! - 01-07-2009 , 09:35 PM



This is really very simple, if you choose to actually use the mass of tissue
between your ears.

~(p /\ q) = ~p \/ ~q

So if ~(p /\ q), then which is false? p? q? both? All we can determine
for certain is that it is definitely not neither.

"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
rpost wrote:
paul c wrote:

Criticisms, please (preferably ones based on some formal logic or
other).

It is inescapable that every relation is a join (eg., Heath's theorem).

As Brian writes, this is one step too far. Every relation can be
decomposed into smaller ones based on its nontrivial functional
dependencies, but if it hasn't any - and the resulting relations don't -
it is only a join of itself.
...

All relations have at least n+1 fd's, even if trivial, where n is the
number of attributes. All relations with at least one attribute (all but
'Dee') are a join of at least two (non-empty) relations.

Perhaps by join you mean something like the AND in Darwen's 'A New
Relational Algebra' (thanks for giving its URL earlier), which has the
join and and union of relational algebra as special cases. We can also
decompose relations by writing them as the unions of other relations
(someone I know wrote a PhD thesis about it), but Heath isn't about that.
...

AND> or the equivalent in another algebra is the only way to understand
what join and union really mean, ie., are actually capable of being
interpreted as. Some 'unions' can only be expressed as the union of
themselves and an empty relation. I think it is helpful to understand
union from some formal definition that acknowledges the logical
possibility that two relations may not be 'union' compatible, eg., the
'<OR>' operation or equivalent. Understanding relation structure based on
empty relations seems as pointless/useless to me as understanding
relational complement as being 'simple' complement. In fact, I'd rather
call simple complement the vague complement and relative complement the
exact complement. Why eschew a more precise interpretation when it's
staring us in the face?

algebra join horizontally (it extends tuples with attributes and values),
but when projecting these attributes away again, hasn't added anything).
An insert can be expressed as a (relational algebra) union of the
existing
relation with the inserted tuples, but no nontrivial r.a. union is a
r.a. join (and vice versa). Using the term 'join' for Darwen's AND
operator, which combines r.a. union and join, doesn't change this.

What are you trying to achieve?
...

Originally, the most potent axioms for an implementation language that
follows McGoveran's suggestions, ie., ones that recognize relation
structure as opposed to how a relation value may have been arbitrarily
formed. The way most people talk of Codd's algebra is simplistic because
they have typically not tried to interpret it, probably have never even
read it, for example, in the above, what do horizontal and vertical
coordinates have to do with anything relational?. This is faux
technocracy, usually involving what de Bono called porridge words which in
this case are additional lingo to disguise a lack of basic understand (and
which is how I believe Codd's algebra is usually 'taught', or thought to
be 'taught'. But more to the point, I continue to be amazed that people
claim they can 'insert' a tuple to a relvar but cannot 'delete' that very
tuple! (vice-versa too.) I think when they say they can't delete, what
they are really saying is that they can't delete a tuple that
is not in the join! The same goes for asserting and retracting a
proposition in general. I've no doubt that it might be difficult to
retract a proposition that one has never asserted! In a way, this is no
surprise to me and I still don't know why people think it is a problem,
when all human affairs contain much mumbo-jumbo that is logically
make-believe. Now that you mention it, another goal would be to prevent
geometric interpretations of the basic algebraic structures! Sounds like
tables or arrays are getting mixed up with relations.




Reply With Quote
  #6  
Old   
rpost
 
Posts: n/a

Default Re: a union is always a join! - 01-08-2009 , 01:21 PM



paul c wrote:

Quote:
rpost wrote:
paul c wrote:

Criticisms, please (preferably ones based on some formal logic or other).

It is inescapable that every relation is a join (eg., Heath's theorem).

As Brian writes, this is one step too far. Every relation can be
decomposed into smaller ones based on its nontrivial functional
dependencies, but if it hasn't any - and the resulting relations don't -
it is only a join of itself.
...

All relations have at least n+1 fd's, even if trivial, where n is the
number of attributes. All relations with at least one attribute (all
but 'Dee') are a join of at least two (non-empty) relations.
Yes, trivial ones: one is the relation itself.
Nontrivial ones exist only when the relation has (nontrivial)
join dependencies, e.g. a (nontrivial) key.

Quote:
Perhaps by join you mean something like the AND in Darwen's 'A New
Relational Algebra' (thanks for giving its URL earlier), which has the
join and and union of relational algebra as special cases. We can also
decompose relations by writing them as the unions of other relations
(someone I know wrote a PhD thesis about it), but Heath isn't about that.
...

AND> or the equivalent in another algebra is the only way to understand
what join and union really mean, ie., are actually capable of being
interpreted as.
I don't understand this, they are orthogonal operations, as I explained.

Quote:
Some 'unions' can only be expressed as the union of
themselves and an empty relation.
I don't understand your interest in trivial unions or trivial joins.

Quote:
I think it is helpful to understand
union from some formal definition that acknowledges the logical
possibility that two relations may not be 'union' compatible, eg., the
'<OR>' operation or equivalent.
The relational algebra union does so by not being defined for those
cases, but I agree with your and Darwen's sentiment that it is nicer to
extend operations to cover all cases when this can be usefully done,
in fact I've worked on a similar generalization of operations for a
different query language. But this doesn't change the fact that union
and join are fundamentally orthogonal, in the sense that one combines
relations vertically, the other horizontally.

So I do sympathize with his OR and NOT; but I also feel
the problems with them are more than 'computational issues'.

Quote:
Understanding relation structure based
on empty relations seems as pointless/useless to me as understanding
relational complement as being 'simple' complement. In fact, I'd rather
call simple complement the vague complement and relative complement the
exact complement. Why eschew a more precise interpretation when it's
staring us in the face?
I'm trying to make sense of this by reading Darwen's NOT
for 'relational complement' and relational algebra's MINUS
for 'simple complement'. Is that what you mean?

In that case, I don't see what you mean by preciseness.
Darwen's AND, OR and NOT are more general operations than
those of the relational algebra, but they are not any more precise.
And the price he pays for allowing them is worse than computational:
the tuples of the relations we create with them can no longer all
be regarded as expressing positive facts. We now have to know how a
resulting relation was produced before we can interpret it.
We *lose* preciseness.

Quote:
What are you trying to achieve?
...

Originally, the most potent axioms for an implementation language that
follows McGoveran's suggestions, ie., ones that recognize relation
structure as opposed to how a relation value may have been arbitrarily
formed.
*Implementation* language?

Relational algebra (with possible extensions such as transitive
closure) already gives us both mathematical simplicity (the axiom aspect)
and implementation orientedness (it is targeted at optimization).
Darwen's algebra isn't any more axiomatic, it is *less* precise,
as we no longer know how to interpret the relations we produce.
and *not* implementation oriented, because NOT and OR cannot actually
be implemented as they are (try NOT-ing the table {A: 1, B: 1, C: 1},
where A, B, C can take arbitrary floating point values). I would
venture that the first thing any implementer will attempt to do
is rewrite all of Darwen's expressions back to relational algebra as
far as possible, and throw errors on the remainder. Then again, there
is no doubt a lot of theory on working with partly disjunctively or
negatively interpreted relations that I'm not aware of.

The same objection holds for treating selection as join with mathematical
relations (in Darwen's section 'Treating operators as relations'.)
A relation in a relational database is of a very different nature
than a mathematical relation such as PLUS. A database relation
consists of a posteriori knowledge, information about the world.
A mathematical relation is a priori: it reflects information
about mathematical constructs, not information about the world.

A database relation is established by someone or something who observes
facts in the world and makes sure the tuples of the relation reflect them.
The reason databases exist is that we want to store and then retrieve such
information. We typically want to be able to write a query that produces,
say, the relation reflecting the municipalities of the country with their
mayors. Typically, we want such a query to always produce the latest
known version of that information without having to specifically rewrite
it to make it do so; in which case the relation is 'time-varying', i.e.
its contents will need to be updated from time to time.

None of this is true for mathematical relations. They are invariant.
Most are infinite. It is indeed mathematically possible to express
arbitrary selection expressions in terms of relational joins, but
this only makes them hard to read and write, it obscures implementation
aspects such as performance, and I can't think of a benefit.
Nice as an academic exercise (look, ma, I removed a concept)
but not as an academic result (no useful purpose).

Quote:
The way most people talk of Codd's algebra is simplistic
because they have typically not tried to interpret it, probably have
never even read it, for example, in the above, what do horizontal and
vertical coordinates have to do with anything relational?.
You know very well I wasn't talking about coordinates,
it was just a shorthand that is completely *obvious*
(considering that we usually visualize relations as tables)
and that I *carefully explained* just to be sure you would get it.
If you still didn't, I'm really sorry but I don't think it's my fault.

Quote:
This is faux
technocracy, usually involving what de Bono called porridge words which
in this case are additional lingo to disguise a lack of basic understand
Covering up a lack of reading comprehension with insults, as you have a
habit of doing, doesn't exactly help your credibility, mister. Do you
want help on understanding relational theory, or don't you?

Quote:
(and which is how I believe Codd's algebra is usually 'taught', or
thought to be 'taught'. But more to the point, I continue to be amazed
that people claim they can 'insert' a tuple to a relvar but cannot
'delete' that very tuple! (vice-versa too.)
I can make sense of that remark in various ways so I'm not sure how
to respond. In relational algebra this is well possible, because of
the possible creation of symmetries that selections cannot 'break'.
Please clarify.

Quote:
I think when they say they
can't delete, what they are really saying is that they can't delete a
tuple that
is not in the join! The same goes for asserting and retracting a
proposition in general.
I've no doubt that it might be difficult to
retract a proposition that one has never asserted! In a way, this is no
surprise to me and I still don't know why people think it is a problem,
I'm not sure what you mean. Maybe it depends on what kinds of facts a
database is assumed to contain; assuming that every tuple corresponds to
a fact and absence of a tuple to its negation really simplifies things,
but I don't know if you do. (Does make any sense as a reply?)

Quote:
when all human affairs contain much mumbo-jumbo that is logically
make-believe. Now that you mention it, another goal would be to prevent
geometric interpretations of the basic algebraic structures! Sounds
like tables or arrays are getting mixed up with relations.
Yes, to you it does. You see a word, it triggers an association in your
mind, and you start blathering about the association instead of focusing
on trying to understand what is actually meant. You're in bad company
here, and I certainly won't exclude myself, but if you want to make
progress on this algebra stuff, a change of attitude is advisable.

---
Reinier


Reply With Quote
  #7  
Old   
compdb@hotmail.com
 
Posts: n/a

Default Re: a union is always a join! - 01-29-2009 , 04:05 PM



On Jan 8, 11:21 am, rp... (AT) pcwin518 (DOT) campus.tue.nl (rpost) wrote:

Reinier,

There are fundamentals of Date& Darwen's "algebra" A that you don't
understand.

Definitions:
AND is JOIN.
NOT R has exactly the tuples typed by R that are not in R.
R OR Q has exactly those tuples typed by R JOIN Q which
give a tuple from R when projected on R's attributes or give a tuple
from Q when projected on Q's attributes.
R MINUS (for permitted R and Q) is R AND NOT Q.
R MINUS (for permitted R and Q) Q is also R OR Q.

Quote:
Darwen's AND, OR and NOT are more general operations than
those of the relational algebra, but they are not any more precise.
Agree.

Quote:
And the price he pays for allowing them is worse than computational:
the tuples of the relations we create with them can no longer all
be regarded as expressing positive facts.
Not so.
Queries are interpreted exactly the same as always.
Tuples that are present are true and those that are not present are
false.

Quote:
*We now have to know how a
resulting relation was produced before we can interpret it.
We *lose* preciseness.
Not so.
Interpretation is the same as always.
I don't know what you think the problem is.
Try to give an example.

Quote:
Darwen's algebra isn't any more axiomatic, it is *less* precise,
....
and *not* implementation oriented, because NOT and OR cannot actually
be implemented as they are (try NOT-ing the table {A: 1, B: 1, C: 1},
where A, B, C can take arbitrary floating point values).
Of course they can be implemented.
It's true that they denote large relations, but that's not necessarily
a problem.
First, you could allow only queries in NOT and OR that can be recast
using MINUS and UNION.
Second, you could allow non-query expressions but not queries in NOT
and OR.
Third, the cost of a query in NOT or OR is not necessarily
unacceptable.
It depends on the size of the attribute types rather than the number
of tuples in the relation,
but you could allow them and and still compute them (eg if the result
is small enough) (or in a lazy evaluation context).
The benefit is that NOT and OR can give clearer queries even if you
limit yourself to recastable queries
(they roughly mean "not" and "or" in the sense that
roughly JOIN means "and", MINUS means (limited) "and not" and UNION
means (limited) "or").

Quote:
I would
venture that the first thing any implementer will attempt to do
is rewrite all of Darwen's expressions back to relational algebra as
far as possible, and throw errors on the remainder.
You use "venture", "attempt" and "as far as possible" but the
semantics are clear.

Quote:
*Then again, there
is no doubt a lot of theory on working with partly disjunctively or
negatively interpreted relations that I'm not aware of.
Expressions in terms of NOT and OR are not interpreted differently.
Expressions that cannot be recast using MINUS and UNION instead of NOT
and OR
just might contain a lot of tuples.

Quote:
The same objection holds for treating selection as join with mathematical
relations (in Darwen's section 'Treating operators as relations'.)
A relation in a relational database is of a very different nature
than a mathematical relation such as PLUS. *A database relation
consists of a posteriori knowledge, information about the world.
A mathematical relation is a priori: it reflects information
about mathematical constructs, not information about the world.
Not so.
You can either think of PLUS as a named relation value or as a
variable that doesn't happen to change.
Otherwise everything is as usual.
Eg R add C as A-B means R COMPOSE (PLUS rename ARG1 as C, ARG2 as B,
RESULT as A)
(you need a parameter naming convention if you want to mix named with
positional parameters).
In your informal terms, PLUS is information about the world; that info
just doesn't happen to change.

Quote:
Nice as an academic exercise (look, ma, I removed a concept)
but not as an academic result (no useful purpose).
It is practical and not an exercise to simplify and generalize.
The basic consequence is that the difference between operators and
relations becomes only syntactic.
You can use a relation wherever you can use an operator and vice
versa.
This can be extended to include user-defined operations for which,
again,
the complexity for recastable queries is the same as the recast query,
and you need a policy for non-recastable queries (and further policies
for side-effects).
If the D&D presentation were more formal and complete this would be
more evident.

BTW, using NOT and OR is equivalent to using MINUS and UNION along
with a named single-attribute relation value
for each type each containing all that type's values.
Such a database has the same operation complexity as usual but some of
the leaves (the type-relations) are big.
So the cost of evaluating corresponding queries is the same.
Again, it is notationally more elegant and exactly as computationally
expensive
to allow the latter with the same recast-restrictions you would apply
to the former.

philip



Reply With Quote
  #8  
Old   
compdb@hotmail.com
 
Posts: n/a

Default Re: a union is always a join! - 01-29-2009 , 11:05 PM



On Jan 29, 2:05*pm, com... (AT) hotmail (DOT) com wrote:

Quote:
R MINUS (for permitted R and Q) Q is also R OR Q.
R UNION Q (for permitted R and Q) is R OR Q.

Quote:
Eg R add C as A-B means R COMPOSE (PLUS rename ARG1 as C, ARG2 as B,
RESULT as A)
Eg R add A-B as C means R JOIN (PLUS rename ARG1 as C, ARG2 as B,
RESULT as A)
Eg R add A+B as C means R JOIN (PLUS rename ARG1 as A, ARG2 as B,
RESULT as C)

philip


Reply With Quote
  #9  
Old   
paul c
 
Posts: n/a

Default Re: a union is always a join! - 01-30-2009 , 02:18 PM



compdb (AT) hotmail (DOT) com wrote:
Quote:
On Jan 8, 11:21 am, rp... (AT) pcwin518 (DOT) campus.tue.nl (rpost) wrote:
....
I would
venture that the first thing any implementer will attempt to do
is rewrite all of Darwen's expressions back to relational algebra as
far as possible, and throw errors on the remainder.

You use "venture", "attempt" and "as far as possible" but the
semantics are clear.
...
Perhaps a neophyte would try that. If he did it a second time though, one would have to question his aptitude for such work, for that matter, any kind of mental work. Apart from predicting and validating results, the chief use of an underlying algebra to define a language is as a means to prove logical optimizations. (I have not seen any dbms that has the kind of exact and formal definitions as Tutorial D has, but even that could go further when it comes to constraints. Most dbms's seem to rely on informal definitions, if any.)


Reply With Quote
  #10  
Old   
rpost
 
Posts: n/a

Default Re: a union is always a join! - 02-18-2009 , 01:14 PM



philip wrote:

Quote:
Reinier,

There are fundamentals of Date& Darwen's "algebra" A that you don't
understand.
I understood the definitions (thanks for repeating them though)
but I did make a major mistake regarding OR.

Quote:
Definitions:
AND is JOIN.
NOT R has exactly the tuples typed by R that are not in R.
Note that 'the tuples typed by R' is a ambiguous.

I was going by Darwen's appendix "A New Relational Algebra",
which doesn't mention types. Therefore I was tempted to read
'the tuples typed by R' as the join of the single-attribute
projections of R. Let's call this operation RELDOM(R),
and define RELNOT(R) = RELDOM(R) MINUS R. RELDOM can
(obviously) be expressed in Codd's algebra, albeit not
with a fixed expression.

However, what you actually mean by 'the tuples typed by R' is
the join of the single-attribute relations with all possible
values for types of the attribute. This is also an operation,
let's call it DOM. Then we can define NOT(R) = DOM(R) MINUS R.
This 'absolute' NOT generally introduces domain values
that weren't in R, and when the domains are big or infinite,
produces very big / infinite relations. It's this NOT that
I have problems with.

Quote:
R OR Q has exactly those tuples typed by R JOIN Q which
give a tuple from R when projected on R's attributes or give a tuple
from Q when projected on Q's attributes.
Yes; and we can again introduce a relative variant RELOR
by using RELDOM(R JOIN Q) instead of DOM(R JOIN Q).

Quote:
Darwen's AND, OR and NOT are more general operations than
those of the relational algebra, but they are not any more precise.

Agree.

And the price he pays for allowing them is worse than computational:
the tuples of the relations we create with them can no longer all
be regarded as expressing positive facts.

Not so.
Queries are interpreted exactly the same as always.
Tuples that are present are true and those that are not present are
false.
Yes, but what I meant to say is that in general, the tuples
don't really express facts regarding those domain values,
they just help express information about other domain values.

But I wasn't thinking straight when I wrote that (as paul c
suggested): this is true for *any* operation with negation,
e.g. RELOR or MINUS. It is only more conspicuous in NOT and OR,
because they introduce domain values not in the original relation(s).

Quote:
We *lose* preciseness.

Not so.
OK.

Quote:
Darwen's algebra is[...]
*not* implementation oriented, because NOT and OR cannot actually
be implemented as they are (try NOT-ing the table {A: 1, B: 1, C: 1},
where A, B, C can take arbitrary floating point values).

Of course they can be implemented.
It's true that they denote large relations, but that's not necessarily
a problem.
It complicates the implementation: it is no longer possible to represent
all relations as finite collections of tuples.

Quote:
First, you could allow only queries in NOT and OR that can be recast
using MINUS and UNION.
Second, you could allow non-query expressions but not queries in NOT
and OR.
Yes, that was my suggestion: rewrite to Codd's algebra
and reject the rest. Another idea is, of course, to use
RELNOT and RELOR instead.

Quote:
Third, the cost of a query in NOT or OR is not necessarily
unacceptable.
It depends on the size of the attribute types rather than the number
of tuples in the relation,
Definitely. For Booleans or enums it's acceptable
to just complement literally.

Quote:
but you could allow them and and still compute them (eg if the result
is small enough) (or in a lazy evaluation context).
Lazy evaluation doesn't help for NOT. I think you want to be smarter
anyway. But I notice there are several implementations already and I
haven't got a clue what they do.

Quote:
The benefit is that NOT and OR can give clearer queries even if you
limit yourself to recastable queries
(they roughly mean "not" and "or" in the sense that
roughly JOIN means "and", MINUS means (limited) "and not" and UNION
means (limited) "or").
Perhaps. I always liked how Codd's algebra separates 'horizontal'
operations (PROJECT, JOIN) from 'vertical' ones (MINUS, UNION,
INTERSECTION, quotient, selection). When I look at OR it doesn't
look like an operation I would use. But this may just be a matter
of getting used to it.

Quote:
I would
venture that the first thing any implementer will attempt to do
is rewrite all of Darwen's expressions back to relational algebra as
far as possible, and throw errors on the remainder.

You use "venture", "attempt" and "as far as possible" but the
semantics are clear.
True.

[...]

Quote:
You can either think of PLUS as a named relation value or as a
variable that doesn't happen to change.
Otherwise everything is as usual.
Mathematically, yes. But the implementation has to treat PLUS,
and mathematical operations in general, differently from the
finite, extensionally defined, time-varying relations that
constitute the actual information stored in a relational database.

[...]

Quote:
In your informal terms, PLUS is information about the world; that info
just doesn't happen to change.
No, it is not. The implementation can, and had better, take advantage
of the invariability of PLUS, and the typical computer hardware's
buil-in support for it. You really really don't want to implement it
as table lookup. Even for e.g. Boolean operators you probably want to
optimize differently than you they will be optimized when implemented
as table lookup.

Quote:
Nice as an academic exercise (look, ma, I removed a concept)
but not as an academic result (no useful purpose).

It is practical and not an exercise to simplify and generalize.
The basic consequence is that the difference between operators and
relations becomes only syntactic.
To the mathematician; not to the computer scientist.

Quote:
You can use a relation wherever you can use an operator and vice versa.
This can be extended to include user-defined operations for which, again,
the complexity for recastable queries is the same as the recast query,
and you need a policy for non-recastable queries (and further policies
for side-effects).
If the D&D presentation were more formal and complete this would be
more evident.
It is evident already, but I think this policy stuff is hard.

Quote:
BTW, using NOT and OR is equivalent to using MINUS and UNION along
with a named single-attribute relation value
for each type each containing all that type's values.
Yes (DOM above).

Quote:
Such a database has the same operation complexity as usual but some of
the leaves (the type-relations) are big.
So the cost of evaluating corresponding queries is the same.
Again, it is notationally more elegant and exactly as computationally
expensive
to allow the latter with the same recast-restrictions you would apply
to the former.
As I already said in my previous message, I see the elegance,
but at the same time I find elegance in Codd's use of two kinds
of operations. Obviously if you rewrite to Codd's algebra (or do
something equivalent to that) you're not going to lose efficiency
on the cases where this is possible. But just rejecting the rest
is only going to confuse the user. Why not be honest and admit
that we really expect, in our implementations, relational queries
to work on finite, typically small relations, that WORKS_IN is one
and PLUS is not, and that NOT doesn't do so?

Quote:
philip
--
Reinier


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.