dbTalk Databases Forums  

Can relational alegbra perform bulk operations?

comp.databases.theory comp.databases.theory


Discuss Can relational alegbra perform bulk operations? in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Banana
 
Posts: n/a

Default Can relational alegbra perform bulk operations? - 09-29-2009 , 01:23 PM






I'm new to relational theory, having read a C.J. Date's book but I worry
that I may have picked up mistaken impression about the relational
theory and thus want to validate whether my understanding is accurate or
not.

Prior to reading the book, I've always had understood that anything we
did in realm of relational algebra (or calculus) would be set-oriented
and to lesser extent sames applies to the SQL implementation, at least
in the theory (as not all vendors necessarily are consistent in the
implementation). Given that any kind of update operation is logically
all at once, I was quite concerned with the manners of updating itself.

Given two relvars, r and s, containing some numbers of attributes,
sharing a common attribute (or in more general parlance, s relvar
contains a foreign key to relvar r's candidate key), and we wish to
evalulate an expression upon the relvar r and s. Assume the evaluate
will include restrict, project and join, though we need not restrict to
only those three operators.

Relational algebra, like ordinary algebra, can be employed in help us
re-formulate the expression into a even simpler expression, with the net
result that we can reduce the numbers of tuples to be evaluated as we
process the expression. This can be done by examining the operators'
properties such as joins having the commutative property and thus
choosing the smaller relvar and evaluate its tuples against the bigger
relvar's tuples. So far, so good. We can see how much relational
algebra/calculus can help us optimize any kind of queries by
transforming the expression.

But... I don't see any means within the relational algebra that provides
a way of evaluating multiple tuples in one go. Considering that a
relation is essentially a collection of propositions conforming to a
given predicate, it seems necessary to evaluate each tuples to determine
whether they should participate in the join or not, satisfy the restrict
condition and other things. For a lack of better terms, there is no
"sieve" we can employ to evaluate a set of tuples in one go.

Is the preceding paragraph accurate?

TIA.

Reply With Quote
  #2  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 02:25 PM






On Sep 29, 11:23*am, Banana <Ban... (AT) Republic (DOT) com> wrote:
Quote:
I'm new to relational theory, having read a C.J. Date's book but I worry
that I may have picked up mistaken impression about the relational
theory and thus want to validate whether my understanding is accurate or
not.

Prior to reading the book, I've always had understood that anything we
... We can see how much relational
algebra/calculus can help us optimize any kind of queries by
transforming the expression.
This is not really the case. All these transformations are ad-hock. If
you try proving, say, "pushing selection through projection" you would
get bogged in set notation with awkward attribute indexing, and even
if succeed actually proving it, it would hardly result in any
insight.

Quote:
But... I don't see any means within the relational algebra that provides
a way of evaluating multiple tuples in one go. Considering that a
relation is essentially a collection of propositions conforming to a
given predicate, it seems necessary to evaluate each tuples to determine
whether they should participate in the join or not, satisfy the restrict
condition and other things. For a lack of better terms, there is no
"sieve" we can employ to evaluate a set of tuples in one go.

Is the preceding paragraph accurate?
It is too vague. Algebras, in general, don't care about the structure
of their elements; relational algebra, in particular, doesn't refer to
tuples. (BTW the fact that it refers to attributes is a flaw -- c.d.t
regulars might have already guessed I'm about to sing the familiar
relational lattice tune:-)

Reply With Quote
  #3  
Old   
Bob Badour
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 02:26 PM



Banana wrote:

Quote:
I'm new to relational theory, having read a C.J. Date's book but I worry
that I may have picked up mistaken impression about the relational
theory and thus want to validate whether my understanding is accurate or
not.

Prior to reading the book, I've always had understood that anything we
did in realm of relational algebra (or calculus) would be set-oriented
and to lesser extent sames applies to the SQL implementation, at least
in the theory (as not all vendors necessarily are consistent in the
implementation). Given that any kind of update operation is logically
all at once, I was quite concerned with the manners of updating itself.

Given two relvars, r and s, containing some numbers of attributes,
sharing a common attribute (or in more general parlance, s relvar
contains a foreign key to relvar r's candidate key), and we wish to
evalulate an expression upon the relvar r and s. Assume the evaluate
will include restrict, project and join, though we need not restrict to
only those three operators.

Relational algebra, like ordinary algebra, can be employed in help us
re-formulate the expression into a even simpler expression, with the net
result that we can reduce the numbers of tuples to be evaluated as we
process the expression. This can be done by examining the operators'
properties such as joins having the commutative property and thus
choosing the smaller relvar and evaluate its tuples against the bigger
relvar's tuples. So far, so good. We can see how much relational
algebra/calculus can help us optimize any kind of queries by
transforming the expression.

But... I don't see any means within the relational algebra that provides
a way of evaluating multiple tuples in one go. Considering that a
relation is essentially a collection of propositions conforming to a
given predicate, it seems necessary to evaluate each tuples to determine
whether they should participate in the join or not, satisfy the restrict
condition and other things. For a lack of better terms, there is no
"sieve" we can employ to evaluate a set of tuples in one go.

Is the preceding paragraph accurate?

TIA.
What you appear to be looking for is a physical structure not a logical
one, which in SQL one would build with the "CREATE INDEX" statement.

Reply With Quote
  #4  
Old   
Banana
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 02:50 PM



Tegiri Nenashi wrote:
Quote:
This is not really the case. All these transformations are ad-hock. If
you try proving, say, "pushing selection through projection" you would
get bogged in set notation with awkward attribute indexing, and even
if succeed actually proving it, it would hardly result in any
insight.
I suppose it's true that we can transform an expression into anything
else and could conceivably add as many zeros for additive/subtractive
operations or ones for multiplication/division operations and still have
logically equivalent expressions, but such expansion would hardly be
interesting in absence of further transformation.

Even so, wouldn't it be fair to say that in order to optimize any given
query upon relation, we would want to find the most simple (or maybe
it's accurate to say 'concise') expression that is computationally
simple to execute compared to other but equivalent expressions?


Quote:
It is too vague. Algebras, in general, don't care about the structure
of their elements; relational algebra, in particular, doesn't refer to
tuples.
Then I obviously don't have a full understanding of the theory as I
would like to think.

However, could I ask you to explain the statement that tuples aren't
referred at all in relational algebra?

My objective is to understand whether operators exists that enables us
to perform evaluations on a set of propositions as whole rather than
having to verify each proposition.

Back to the r and s example, if I were to join them based on an
attribute x, I would have to look at all possible values in attribute x
of the relvar r then go to s and look up each tuples in relvar s and
verify whether its attribute x has the same value and thus 'select' the
tuple into the resultant output. I've just described an iterative manner
of evaluating what tuples may participate in a join and that's precisely
concerns me; was that actually necessary or does there exist an operator
to process them in bulk?

Or maybe to attempt an analogy, when we multiply two numbers, we need
not add each number to itself as many times as specified by other
operand. It's valid but not the only way to determine the product; we
certainly can determine product by recalling the product from
multiplication table and even for an arbitrarily large operands, use the
same algorithm to quickly arrive at a product.. Does such concept exist
for operating upon relations in like manner?

I do apologize in advance if I'm being vague or imprecise; as I said I'm
quite new and trying to grasp the pictures, though I must admit that I'm
also looking at the whole thing through the lens of practicality; how
does it help us and how much can it help us? It obviously helps on
providing logical design, but I'm looking at the optimizations that it
may have to offer to whatever implementation.

Again, thank you.

Reply With Quote
  #5  
Old   
Banana
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 02:55 PM



Bob Badour wrote:
Quote:
What you appear to be looking for is a physical structure not a logical
one, which in SQL one would build with the "CREATE INDEX" statement.
Well, I do apologize if I provided the impression that I was looking for
the physical structure or implementation. In fact, what I'm trying to
visualize the practicality relational theory has for any given
implementation, and I can obviously see how it applies in logical design
(or at least I hope I did understand it correctly), however, the same
book I mentioned in my OP did mention (too briefly, alas) how relational
theory can help optimize queries, which I tried to illustrate in my OP
citing the joins' commutative property to transform an expression into a
equivalent expression that may be computationally simpler to resolve.

With that said, I am trying to understand whether it provides us with a
mechanism of evaluating relations as a set without requiring iterations
as I attempted to explain in my other post to Tegiri.

I hope this helps clarifies my quest toward a more solid understanding
of the theory.

Reply With Quote
  #6  
Old   
paul c
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 03:01 PM



Tegiri Nenashi wrote:
....
Quote:
It is too vague. Algebras, in general, don't care about the structure
of their elements; relational algebra, in particular, doesn't refer to
tuples. ...
It might be clearer to say that while the algebra's ops don't refer to
tuples, the sets involved certainly do contain tuples and as far as I
know, there are no relational ops that can be defined without reference
to tuples.

Reply With Quote
  #7  
Old   
Clifford Heath
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 03:48 PM



Banana wrote:
Quote:
Relational algebra, like ordinary algebra, can be employed in help us
re-formulate the expression into a even simpler expression...
But... I don't see any means within the relational algebra that provides
a way of evaluating multiple tuples in one go.
The theory allows the query to be reformulated without changing its meaning,
but it cannot help estimate the cost of executing one or the other. A simpler
expression may in fact cost more to evaluate.

Primitive RDBMS used heuristics to apply transformations to expressions in
the hope of reducing execution cost. In the mid 80's, most RDBMS ditched
these heuristic optimisers and built cost-based ones. These estimate the
execution cost by using statistics (row counts, index selectivity, ordering
and clustering of indices, etc), though some, like MySQL, still use
heuristics (these are still classed as primitive).

The change was triggered in part at least by the work of Ken J McDonell
on the MUSBUS benchmarking tools, who was able to point out the glaring
differences in actual behaviour of the RDBMS he benchmarked.

--
Clifford Heath, Data Constellation, http://dataconstellation.com
Agile Information Management and Design

Reply With Quote
  #8  
Old   
Banana
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 04:03 PM



Clifford Heath wrote:
Quote:
The theory allows the query to be reformulated without changing its
meaning, but it cannot help estimate the cost of executing one or the other. A
simpler expression may in fact cost more to evaluate.
Fascinating. It sounds like C.J. Date may have been too optimistic in
asserting that relational theory is sufficient in optimizing the
evaluation of a given expression. (well, he didn't say it can do so
exclusively, but my puzzlement definitely dealt with the 'if relational
theory is supposed to help with this, then why has SQL strayed so far
from the relational theory as understood in mathematics.')

On an aside, though, that didn't quite answer my intended questions of
whether theory provides a means of evaluating in bulk without requiring
one to evaluate tuples individually, as I attempted to explain/ask in my
other posts. But I suspect even if the answer was a yes, the theory has
little to say about how one can optimize such queries.

Quote:
The change was triggered in part at least by the work of Ken J McDonell
on the MUSBUS benchmarking tools, who was able to point out the glaring
differences in actual behaviour of the RDBMS he benchmarked.
Thank you. I may look his works up and learn a bit about that. It does
provide a hint (?) toward why the actual implementations is quite
divorced from the relational theory.

Reply With Quote
  #9  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 04:49 PM



On Sep 29, 12:50*pm, Banana <Ban... (AT) Republic (DOT) com> wrote:
Quote:
I suppose it's true that we can transform an expression into anything
else and could conceivably add as many zeros for additive/subtractive
operations or ones for multiplication/division operations and still have
logically equivalent expressions, but such expansion would hardly be
interesting in absence of further transformation.
I'm not following your "add zeroes" idea, but if you are saying that
only those transformations that lead to "shorter" relational
expressions make sense from optimization perspective, than this is not
true. E.g. "magic" transformation.

Quote:
Even so, wouldn't it be fair to say that in order to optimize any given
query upon relation, we would want to find the most simple (or maybe
it's accurate to say 'concise') expression that is computationally
simple to execute compared to other but equivalent expressions?
Are you familiar with cost based optimization? They basically throw in
multiple equivalent expression and estimate cost of running each
apriori.

Quote:
However, could I ask you to explain the statement that tuples aren't
referred at all in relational algebra?
Simple. Take projection for example. It contains the pi symbol
together with attributes names. Ditto selection and the others.

Quote:
Back to the r and s example, if I were to join them based on an
attribute x, I would have to look at all possible values in attribute x
of the relvar r then go to s and look up each tuples in relvar s and
verify whether its attribute x has the same value and thus 'select' the
tuple into the resultant output. I've just described an iterative manner
of evaluating what tuples may participate in a join and that's precisely
concerns me; was that actually necessary or does there exist an operator
to process them in bulk?
That is called nested loops join. It is implementation; algebraically
(natural) join is idempotent, symmetric, and associative operation.

Quote:
Or maybe to attempt an analogy, when we multiply two numbers, we need
not add each number to itself as many times as specified by other
operand. It's valid but not the only way to determine the product; we
certainly can determine product by recalling the product from
multiplication table and even for an arbitrarily large operands, use the
same algorithm to quickly arrive at a product.. Does such concept exist
for operating upon relations in like manner?
In this respect Bob's answer was spot on: while nested loops is quite
inefficient operator, the indexed nested loops isn't.

Reply With Quote
  #10  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Can relational alegbra perform bulk operations? - 09-29-2009 , 04:54 PM



On Sep 29, 1:01*pm, paul c <toledobythe... (AT) oohay (DOT) ac> wrote:
Quote:
Tegiri Nenashi wrote:

...

It is too vague. Algebras, in general, don't care about the structure
of their elements; relational algebra, in particular, doesn't refer to
tuples. ...

It might be clearer to say that while the algebra's ops don't refer to
tuples, the sets involved certainly do contain tuples and as far as I
know, there are no relational ops that can be defined without reference
to tuples.
But this is like saying that one can't define summation and
multiplication in a ring with abstract elements. You wouldn't insist
that any advanced algebra book had a long arithmetic introduction with
tedious instructions how one adds and multiplies numbers (in their
decimal representation)?

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.