![]() | |
![]() |
| | Thread Tools | Display Modes |
#31
| |||
| |||
|
|
In fact, I wonder what the hell a 'theory of implementation' would look like. *I'm tempted to suggest it would look like a 'theory of brick masoning', or a 'theory of mixing flour and water for bread baking'. *Sounds odd, I must say. Personally, I wander why would relational theorist consider that they |
#32
| |||
| |||
|
|
What is the object of optimization in a relational perspective ? What does it aim at, physically wise ? What are the modalities to defined to achieve such aim ? Don't you have the impression that the definition of*optimization* in such broad perspective is the relational theorists's way sweep the dust under the rug ? |
#33
| |||
| |||
|
|
That could be implemented directly. |
#34
| |||
| |||
|
|
As years pass by and database science goes further and further into collective IT industry amnesia, I somehow believe*old timers* (note that I do consider myself an old timer) have simply failed to explore the opportunities openned by Codd, and remained, implementation wise, too bound to the IBM historical context of how relational implementation is to be conceived. Such lack of perpective limited RM expressive power and constrained it(unhappy pun!) to empirism rather than applying it to how we conceive information encoding as a whole. |
#35
| |||
| |||
|
|
When a constraint applies to a relation (or maybe I should say when a relation satisfies a constraint), how and when can we apply the constraint solely to a restriction or projection of the relation and know that the constraint is still satisfied by the relation? * Precisely. And whatelse, appart from a *mathematical* computing model |
|
There are lots of extant adhoc re-writings used by various 'optimizers', some of which are have more formal underpinnings than others, such as the re-writing of a disjunctive query into a conjunctive form. Traditionally, both human and machine 'optimizers' have used metrics which I would classify as fairly crude, akin to what engineers call 'first approximations', the most common one being the counting of physical I/O's. Precisely.* |
|
A typical 'recipe' first looks for a few 'ingredients' such as keys followed by physical indexes and some will compare alternate indexes before 'deciding'. *Not many will eliminate redundancy in a query or constraint. *Optimization progress seems to involve whittling away at special cases. *I suspect the re-writing usually consists of re-writing queries, not the relations being queried. *Just musing out loud.- Masquer le texte des messages précédents - Recipes. That's is exactly where we are at, regarding implementation: |
#36
| ||||
| ||||
|
|
On 18/10/2010 2:31 AM, Cimode wrote: What is the object of optimization in a relational perspective ? What does it aim at, physically wise ? *What are the modalities to defined to achieve such aim ? Don't you have the impression that the definition of*optimization* *in such broad perspective is the relational theorists's way sweep the dust under the rug ? I just mean that the optimizations typically used are adhoc. *In algebra, we could write a 'reference' constraint, a more general form of 'foreign key' as B{F} = B{F} <AND> A{K}. *That could be implemented directly. Sure but what tells you that doing that such implementation is |
|
It must be a very desireable optimization to minimize physical work even if that work is only the movement of electrons in the face of the equally desireable ability to 'update', say when a relvar is assigned. When faced with B' := B <OR> I, where the apostrophe refers to a relvar B after an 'insert', it's fairly obvious that the reference constraint can be tested against the 'before' value and only the tuples 'I' need to be tested (as opposed to all the tuples in B'). *This strikes me as both a logical and physical optimization. Actually, I believe it is the way around. We should not start from |
|
Being somewhat inept at relational calculus I don't have a ready example for that. *I know that Codd expected his calculus to be more easily optimizable but I think that would still result in a similar situation. Exactly, but Codd somehow suggested that the chances for a physical IO |
|
This may have been written about and I just haven't seen it, but there seems to be some missing implementation theory, to do with axioms, if you will, for analyzing such expressions and recognizing when the set doesn't need to be tested, ie., when only one term in an algebraic expression needs to be tested against a constraint. (The reference constraint becomes quite convoluted when a strict foreign key constraint is desired and there are accordingly more ways to write it, but an optimization 'theory' would need to handle that too. *A strict algebra won't have any keywords such as 'KEY'. *A comprehensive optimization theory (eg. axioms?) for algebra expressions doesn't exist as far as I know. *Even though Date points out a number of useful optimizations in his Intro book, as I recall he doesn't get down to brass tacks with the perhaps most elementary one, candidate keys.) Foreign KEY is a already a complex case. My point is that we lack the |
#37
| |||
| |||
|
|
On 18/10/2010 2:31 AM, Cimode wrote: As years pass by and database science goes further and further into collective IT industry amnesia, I somehow believe*old timers* (note that I do consider myself an old timer) have simply failed to explore the opportunities openned by Codd, and remained, implementation wise, too bound to the IBM historical context of how relational implementation is to be conceived. Such lack of perpective limited RM expressive power and constrained it(unhappy pun!) to empirism rather than applying it to how we conceive information encoding as a whole. I remember a lot of amnesia! I posed a question I was curious about even though it is probably pointless today, to the folklore usenet group, about the origin of the term 'field' as applied to file systems. |
|
It might have been an unfair question since that group is mostly interested in the cultural history of hardware and operating systems as opposed to all software, but I thought it was more apt for that group than this one. There are several mathematical definitions for the term |
|
'field' but the term also dates back to the 1940's use of Hollerith punch cards and possibly much earlier for all I know (I remember seeing punched cards that had pre-printed 'nested fields', eg., telephone number might have sub-fields of area code and number. Anyway, it may be because I wrote the question badly, but so the reaction of that group seems to be that there was no mathematical inspiration behind the term. I had guessed that possibly the originators of Fortran might have thought it had special meaning being that the early Fortran files often consisted solely of numbers of different kinds and there is a definition of a 'closed field' of numbers that some mathematicians use. But logicians at one time also used the term to describe the union of what they called 'converse domains'. Maybe neither influenced the computer use of the term but so far nobody seems certain of that. |
#38
| |||
| |||
|
|
In the old days, a DISTINCT or GROUP BY in an SQL engine were done by a sort. Back then we built early SQLs on top of existing file systems and we had pretty good sort procedures in the library. How many kids today have ever seen a polyphase merge sort on tape drives? |
|
But today with parallel hardware and good hashing algorithms, would it be faster to use hashing to cluster equivalent classes of data together? |
|
Does there exist a set of hashing functions, H1(), H2(), .., Hn() which will produce at least one different result for any pair of data values? |
#39
| |||
| |||
|
|
Simulated that on disk, been there. You have to be rather keen on the "tape" labels unless you were fancy enough to well-errorcheck in code. |
![]() |
| Thread Tools | |
| Display Modes | |
| |