dbTalk Databases Forums  

does a table always need a PK?

comp.databases.theory


#151
Bob Badour
 
Re: Temporal operations - 09-06-2003, 07:54 PM

"Leandro Guimarães Faria Corsetti Dutra" <lgcdutra (AT) terra (DOT) com.br> wrote in
message newsan.2003.09.06.21.51.18.272744 (AT) terra (DOT) com.br...
Quote:
On Sat, 06 Sep 2003 13:53:39 -0700, Paul G. Brown wrote:

No reference to user-defined types

When the relational model says domains, user-defined types are implied.

After all, you don't expect the DBMS to ship with all the domains you
may ever need?
Perhaps he expects users to make do with the only data type explicitly
required by Codd, i.e. boolean.


Quote:
Bob: it would appear either that you are a little lacking in
'intellectual honesty', or that you continue to struggle with basic
written comprehension.

Beware, your description of Bob looks like you're projecting yourself.
There you go spoiling my blissful ignorance again. Oh well, half a bliss is
better than no bliss at all.


Quote:
BTW, D&D in the rest of their book promptly *extend* the model with
domain inheritance (not a concept from mathematical logic and, as D&D
themselves say, 'independent of' the relational model)

Again you are taking things out of context... the inheritance is
orthogonal and optional, it has even been taken out of D4 v2 with D&D's
blessing.
I don't know whether he takes things out of context so much as he
demonstrates a profound incomprehension of "independent of" as in
"orthogonal to", which seems ironic while citing mathematical logic. He says
the equivalent to "extending the X-axis with the Y-axis". The Y-axis does
not extend the X-axis. It adds a whole new dimension while leaving the
original intact.


Quote:
*correct* the model by introducing a distinction between relations and an
entirely new thing called a relvar

What is the problem in making clear a distinction that was already
implicit? Or don't you think the distinction between value and variable
is worth making?
I can understand an objection to proliferating jargon. Our industry has far
too much of that already, but I find this specific jargon useful because the
symbols "relation variable" and "relation value" look too similar. Relation
and relvar are shorter, clearer symbols when written and when spoken.
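
For readers who think in code, the value/variable distinction under discussion can be sketched in Python. This is only an illustration under stated assumptions: the class name `Relvar`, its `assign` method and the sample tuples are hypothetical, and nothing here is taken from TTM or Tutorial D.

```python
from typing import FrozenSet, Tuple

# A relation value: an immutable set of tuples. It never changes.
Relation = FrozenSet[Tuple[int, str]]


class Relvar:
    """Hypothetical relation variable: a holder whose current value is a relation."""

    def __init__(self, value: Relation) -> None:
        self.value = value

    def assign(self, new_value: Relation) -> None:
        # Updating a relvar means replacing its value with another relation value.
        self.value = new_value


before: Relation = frozenset({(1, "bolt")})
parts = Relvar(before)

# An "insert" is shorthand for assigning a new value to the variable.
parts.assign(parts.value | {(2, "nut")})
```

After the assignment, `before` still denotes the one-tuple relation, while `parts` now holds a different two-tuple relation. The variable changed; neither value did.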


Quote:
But to broaden the theme: why is it illegitimate to 'extend' or 'pervert'
the relational model with 3VL but OK to add 'domain inheritance' (OO
Prescriptions 2 & 3)?

3VL is easily the most contentious point in the relational field,
I wouldn't spend much arguing this in the current loaded context.
It is perfectly legitimate for one to create a 3VL domain and provide
explicit operations that evaluate to boolean: IsKnown, IsUnknown,
IsKnownTrue, IsKnownFalse etc. It is not legitimate to pollute the logical
data model with redundant structural elements like NULL or references.
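
A minimal sketch of such a domain, assuming Python as the host language. The type name `Truth3`, the function `and3` and the snake_case predicate names are made up for illustration; the point is only that every predicate returns an ordinary two-valued boolean while the three-valued logic stays inside the domain.

```python
from enum import Enum


class Truth3(Enum):
    """A hypothetical user-defined three-valued domain."""
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Each predicate below evaluates to a plain two-valued bool.
def is_known(v: Truth3) -> bool:
    return v is not Truth3.UNKNOWN


def is_unknown(v: Truth3) -> bool:
    return v is Truth3.UNKNOWN


def is_known_true(v: Truth3) -> bool:
    return v is Truth3.TRUE


def is_known_false(v: Truth3) -> bool:
    return v is Truth3.FALSE


def and3(a: Truth3, b: Truth3) -> Truth3:
    """Kleene conjunction: an operation closed over the domain itself."""
    if a is Truth3.FALSE or b is Truth3.FALSE:
        return Truth3.FALSE
    if a is Truth3.TRUE and b is Truth3.TRUE:
        return Truth3.TRUE
    return Truth3.UNKNOWN
```

In this sketch a query predicate would have to call something like `is_known_true(and3(x, y))` explicitly, so the conversion from three values to two is visible rather than implicit.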


Quote:
But, even assuming 3VL to be OK, you still have to acknowledge it
is fundamental if the system uses 3 or 2VL; while domain inheritance is
just an optional accretion.
Ultimately, 2VL is fundamental even for SQL. SQL just uses inconsistent
implicit rules for converting 3VL to 2VL.
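
One way to see the inconsistency is to compare WHERE with CHECK. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption of convenience; the table `emp` and column `sal` are illustrative names. WHERE keeps a row only when the condition is true, while CHECK rejects a row only when the condition is false, so the same unknown result is treated as false in one place and as acceptable in the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CHECK rejects a row only when the condition is FALSE, so an UNKNOWN result passes.
conn.execute("CREATE TABLE emp (sal INTEGER CHECK (sal > 0))")
conn.execute("INSERT INTO emp (sal) VALUES (NULL)")

# WHERE keeps a row only when the condition is TRUE, so an UNKNOWN result is dropped.
total_rows = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
where_rows = conn.execute("SELECT COUNT(*) FROM emp WHERE sal > 0").fetchone()[0]
not_rows = conn.execute("SELECT COUNT(*) FROM emp WHERE NOT (sal > 0)").fetchone()[0]
conn.close()

print(total_rows, where_rows, not_rows)
```

The row is stored, yet it satisfies neither `sal > 0` nor `NOT (sal > 0)` when queried.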


Quote:
How can you compare both, unless you're letting emotions take over
your thinking?
Maybe, he just doesn't know any better. Sometimes, it is more effective to
just answer the question.


Quote:
Why does a description of a logical model include
two Prescriptions about transactions? (RP# 17 and OO # 5)?
I don't know. I suggest he contact the authors and ask them. If I had to
speculate, I would guess they included RM Prescription #17 because data
management requires managed concurrent update from all logical data models,
and I would guess they included OO Prescription #5 as a requirement for good
language design.


Quote:
Read carefully... you will see that TTM's title is _Foundations for
Future Database Systems_, that is, it is meant not only to set the record
straight on the current state of the relational model but also to describe
a possible implementation. Transactions, or an alternative way of dealing
with concurrency, may not be part of the relational model but are
essential to a database system... their inclusion, you may notice, is not
about prescribing transactions per se but about preventing perversions of
transactions that are frequently claimed for by naïve programmers.


And if (as they may) D&D remove transactions from their view of the model
does this constitute a 'correction'? Using words like 'pervert' amps up the
rhetoric but adds nothing of substance to the argument.

Again, if a system will support transactions or simply simultaneous
operations (D&D's new proposition) is hardly as essential as making the
distinction between values and variables, domains and objects, and logical
and physical...
It might constitute an evolutionary step in our understanding of the
relational model. It might indicate a better understanding of how to meet
the more fundamental requirement of managed concurrent update. Or they might
just decide to keep it in. After all, it is one thing to discourage
procedural specifications for complex data changes, and it is quite another
thing to make it impossible to safely specify them.


Quote:
Leaving aside for a moment the point that if the U_ operators are
shorthand for simpler operators why go to the trouble

Ooops... I won't go into details now for lack of the book by my side
now, but it seems you are misrepresenting stuff here.
Because the longhand is cumbersome and error-prone. I assume you cut out part
of his nonsense.

[further partial excerpts snipped]

Quote:
I would also do away with domain operators, domain
inheritanc (the whole ghastly edifice of the POSREP)

Ops... again, POSREPs have little to do with domain inheritance...
they certainly don't depend on inheritance, and are useful in its absence.
What do you have against them?
I wonder what Brown wants to do with data types if they have no operations.
Obviously, Brown does not value data independence, but he is a mindless
vendor after all so I don't really expect much.


Quote:
[1] Codd, E. F. "The Relational Model for Database Management: Version 2"
Addison-Wesley. 1990.

(Note that this work pre-dates the publication of the Manifesto by a whole
four years, pre-empts many of TTM's novelties (though not all)

Which ones?

BTW, TTM is not about novelties. It is about reaffirming the RM
as a solid foundation against the OO Newspeak, and applying it to the
needs that supposedly gave birth to OODBs.
Well said. Their whole exercise started as an exploration of the features of
object orientation that might improve the relational model and ended by
observing that object orientation had nothing to offer.




#152
Paul G. Brown

Re: Temporal operations - 09-06-2003, 10:01 PM






Leandro Guimarães Faria Corsetti Dutra <lgcdutra (AT) terra (DOT) com.br> wrote

Quote:
On Sat, 06 Sep 2003 13:53:39 -0700, Paul G. Brown wrote:

No reference to user-defined types

When the relational model says domains, user-defined types are implied.
Really? Where was that ever said anywhere *except* TTM? (I know where,
Leandro - do you? And I know where it was said 10 years prior.)

Are you aware of the distinction between user-defined types and abstract
data types (ADTs)? Domains (as defined mathematically) don't do inheritance,
don't do identity (as ADTs can do), and lack a parameterized flavor (an
array, vector or bag of values might be a concept represented using an
ADT but not a domain.)

A domain is a domain is a domain. A user-defined type, or an ADT, is an
implementation specific approximation of the domain concept. Want to talk
about domains, talk about domains. Want to talk about types, talk about
types. They are not equivalent.

Quote:
After all, you don't expect the DBMS to ship with all the domains you
may ever need?
Ye gods! Why don't you DBLP my name, Leandro? Or simply search Google
Groups for my thoughts on this topic.

Quote:
Bob: it would appear either that you are a little lacking in
'intellectual honesty', or that you continue to struggle with basic
written comprehension.

Beware, your description of Bob looks like you're projecting yourself.
Why? Have I accused anyone of taking anything out of context? Further,
have I ever done so when the statement clearly *was not* taken out of
context? Have I ever avoided arguments by responding with ad hominem?

Quote:
BTW, D&D in the rest of their book promptly *extend* the model with
domain inheritance (not a concept from mathematical logic and, as D&D
themselves say, 'independent of' the relational model)

Again you are taking things out of context... the inheritance is
orthogonal and optional, it has even been taken out of D4 v2 with D&D's
blessing.
So! The model has been 'corrected'? But they said it needed no
'correction'! What's the next 'correction'? Bags instead of sets?
Maybe NULLs are OK after all?

Quote:
*correct* the model by introducing a distinction between relations and an
entirely new thing called a relvar

What is the problem in making clear a distinction that was already
implicit? Or don't you think the distinction between value and variable
is worth making?
What distinction? No one has *ever*, in mathematical logic or in other
discussions of the relational model, introduced this distinction. It
was not there before, either implicit or explicit, anywhere else. It
is entirely novel.

And not particularly useful. Indeed, what of relations which are
*constants*? In mathematical logic, the relation
'greaterthanorequal< integer, integer >' is not a variable, nor is
it a value. It simply is. D&D's novel distinction is useful for
implementors because it allows us to reason about update queries
with respect to the transaction model. But it is utterly superfluous,
and frankly limiting, wrt the logical model.

Quote:
subsume the core of the relational
model (33 prescriptions) within an edifice that includes 26 other
inessential rules.

If something is a suggestion, how can it subsume what is prescripted?
There are prescriptions, proscriptions and strong suggestions. Formal
systems begin with a set of axioms, not 'strong suggestions'. Strong
suggestions are for systems implementers and marketing types. It is not
'a very strong suggestion' that 1 + 1 = 2. There are no 'very strong
suggestions' in formal systems. There are lemmas, and derivations.

Yet they have, it might be argued, subsumed the principles of their
Relational Model within layers of marketing and implementation cruft.

Quote:
But to broaden the theme: why is it illegitimate to 'extend' or 'pervert'
the relational model with 3VL but OK to add 'domain inheritance' (OO
Prescriptions 2 & 3)?

3VL is easily the most contentious point in the relational field,
I wouldn't spend much arguing this in the current loaded context.

But, even assuming 3VL to be OK, you still have to acknowledge it
is fundamental if the system uses 3 or 2VL; while domain inheritance is
just an optional accretion.

How can you compare both, unless you're letting emotions take over
your thinking?
I'm not the one who initiated the 'emotional' here. Let's see: their
word was 'perversion'. That's a rhetorical device, designed to appeal to the
emotion. Certain folk here throw around words like 'moron', 'ignorami' and
so on, with flair. I don't.

And in the entire TTM I don't see a single lemma, a single formal proof,
or any of the notational systems I find in PODS proceedings, or books
by other writers. That isn't a criticism of TTM, because its clarity
is quite remarkable. But it is an observation for those of you who want
to hold the book up as a paragon of rigor. It ain't.

Quote:
Why does a description of a logical model include
two Prescriptions about transactions? (RP# 17 and OO # 5)?

Read carefully... you will see that TTM's title is _Foundations for
Future Database Systems_, that is, it is meant not only to set the record
straight on the current state of the relational model but also to describe
a possible implementation. Transactions, or an alternative way of dealing
with concurrency, may not be part of the relational model but are
essential to a database system... their inclusion, you may notice, is not
about prescribing transactions per se but about preventing perversions of
transactions that are frequently claimed for by naïve programmers.
If it's a logical model, why put transactions in at all? If it's about
providing a basis for building systems, why drop it out? Why are D&D
(apparently) considering bouncing the transactions stuff out if it is
"essential to a database system" (your words, not mine)? A quote from
the book to support your argument would be useful. I keep providing
chapter and verse. Intellectual honesty requires it.

Quote:
And if (as
they may) D&D remove transactions from their view of the model does
this constitute a 'correction'? Using words like 'pervert' amps up the
rhetoric but adds nothing of substance to the argument.

Again, if a system will support transactions or simply simultaneous
operations (D&D's new proposition) is hardly as essential as making the
distinction between values and variables, domains and objects, and logical
and physical...
I've yet to see a formal definition of "simultaneous operations". I've
seen plenty about transactions. But if what D&D have to say about the
subject when they talk about nested transactions (OO Prescription #6)
is any guide, then I'm worried. Are TX and TX' isolated with respect to one
another? Are they required to each be independently consistent wrt the
database rules, or is it only required that the overall transaction TX''
(being TX U TX') be consistent? Atomicity and durability are *only two*
of the requirements of a relational assignment operation.

Quote:
Leaving aside for a moment the point that if the U_ operators are
shorthand for simpler operators why go to the trouble

Ooops... I won't go into details now for lack of the book by my side
now, but it seems you are misrepresenting stuff here.
I have the book here. I read the book. I remember the book. I know the
details.

You *cannot* define the U_ operators in terms of the other operators
alone. You must introduce the notion of a compulsory *relation*: that is, a
relation to contain all possible values of the temporal domain.

Now, this is how it's done in theory. And we are talking about logical
models here, right? In practice, you wouldn't do it that way, and
no one with a Temporal SQL implementation does.

But the point stands: DD&L extend the model by introducing a set of
operators which are completely novel in that they *require the presence
of a relation*. Talk about the incoherence principle! Is an operator
an operator, or a relation?

Quote:
There are alternatives. Why not recognize that 'value identity' is not
the same thing as 'value equality' yet 'value identity' fulfills all
the requirements of the other aspects of the model (functional
dependencies, set algebra, basis for proving operator re-ordering
rules, etc). This is how Codd deals with the question in RM/V2[1] (in
RT-1).

Will try to study this tomorrow, luckily have found an (expensive)
used copy. At first sight it seems nonsensical, but then I may be just
naïve...
Parts of it are a bit dodgy. But it was written *before* the
Object/Relational systems were built. D&D came along afterwards and
criticised the hell out of stuff that was already done. Their criticisms
I find, by and large, balanced, correct and convincing.

Quote:
Another question: why is D&D's approach to view updateability
preferred to Codd's (or to Dayal & Bernstein's, or Keller's)?

Perhaps by virtue of its reasonability, clarity, comprehensiveness?
Or perhaps because there is a popular book that describes the ideas?

Quote:
Or can you expand on Codd's advantages?
I'm not saying it has any advantage. I'm only pointing out that the
religious nature of this discourse makes me want to rename the
newsgroup comp.databases.fan.date_and_darwen.

Quote:
Doing so means that we don't need to suddenly add to the model
propositions like 'The following relations *must* exist", and makes the
'Relational Model' capable of capturing a much broader range of
predicates.

Again, can you expand please?
I already have. Read the post where I explain the distinction between a
definition of identity which encompasses isomorphism. The problem I talked
about was trains, tracks and time. A train is on a track for an
interval of time. What is the 'key' to this relation?

Tracks < From_Station, To_Station >

The tuples Tracks < 'Station_A', 'Station_B' > and
Tracks < 'Station_B', 'Station_A' > are indistinguishable. But they are
not 'equal'. To overcome this, you need to write a complex disjunctive
predicate.
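
A rough Python sketch of the Tracks point, under assumptions: the helper names `track_identity` and `same_track_disjunctive` are hypothetical, and modelling identity as an unordered pair of stations is just one possible reading of 'indistinguishable'.

```python
from typing import FrozenSet, Tuple

Track = Tuple[str, str]  # (From_Station, To_Station)


def track_identity(track: Track) -> FrozenSet[str]:
    """Hypothetical identity function: the unordered pair of stations."""
    return frozenset(track)


def same_track_disjunctive(x: Track, y: Track) -> bool:
    """The disjunctive predicate needed when only value equality is available."""
    return (x[0] == y[0] and x[1] == y[1]) or (x[0] == y[1] and x[1] == y[0])


ab: Track = ("Station_A", "Station_B")
ba: Track = ("Station_B", "Station_A")

equal_by_value = ab == ba                                    # tuples differ attribute by attribute
same_by_identity = track_identity(ab) == track_identity(ba)  # same stretch of track
same_by_predicate = same_track_disjunctive(ab, ba)           # the longhand equivalent
```

Here the two tuples are unequal by value yet denote the same track under the assumed identity function, and the disjunctive predicate is what a key based on value equality alone would force you to write out.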

And: Schedule < Train, From_Station, To_Station, Start, End >

What's the 'key' here? How do you differentiate tuples based on
their values?

But having posed the DD&L approach as a solution I'll simply ask you
what to do with spatial concepts, where the prime subset of a
relation's elements (those elements upon which all other columns are
functionally dependent) includes a polygon. Or with graph data values?
Or algebraic functions as keys?

How on earth, given RM Prescription #22 (which says you can only have
the ordinal operators, <, <=, =, >=, >, !=) do you expect to handle the
enormous range of problems where identity and distinction are more
complex than value equality or representational equivalence?

Quote:
I would also do away with domain operators, domain
inheritanc (the whole ghastly edifice of the POSREP)

Ops... again, POSREPs have little to do with domain inheritance...
they certainly don't depend on inheritance, and are useful in its absence.
What do you have against them?
Yes. That's what I said. Putting "the whole ghastly edifice of the
POSREP" in parentheses was intended to avoid the association you
might have read had I written ",..., domain inheritanc [sic] and the
whole ghastly edifice of the POSREP".

Quite why a type ought to care any more about how it is presented
than it does about how it is used escapes me.

Quote:
[1] Codd, E. F. "The Relational Model for Database Management: Version 2"
Addison-Wesley. 1990.

(Note that this work pre-dates the publication of the Manifesto by a whole
four years, pre-empts many of TTM's novelties (though not all)

Which ones?
Codd elaborates the whole 'data type is domain' question at length, he
presents a model of view updateability, and he treats questions of
transitive closure and recursion with more rigor.

Quote:
BTW, TTM is not about novelties. It is about reaffirming the RM
as a solid foundation against the OO Newspeak, and applying it to the
needs that supposedly gave birth to OODBs.
By the time the Third Manifesto was published we had already had
ten years' worth of systems and theory research into these questions.
We had papers on extended types in relational languages [1][2], we had
theory work [3][4].

And once again, don't get me wrong. TTM is a really good piece of
work. But it is not gospel. And it does the authors and their ideas no
service to disparage those who are unaware of their ideas, or to avoid
engaging with the material in a thoughtful, critical way.

KR

Pb


[1] Stonebraker and L.A. Rowe (eds). "The POSTGRES Papers." Technical
Memorandum UCB/ERL M86/85, ERL, College of Engineering, UC Berkeley,
June, 1987.

[2] Sylvia L. Osborn, T. E. Heaven: The Design of a Relational
Database System with Abstract Data Types for Domains. TODS 11(3):
357-373 (1986)

[3] Carlo Zaniolo: The Database Language GEM. SIGMOD Conference 1983.

[4] R. Fagin, "A Normal Form for Relational Databases That is Based on
Domain and Keys," ACM Transactions on Database Systems, Volume 6, pp.
387-415


#153
Paul G. Brown

Re: Temporal operations - 09-07-2003, 02:59 AM



"Bob Badour" <bbadour (AT) golden (DOT) net> wrote


Quote:
Perhaps he expects users to make do with the only data type explicitly
required by Codd, i.e. boolean.
Yes. As a matter of fact, he does.

There is true. And there is false. (And there is unknown). For the purposes
of automating reason everything else is superfluous. I don't know if what
you're storing in my DBMS is true or false, and I sure don't know what
isn't stored there. I can only try to ensure that the way you manage and
manipulate it is consistent with the laws of logic.

Quote:
I don't know whether he takes things out of context so much as he
demonstrates a profound incomprehension of "independent of" as in
"orthogonal to", which seems ironic while citing mathematical logic. He says
the equivalent to "extending the X-axis with the Y-axis". The Y-axis does
not extend the X-axis. It adds a whole new dimension while leaving the
original intact.
There are no axes in formal systems. There are axioms. Adding new axioms
extends the set of axioms. This is what the term means.

Do you have any argument here? Any refutation? Or are you obliged to
attribute to me things I did not say in order to support a position you
cannot articulate and are unable to defend?


Quote:
*correct* the model by introducing a distinction between relations and an
entirely new thing called a relvar

What is the problem in making clear a distinction that was already
implicit? Or don't you think the distinction between value and variable
is worth making?

I can understand an objection to proliferating jargon. Our industry has far
too much of that already, but I find this specific jargon useful because the
symbols "relation variable" and "relation value" look too similar. Relation
and relvar are shorter, clearer symbols when written and when spoken.
Jargon is language. We are talking axioms here. We are talking about
the logical model: the building blocks of reason. Your choice of notation
is entirely up to you. I only want consistency in the way you use your
terms.

What is this distinction between 'relation', 'relation variable' and
'relation value'? It is not a part of mathematical logic (this is a
statement of fact, easily refuted if you can offer a *single* example from
the literature, which contradicts it). I further assert that it is a
confusing and unnecessary distinction in terms of implementing a
DBMS, and provided an example (the relation 'greaterthanorequal<>') which is
neither a value nor a variable. Do you have any logical argument or even
a coherent observation to make about the example?

Quote:
But to broaden the theme: why is it illegitimate to 'extend' or 'pervert'
the relational model with 3VL but OK to add 'domain inheritance' (OO
Prescriptions 2 & 3)?

3VL is easily the most contentious point in the relational field,
I wouldn't spend much arguing this in the current loaded context.

It is perfectly legitimate for one to create a 3VL domain and provide
explicit operations that evaluate to boolean: IsKnown, IsUnknown,
IsKnownTrue, IsKnownFalse etc. It is not legitimate to pollute the logical
data model with redundant structural elements like NULL or references.
Wow. From TRUE, FALSE, and UNKNOWN, you have multiplied entities to include
IsTrue, IsNotTrue, IsKnown, IsUnknown, IsKnownFalse. You sound like
Donald Rumsfeld! Want to add 'IsKnownKnown', 'IsUnknownKnown', and
'IsUnknownUnknown'?

But seriously, you're conceding the point. "The relational model needs
no blah blah . ." was the original D&D diktat! Well, apparently, it
does. But I take the argument seriously. The relational model is something
you extend at great hazard. TRUE, FALSE and UNKNOWN. All else
is incoherence.

[ snip ]

Quote:
How can you compare both, unless you're letting emotions take over
your thinking?

Maybe, he just doesn't know any better. Sometimes, it is more effective to
just answer the question.
What, in one sentence, was the question? If I'm supposed to be answering
one then at least I have a right to know what it is. In performing the
comparison I can see no distinction. It is up to the person making the
claim to justify it.

Quote:
Why does a description of a logical model include
two Prescriptions about transactions? (RP# 17 and OO # 5)?

I don't know. I suggest he contact the authors and ask them. If I had to
speculate, I would guess they included RM Prescription #17 because data
management requires managed concurrent update from all logical data models,
and I would guess they included OO Prescription #5 as a requirement for good
language design.
"If you had to speculate." "Ask the authors."

I don't speculate. I read the text. It says nothing about language in
RP # 17. In OP # 5 it says that the reason you want to do this is to avoid
certain SQL mistakes. *Then* they go on to point out that the "mistake" is
related to the way the SQL standard says "leave it up to the implementation",
implying that whatever practical difficulties arise are not SQL mistakes
at all!

The DBMS products with which I am most familiar provide a SQL compliant and
a non-SQL compliant flavor. Is the transaction support in SQL a mistake?
Maybe. But it is apparently something the vendors got right. Are you gonna
say that? (And if you do I have a medium, rusty freighter loaded down with
really angry people who would beg to differ.)

But once again, what is this doing in the logical model? That's the point.
Please address the question I raised, not the strawman you think D&D
were demolishing.

[ snip ]

Quote:
And if (as they may) D&D remove transactions from their view of the model
does this constitute a 'correction'? Using words like 'pervert' amps up the
rhetoric but adds nothing of substance to the argument.

Again, if a system will support transactions or simply simultaneous
operations (D&D's new proposition) is hardly as essential as making the
distinction between values and variables, domains and objects, and logical
and physical...

It might constitute an evolutionary step in our understanding of the
relational model. It might indicate a better understanding of how to meet
the more fundamental requirement of managed concurrent update. Or they might
just decide to keep it in. After all, it is one thing to discourage
procedural specifications for complex data changes, and it is quite another
thing to make it impossible to safely specify them.
THE RELATIONAL MODEL NEEDS NO EXTENSION, CORRECTION, SUBSUMATION or
PERVERSION. (Their words!)

Oh. But it does. It needs transactions.

But wait! They're going! Why? Is it because they're an extension? Or
because they're a perversion? Or because they subsume too much? I don't
care. Transactions define the quality of service guarantees a DBMS might
make wrt changes. They don't matter squat to the logical model.

"Discourage?" Formal systems are sets of axioms. You don't "discourage"
an axiom. It's like discouraging the tide. Or gravity.

Sloppy language, sloppy thinking. I am having fun.

Quote:
Leaving aside for a moment the point that if the U_ operators are
shorthand for simpler operators why go to the trouble

Ooops... I won't go into details now for lack of the book by my side
now, but it seems you are misrepresenting stuff here.

Because the longhand is cumbersome and error-prone. I assume you cut out part
of his nonsense.
Nonsense? (Note: ad hominem, unsustained argumentation). Care to provide
a single piece of evidence or an argument to sustain that assertion?

I gave specific criticisms and a chain of reasoning. I quoted from the
source material. And the best you can do is to say "nonsense"? Care to
contest any of the citations? See any problem with the logic? Can you
enlighten us as to any dodgy assumptions underlying the thinking? They
are there, Bob. All over the place.

Quote:
[further partial excerpts snipped]

I would also do away with domain operators, domain
inheritanc (the whole ghastly edifice of the POSREP)

Ops... again, POSREPs have little to do with domain inheritance...
they certainly don't depend on inheritance, and are useful in its absence.
What do you have against them?

I wonder what Brown wants to do with data types if they have no operations.
Obviously, Brown does not value data independence, but he is a mindless
vendor after all so I don't really expect much.
While there was no question mark there, Bob's sentence was a question,
so I'll take that as evidence of some kind of growth.

Well, take away domain operators, and what is left? Domains, and relations.

Consider Equals. Pretty unambiguous, huh? But I want you to note that we
call 'equal' a 'relation' in mathematical logic. The term 'operator' is
usually reserved for group manipulation.

So, here is an example of what the alternative looks like:

CREATE COMPUTED RELATION Equal ( Domain INTEGER, Range INTEGER );

CREATE COMPUTED RELATION Equal ( Domain String, Range String );

RANGE OF Identity ( Domain, Range ) OVER
RETRIEVE ( Domain, Range ) FROM Equal ( INTEGER, INTEGER );

RANGE OF Identity ( Domain, Range ) OVER
RETRIEVE ( Domain, Range ) FROM Equal ( String, String );

CREATE REAL RELATION Parts ( PartId INTEGER, Name String )
Identity ( PartId );

CREATE REAL RELATION Supplier ( SuppName String,...)
Identity ( SuppName );

CREATE REAL RELATION PS ( PartId INTEGER, Supplier String, Price Money )
Identity ( PartId, Supplier ),
Exists ( Supplier ( SuppName, ...), Equal ( SuppName, Supplier )),
Exists ( Parts ( PartId, ...), Equal ( PartId, PS.PartId));

This is a relation with a 'key' identity, being two elements for which
the 'identity' attribute holds in a conjunction. Note the paired rules
which replace 'foreign keys'.

Let's see what happens with a slightly more involved example:

CREATE COMPUTED RELATION Identity ( Domain INTERVAL, Range INTERVAL );

CREATE COMPUTED RELATION Within ( Domain String, Range INTERVAL );

-- NOTE: No Equal here. Although there might be:

RANGE Overlap ( Domain, Range ) OVER
RETRIEVE ( Domain, Range ) FROM Identity ( Domain INTERVAL, Range INTERVAL);

ALTER RELATION PS ADD ATTRIBUTE ( When INTERVAL );

ALTER RELATION PS DROP Identity,
ADD Identity ( PartId, Supplier, When );

Q1: Show me who supplied part "5" at date "D"? (Values always begin
their lives as a keystroke. This is not true, but it serves as an axiom
for now.)

RETRIEVE ( SuppName )
FROM PS ( PartId INTEGER, Supplier String, Price Money,
When INTERVAL),
Equal ( PartId, "5"),
Overlap ( Domain, PS.When ) AS O1,
Within ( O1.Domain, "D" );

What, then, constitutes the behavior of a domain? Nothing more than the
set of relations within which its use is indicated. (For those of you
who know about the limits to Datalog, I do too, but the point here is
that we don't need 'domain operators', not about the problem raised by
quantifiers and negation in this class of language.)

You can wrap all kinds of semantic sugar around this. I'm the
"mindless vendor" here (another ad hominem there). But it is liberating to
reason about the implementation using this model, as opposed to a model
where the domain's behavior is fixed and assumed.

For example, properties of binary relations -- such as commutativity --
fall out of the definition of the relation. And with other properties
such as transitivity and reflexivity I can automate the exploration of a
plan space.
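
As a small illustration of properties being checked from the relation's definition, here is a Python sketch. Everything in it is an assumption for illustration: a binary relation is modelled as a finite set of pairs, INTEGER is cut down to range(4), and the commutativity mentioned above is read as symmetry of the relation.

```python
from itertools import product
from typing import Iterable, Set, Tuple

Pair = Tuple[int, int]


def is_symmetric(rel: Set[Pair]) -> bool:
    # (a, b) in rel implies (b, a) in rel
    return all((b, a) in rel for (a, b) in rel)


def is_reflexive(rel: Set[Pair], domain: Iterable[int]) -> bool:
    # every element of the domain is related to itself
    return all((a, a) in rel for a in domain)


def is_transitive(rel: Set[Pair]) -> bool:
    # (a, b) and (b, d) in rel imply (a, d) in rel
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


domain = range(4)  # a tiny finite stand-in for INTEGER
equal = {(a, b) for a, b in product(domain, repeat=2) if a == b}
gte = {(a, b) for a, b in product(domain, repeat=2) if a >= b}

equal_props = (is_symmetric(equal), is_reflexive(equal, domain), is_transitive(equal))
gte_props = (is_symmetric(gte), is_reflexive(gte, domain), is_transitive(gte))
```

On this toy domain, Equal comes out symmetric, reflexive and transitive, while greater-than-or-equal is reflexive and transitive but not symmetric. A planner could in principle use facts like these to decide which rewrites are safe.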

Quote:
[1] Codd, E. F. "The Relational Model for Database Management: Version 2"
Addison-Wesley. 1990.

(Note that this work pre-dates the publication of the Manifesto by a whole
four years, pre-empts many of TTM's novelties (though not all)

Which ones?

BTW, TTM is not about novelties. It is about reaffirming the RM
as a solid foundation against the OO Newspeak, and applying it to the
needs that supposedly gave birth to OODBs.

Well said. Their whole exercise started as an exploration of the features of
object orientation that might improve the relational model and ended by
observing that object orientation had nothing to offer.
I offered an argument. It has the form of the following syllogism.

o Date & Darwen said "the relational model needs no . . ."

o Date & Darwen promptly did " . . ."

o Therefore, Date & Darwen are inconsistent.

Having provided the syllogism I then provided the evidence. I showed where
they said what they said, and refuted your claim that the quotation was
out of context. I then showed examples of where they had apparently committed
the offenses they were banning. At every step I have acknowledged the
merits of their work, I have emphasized my respect for it, and I have
demonstrated my knowledge of it through extensive quotation and critical
thinking.

Let me quote something you just said back to you:

Quote:
Well said. Their whole exercise started as an exploration of the features of
object orientation that might improve the relational model and ended by
observing that object orientation had nothing to offer.
"Of course, we do believe that object technology has important contributions
to make to the database management field as well; indeed, our remarks
above on Object/Relational systems were meant to suggest as much".

This is from page 1, paragraph 2 of the Manifesto. Once again, you are
either lying (knowingly uttering an untruth) or else you have a shaky grasp
of written comprehension.

Let us know which alternative you decide best fits the facts.

KR

Pb


  #154  
Old   
--CELKO--
 
Posts: n/a

Default Re: missing information and aggregates - 09-07-2003 , 02:45 PM



Quote:
It is obvious that aggregates should return a non-null value at
least in cases like these:

SELECT SUM (sal) FROM Emp WHERE 1 = 0;

the correct result is 0, not the NULL (SQL standard?) <<

Why is it obvious? Consider the set {1,-1} (or any set that sums to
zero from actual values). Why would you want a set operation SUM()
that cannot tell the difference between the empty set and these
non-empty sets?
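
The distinction being argued over can be observed directly. The sketch below uses Python's built-in sqlite3 module as a stand-in for "an SQL engine"; the one-column Emp table and its two rows are assumptions made for the example, and other products may differ in detail.

```python
import sqlite3

# Hypothetical Emp table; only the column name sal comes from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (sal INTEGER)")
conn.executemany("INSERT INTO Emp (sal) VALUES (?)", [(1,), (-1,)])

# SUM over zero rows: SQLite follows the standard and yields NULL (None in Python).
sum_empty = conn.execute("SELECT SUM(sal) FROM Emp WHERE 1 = 0").fetchone()[0]

# SUM over the non-empty set {1, -1}: an actual zero.
sum_zero = conn.execute("SELECT SUM(sal) FROM Emp").fetchone()[0]

print(sum_empty, sum_zero)
```

Here sum_empty comes back as None while sum_zero is the integer 0, so under the standard behavior the two cases stay distinguishable.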

Quote:
I find it strange, arbitrary and inconsistent that COUNT(*) returns
0 for zero rows but SUM(1) returns NULL

COUNT(*) is a bad notation; it is the cardinality operator on the set
as a whole. All the other aggregate functions are done with values
from rows (elements) inside the table (set). The syntax should have
been something like

CARD(SELECT * FROM .. WHERE...)
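
For reference, the asymmetry complained about in the quote is easy to reproduce. This is a minimal sketch using Python's sqlite3 module; the empty Emp table is an assumption for the example, and the derived-table COUNT(*) is only an approximation of the CARD(...) syntax suggested above, which no SQL product is claimed to support.

```python
import sqlite3

# Hypothetical, deliberately empty Emp table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (sal INTEGER)")

# COUNT(*) and SUM(1) evaluated over the same zero-row result.
count_star, sum_one = conn.execute(
    "SELECT COUNT(*), SUM(1) FROM Emp WHERE 1 = 0"
).fetchone()

# Cardinality of a whole query result, written as COUNT(*) over a derived table.
cardinality = conn.execute(
    "SELECT COUNT(*) FROM (SELECT * FROM Emp WHERE 1 = 0)"
).fetchone()[0]

print(count_star, sum_one, cardinality)
```

COUNT(*) gives 0 and SUM(1) gives NULL, which fits the reading that COUNT(*) measures the set as a whole while SUM works on the values inside it.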

Quote:
... claimed the answer to these aggregates over zero rows is
undefined, and he quoted a passage from _Concrete Mathematics_ by
Graham, Knuth and Patashnik a page or two before the passage that
explicitly states the answer is defined as the identity element. <<

The Knuth notation (which he got from Iverson) replaces the use of a
sequence for an index of summation on a series with a set
characteristic function. In the Knuth notation, all the terms come
into existence and are added "all at once"; the old notation due to
Fourier was a FOR loop, which generated each sum in a series.

In short, Knuth's notation follows the SQL model. Knuth stated that
making the empty set equal to 1 under all operations was a convention
that made some of the summations easier to manipulate. Instead of
explicitly excluding undefined terms, convert them to the identity
element instead. You can have some real problems with sets that are
hard to define, like all Primes or the (3n+1) sets, etc.

Quote:
SELECT MAX(sal) FROM Emp WHERE 1 = 0;
the correct result is minus infinity, not the NULL. I would accept the
minimum value of the data type, which is a close enough approximation
to minus infinity. The other option is an underflow exception. <<

Never return an actual value for a missing/error/unknown marker when
it could be meaningful -- someone will do math with it and not know it
is a value marker. Centura (nee Gupta) returned minus infinity in
some of its operations, but they are the only one I know that did.
While it might complicate things a bit, the IEEE floating point specs
have some NaN (Not a Number) configurations that would be standard.


  #155  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: missing information and aggregates - 09-07-2003 , 10:58 PM



joe.celko (AT) northface (DOT) edu (--CELKO--) wrote in message news:<a264e7ea.0309071145.43d64bf8 (AT) posting (DOT) google.com>...

[...]

Quote:
You can have some real problems with sets that are
hard to define, like all Primes or the (3n+1) sets, etc.

I don't think it is the set itself that causes trouble, it is the
operator on the set.


/Lennart

[...]


  #156  
Old   
Mikito Harakiri
 
Posts: n/a

Default Re: missing information and aggregates - 09-08-2003 , 12:59 PM



"--CELKO--" <joe.celko (AT) northface (DOT) edu> wrote

Quote:
It is obvious that aggregates should return a non-null value at
least in cases like these:

SELECT SUM (sal) FROM Emp WHERE 1 = 0;

the correct result is 0, not the NULL (SQL standard?)

Why is it obvious? Consider the set {1,-1} (or any set that sums to
zero from actual values). Why would you want a set operation SUM()
that cannot tell the difference between the empty set and these
non-empty sets?
Empty sets aren't special. We don't want to distinguish them, we just want
to extend the definition of SUM to empty sets in a way that is
mathematically consistent.

The definition where the query

select sum_empty+sum_nonempty from (
select sum(sal) sum_empty from emp where 1=0
), (
select sum(sal) sum_nonempty from emp where 1!=0
)

returns a non-null answer meets my math consistency criteria. The "standard"
definition that returns null does not. (Note that the query above returns
the same answer no matter what the predicate is, since both sets are
complementary. It's just silly if it breaks when one set is empty.)
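
The query above can be tried out as follows. This is a minimal sketch using Python's sqlite3 module; the two sample salaries are assumptions for the example. The second variant wraps each SUM in COALESCE(..., 0), which is one common workaround and not something proposed in this thread.

```python
import sqlite3

# Hypothetical emp table with two sample salaries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (sal INTEGER)")
conn.executemany("INSERT INTO emp (sal) VALUES (?)", [(100,), (200,)])

# The query from the post, with explicit AS aliases.
standard_sql = """
SELECT sum_empty + sum_nonempty FROM
  (SELECT SUM(sal) AS sum_empty FROM emp WHERE 1 = 0),
  (SELECT SUM(sal) AS sum_nonempty FROM emp WHERE 1 != 0)
"""

# Same query, but an empty SUM is mapped to the additive identity 0.
coalesced_sql = """
SELECT sum_empty + sum_nonempty FROM
  (SELECT COALESCE(SUM(sal), 0) AS sum_empty FROM emp WHERE 1 = 0),
  (SELECT COALESCE(SUM(sal), 0) AS sum_nonempty FROM emp WHERE 1 != 0)
"""

standard_total = conn.execute(standard_sql).fetchone()[0]
coalesced_total = conn.execute(coalesced_sql).fetchone()[0]

print(standard_total, coalesced_total)
```

With the standard behavior the NULL from the empty half propagates through the addition and the total is lost (None); with the identity element substituted, the two halves add up to the sum over the whole table (300).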

Quote:
I find it strange, arbitrary and inconsistent that COUNT(*) returns
0 for zero rows but SUM(1) returns NULL

COUNT(*) is a bad notation; it is the cardinality operator on the set
as a whole. All the other aggregate functions are done with values
from rows (elements) inside the table (set). The syntax should have
been something like

CARD(SELECT * FROM .. WHERE...)
The standard notation is certainly ugly. It makes little sense to fix it, as
count is redundant anyway.

Quote:
In short, Knuth's notation follows the SQL model. Knuth stated that
making the empty set equal to 1 under all operations was a convention
that made some of the summations easier to manipulate. Instead of
explicitly excluding undefined terms, convert them to the identity
element instead. You can have some real problems with sets that are
hard to define, like all Primes or the (3n+1) sets, etc.
I would like to see some examples in SQL notation if you want to explore
this topic further.

Quote:
SELECT MAX(sal) FROM Emp WHERE 1 = 0;
the correct result is minus infinity, not the NULL. I would accept the
minimum value of the data type, which is a close enough approximation
to minus infinity. The other option is an underflow exception.

Never return an actual value for a missing/error/unknown marker when
it could be meaningful -- someone will do math with it and not know it
is a value marker. Centura (nee Gupta) returned minus infinity in
some of its operations, but they are the only one I know that did.
While it might complicate things a bit, the IEEE floating point specs
have some NaN (Not a Number) configurations that would be standard.
What is max(NaN, 5)? I certainly know that max("minus infinity", 5) = 5.
Note that you must have 2 infinity symbols to be able to handle cases like
this, not just a single one.
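
To make the question concrete, here is a small Python sketch. It shows how Python floats (which are IEEE 754 doubles) and Python's built-in max behave; it is not a statement about what the IEEE specification itself prescribes for a max operation, nor about any SQL product.

```python
import math

nan = float("nan")
neg_inf = -math.inf

# Minus infinity is ordered below every finite number, so max is well defined.
max_with_neg_inf = max(neg_inf, 5)

# NaN is unordered: every ordering comparison involving it is False,
# and it is not even equal to itself.
nan_comparisons = (nan > 5, nan < 5, nan == nan)

# As a consequence, Python's built-in max depends on argument order.
max_nan_first = max(nan, 5)  # keeps nan, because 5 > nan is False
max_nan_last = max(5, nan)   # keeps 5, because nan > 5 is False

print(max_with_neg_inf, nan_comparisons, max_nan_first, max_nan_last)
```

So minus infinity behaves like a value that takes part in the ordering, while NaN behaves like a marker that a comparison-based max cannot handle consistently.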




  #157  
Old   
Mikito Harakiri
 
Posts: n/a

Default Re: missing information and aggregates - 09-08-2003 , 02:04 PM



"Mikito Harakiri" <mikharakiri (AT) ywho (DOT) com> wrote

Quote:
While it might complicate things a bit, the IEEE floating point specs
have some NaN (Not a Number) configurations that would be standard.

What is max(NaN, 5)? I certainly know that max("minus infinity", 5) = 5.
Note that you must have 2 infinity symbols to be able to handle cases like
this, not just a single one.
Well, given that

"minus infinity" = - "plus infinity"

one symbol is enough. Anyhow, I prefer a smart symbol that is aware of
arithmetic and returns reasonable answers to a dumb one that has a
universal simplistic answer "Don't know" to every math problem.
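
A minimal Python sketch of the "smart symbol" idea: each aggregate is a fold that starts from the identity element of its operator, so the empty case needs no special handling. The function names agg_sum, agg_max and agg_min and the sample lists are assumptions made for the example.

```python
import math
from functools import reduce


def agg_sum(values):
    # 0 is the identity element of addition.
    return reduce(lambda a, b: a + b, values, 0)


def agg_max(values):
    # Minus infinity is the identity element of max.
    return reduce(max, values, -math.inf)


def agg_min(values):
    # Plus infinity, the negation of minus infinity, is the identity of min.
    return reduce(min, values, math.inf)


# Hypothetical sample data: an empty set and a non-empty one.
left, right = [], [3, 5, 4]

# Aggregating the parts and combining gives the same answer as
# aggregating the union, even though one part is empty.
print(agg_sum(left) + agg_sum(right), agg_sum(left + right))
print(max(agg_max(left), agg_max(right)), agg_max(left + right))
```

The design choice is that the result for an empty input is an ordinary value that keeps combining correctly, at the cost of no longer signalling "there were no rows".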



