dbTalk Databases Forums  

Re: foundations of relational theory?

comp.databases.pick

#1  Seun Osewa - 10-13-2003, 03:54 PM






dwolt (AT) iserv (DOT) net (Dawn M. Wolthuis) wrote in message news:<6db906b2.0310121537.5892c25b (AT) posting (DOT) google.com>...
Quote:
"Anith Sen" <anith (AT) bizdatasolutions (DOT) com> wrote

BTW, it is hard to provide quick references to rectify every conceptual
misunderstanding and to counter each fallacious MV argument. Generally,
it just warrants the invocation of the Principle of Incoherence.

If the "XML doc specifiers" are proposing an alternative data model, yes,
they are taking a path, which was proven wrong already, but they aren't
aware of it yet.

PROVEN? I would LOVE to see the proof -- I have begged to see a
proof. There IS NO PROOF OF WHICH I AM AWARE, or can you point me to
one? In fact, persisting data with a model that is influenced by an
understanding of language (since the idea is that we not just store
it, but also retrieve it again) does employ a more complex structure
by some measures (mathematical relations, for example), but it does so
for a reason and, as I understand it, there is empirical evidence from
contests performed over a decade ago (so we need to do these again) to
show that systems that persist data with XML-like models (that is what
PICK is) provide a more agile development environment.

Coming from a DBMS background, I was surprised to find that, in my
experience, my developers who used PICK-based systems also had a more
agile environment for maintaining applications over time. That,
however, is anecdotal evidence and therefore requires some empirical
evidence or mathematical proof before I claim that I KNOW it to be
the case that it is always better to persist data outside of the RDBMS
model.
[.....]

Ok, I think what we need to do is come up with an illustrative
example. A particular simple application which Dawn can use to
demonstrate the advantages of the MV model. And then our knowledgeable
relational model advocates can use the same example to "debunk" her
claims. And perhaps I can use it to show how some of my PlayDB
concepts might prove useful.

Dawn is not thinking in terms of "Large Shared Data Banks" that must
survive several generations of applications. She is thinking in terms
of productivity of application developers, who are also responsible
for the data model. The relational model is designed for flexibility
and data that is used for a long time by a host of different
applications.

So, in summary, how about a sample application which we use to compare
the models?

Seun Osewa


#2  Dawn M. Wolthuis - 10-13-2003, 10:29 PM






seunosewa (AT) inaira (DOT) com (Seun Osewa) wrote in message news:<ba87a3cf.0310131254.66a641ba (AT) posting (DOT) google.com>...
Quote:
Ok, I think what we need to do is come up with an illustrative
example. A particular simple application which Dawn can use to
demonstrate the advantages of the MV model. And then our knowledgeable
relational model advocates can use the same example to "debunk" her
claims. And perhaps I can use it to show how some of my PlayDB
concepts might prove useful.

Dawn is not thinking in terms of "Large Shared Data Banks" that must
survive several generations of applications. She is thinking in terms
of productivity of application developers, who are also responsible
for the data model. The relational model is designed for flexibility
and data that is used for a long time by a host of different
applications.

So, in summary, how about a sample application which we use to compare
the models?

Seun Osewa

Good idea, Seun. One thing that might be of note is that many
PICK/MultiValue applications have survived well over 20 years -- NOT
JUST THE DATABASE, BUT THE ENTIRE APPLICATION! I was just asked to
review a D3 (PICK) application that was written in 1981 (and
maintained and enhanced from 1982 through 2003 as well). The software
company is going to stick with an MV database solution and even retain
some of their existing data structures. This is NOT the exception -- it is
the NORM in the MV space. That is because of the ability to make
significant changes to applications without ditching them completely,
and THAT is because of the approach used for persisting data. So,
while I agree that the relational model was intended to provide
flexibility and maintainability for large amounts of data over time
and that it DOES provide for application-development-environment
independence, that is also true of other models such as PICK.

The big difference is where one codes the database integrity
constraints and whether they are all encoded as global constraints or
permit local constraints. In PICK, the same development environment
is used to encode integrity constraints as is used to build an
application, which means that a developer can change them to match new
application requirements. While this makes a typical DBA shudder,
there is simply no need for a typical DBA within the PICK environment
-- it runs lean and mean. If a new application will be using the same
persisted data, but not the same development environment, then
integrity constraints must be encoded in the new environment too.
That sounds terrible at the outset, but ends up acknowledging that
constraints mean nothing outside of the context of an application
anyway -- what one DOES with the data is part of what defines it.

Each new application ends up with many LOCAL constraints -- those that
were not present or might even contradict what was coded as a global
constraint when the data structures were initially defined. Within
the relational model, application developers must encode local
constraints and must also get any contradicting global constraints
changed, AND the applications that originally forced these global
constraints must now encode them locally, or else they will be gone
entirely from that application (was that clear enough?).

Once it is clear that data is not useful in a vacuum, just as a human
memory is not useful outside of the I-O processing of the brain, then
the data and the applications can be seen as parts of the whole,
without such a separation as that introduced with (old-fashioned)
databases, such as network, hierarchical, or relational. PICK is also
old-fashioned, but within PICK, code is really data (source and object
code are items in files, just like any other database "record") and
data is really code (it is typeless/strings as stored data, but shown
as a type -- such as a date -- for output purposes; also you define
derived data as you would other application logic) and the data is
constrained by code, not by some database specs.

Take something seemingly simple like Names (perhaps including former
names) and phone numbers (or others can choose something) and we can
compare how this would be handled by the database plus an application
used to maintain such data and report against it in both an RDBMS and
in PICK. I'm game to try, and if I can't find the time, I can see if
some real PICK developers could help me out. Alternatively or
additionally, we might want to see how the Berkeley DB-XML contest
turns out -- there should be some good applications coming out of that
where a relational database is nowhere in sight.

--dawn


#3  Jonathan Leffler - 10-14-2003, 01:38 AM



Dawn M. Wolthuis wrote:

Quote:
[...snip...] but ends up acknowledging that
constraints mean nothing outside of the context of an application
anyway -- what one DOES with the data is part of what defines it.
[...snip...]
Can you clarify what you mean by application? Is it a single program
executable, or a suite of programs forming an integrated unit? Or
something else...

So, from your point of view, each application (under any meaning
defined in answer to the previous question) defines and enforces its
own set of constraints on the data? And if more than one application
needs to use the data (developed, for the purposes of this discussion,
by different teams of programmers), how do the developers in any one
team ensure that the rules that they apply agree with the rules
applied by all the other teams? And when the people in one team need
to apply new rules to the existing data, how do they ensure that the
same rules are applied by all the other teams?

Your (heavily snipped) discussion mentioned local constraints and
global constraints and conflicts between the two. But I'm not sure I
followed how an MV system avoids problems between two mutually
incompatible views of the constraints on the data - could you clarify
that?

One of the things that vexes me in the context of databases
(especially those where the DBMS is in a separate process from the
application) is how an application can remain up to date with the
validation that is performed by the DBMS. The DBMS *must* (IMO) check
the data that it is asked to store for validity according to the rules
it has been told to enforce. (Obviously, if the DBMS has no rules to
enforce, any data is valid, but that is seldom going to be the case.)
The application obtains information from the user and would like to
be able to validate before presenting it to the DBMS for storage.
Applying basic rules (like domain constraints) is not usually a
problem, but more complex rules such as referential integrity, etc., are
much harder to deal with if the rules change.

--
Jonathan Leffler #include <disclaimer.h>
Email: jleffler (AT) earthlink (DOT) net, jleffler (AT) us (DOT) ibm.com
Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/



#4  Albert D. Kallal - 10-14-2003, 04:41 AM



"Jonathan Leffler" <jleffler (AT) earthlink (DOT) net> wrote

Quote:
Dawn M. Wolthuis wrote:

[...snip...] but ends up acknowledging that
constraints mean nothing outside of the context of an application
anyway -- what one DOES with the data is part of what defines it.
[...snip...]

Can you clarify what you mean by application? Is it a single program
executable, or a suite of programs forming an integrated unit? Or
something else...

So, from your point of view, each application (under any meaning
defined in answer to the previous question) defines and enforces its
own set of constraints on the data? And if more than one application
needs to use the data (developed, for the purposes of this discussion,
by different teams of programmers), how do the developers in any one
team ensure that the rules that they apply agree with the rules
applied by all the other teams?

Good question. When you look at the old dBase programs of the 1980s, they
all ran fine, and of course any constraints in the data had to be in the
code. However, who wants to go back 20 years in the IT industry?

In the case of MV systems, there is no question that some data constraints do
exist in the code. Thus, there is for sure the problem of communicating the
rules to the developers and, of course, the obvious problem of discovering
those rules existing in the code. Thus, each application will, and often
does, have some constraints in the application. We would all agree this is
not good. However, even in applications today, we often see little or no
use of data engine constraints anyway. The bulk of applications that use
MySQL are a good example of this. Not to be unkind to the many old dBase
programs, or MySQL programs, but the fact is that many applications don't
need many constraints anyway. The popular success of dBase years ago, and
the popularity of MySQL today, proves this (MySQL has not had any engine
constraints for very long). Need we mention the popularity of
Excel? It is perhaps the most widely used tool for managing data (this is
not good...but is a fact of life!).

However, having noted this lack of constraints, I did say that a "portion"
of the constraints do exist in MV databases. In fact, you actually get a LOT
of RI (referential integrity) for free in an MV database. The other poster
mentioned there are LOTS of MV databases that have survived more than 20
years. One of the reasons for this is that MV systems did, and do,
incorporate a good deal of RI by their design, something that most systems
20 years ago did not have. While dBase programs have mostly gone by the
wayside, MV systems continue to function today.

Let me give you an example of what I mean by constraints for free:

Let's assume we have the classic invoice. That means a bunch of fields like
date, time, customer, method of shipping, etc. (you know...the whole deal
here). The other thing you need, of course, is the invoice details (the
classic related child table with stuff like product number, quantity,
description, unit price, etc.). In MV land, that "set" of invoice details is
defined in the dictionary. The result of this MV set is a number of RI
constraints for free:

Some of them are:
** If the invoice record is deleted, then the invoice details are also
deleted for you. This is simply the way the MV systems have worked for the
last 30 years. You get this feature for free.


** You can't add or save invoice details without first creating an invoice
ID. Again, this RI feature comes for free in an MV system. The design of the
system is such that you can't even write code, or save the record to disk,
without first creating the invoice ID.

** You don't have to create or maintain an index that connects the invoice
and the invoice details together. This also means that you do not need to
read multiple child records here, since the invoice is in fact one record.
This fact increases performance by a large amount and eliminates the need
for a relational join.

** For each line of invoice details you add, no foreign key is needed to
relate back to the parent record (the invoice record). Again, this feature
is free as a result of the design. Virtually all modern SQL-based systems
today STILL require some code to set this foreign key value. We don't
have to do this for MV data sets. It is amazing that modern systems still
require the developers to set this key value. Note that RI might prevent the
developers from adding a foreign key that does not exist, but the developers
and the code still have to set this foreign key value. In the case of a
multi-valued set of data...we don't have to code this, or even have to use a
foreign key! Thus, in some cases, I can argue that MV systems require less
code for RI.

So, you actually do get a lot of RI for free in an MV system. In fact, that
invoice may have several related sets of data, and all the above free benefits
apply. So, if I have a list of who edited or viewed the invoice, and then
we delete the invoice, those lists also get deleted just like the invoice
details do. Again, no special coding is required.
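
The contrast described above can be sketched roughly in Python. This is not PICK or D3 code: a plain dict with a nested list stands in for an MV item, an in-memory SQLite database stands in for a SQL system, and every table, column, and variable name here is hypothetical.

```python
import sqlite3

# MV-style stand-in: one record per invoice, with the details nested inside it.
mv_file = {
    "INV1001": {
        "customer": "ACME",
        "details": [
            {"product": "P-1", "qty": 2},
            {"product": "P-2", "qty": 1},
        ],
    }
}

# Deleting the invoice removes its details too; they never existed apart from it.
del mv_file["INV1001"]

# SQL-style stand-in: details live in a child table keyed back to the parent.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE invoice (id TEXT PRIMARY KEY, customer TEXT)")
conn.execute(
    "CREATE TABLE invoice_detail ("
    " invoice_id TEXT NOT NULL REFERENCES invoice(id) ON DELETE CASCADE,"
    " product TEXT, qty INTEGER)"
)
conn.execute("INSERT INTO invoice VALUES ('INV1001', 'ACME')")
# The application has to supply the foreign key value on every detail row.
conn.executemany(
    "INSERT INTO invoice_detail VALUES (?, ?, ?)",
    [("INV1001", "P-1", 2), ("INV1001", "P-2", 1)],
)

# A detail row that points at a nonexistent invoice is rejected by the constraint.
try:
    conn.execute("INSERT INTO invoice_detail VALUES ('NOPE', 'P-9', 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The cascade has to be declared, but once declared the DBMS applies it.
conn.execute("DELETE FROM invoice WHERE id = 'INV1001'")
remaining_details = conn.execute(
    "SELECT COUNT(*) FROM invoice_detail"
).fetchone()[0]
```

In the nested form there is no key to set and no cascade to declare; in the SQL form both exist, but they are stated once in the schema rather than in each program.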

As mentioned, not all RI is free in an MV system, but a surprising amount is.
In fact, the problem is that MOST of the MV community does not realize the
above! It is only when you spend time in SQL land that you realize the above
points.

Since the MV system thus has a good deal of RI for free, this is
no doubt one reason why MV systems have stood the test of time.

Further, and even more important, the addition of these child data sets
was far easier to implement in MV systems than in most systems, and
existing data AND CODE DID not have to be modified when a "child data set"
was added.

For example, many old systems only needed one phone number field. Today, we
have cell phones, pagers, etc. (the list is huge). In an MV system, these
additional phone numbers can be added, BUT EXISTING REPORTS and even code
can still function. In a SQL relational system, all of the reports, code, and
even the data entry screens will have to be EXTENSIVELY modified to add
additional phone fields. I am assuming that those additional fields would, of
course, be moved into a child table and related back to the parent table.
This large change in design does not occur in an MV system. Thus, again,
this is why so many are pointing out the benefits for long-term
maintenance, and why the MV system has stood the test of time.
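
A small Python sketch of the phone number point, under the assumption that the phone field was multi-valued from the start. A list stands in for the MV field, and `phone_report` and the sample data are hypothetical names used only for illustration.

```python
def phone_report(customers):
    """Return one report line per customer, listing every phone value stored."""
    lines = []
    for name, record in sorted(customers.items()):
        lines.append(f"{name}: {', '.join(record['phones'])}")
    return lines


# The original system stored a single phone number per customer.
customers = {"Jones": {"phones": ["555-0100"]}}
before = phone_report(customers)

# Later a second number is added; the report function itself is untouched.
customers["Jones"]["phones"].append("555-0199")
after = phone_report(customers)
```

The report keeps working only because it already treats the field as a set of values; a report written against a single scalar column would need the kind of rework described above.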

--
Albert D. Kallal
Edmonton, Alberta Canada
kallal (AT) msn (DOT) com
http://www.attcanada.net/~kallal.msn





#5  Paul Vernon - 10-14-2003, 05:30 AM



"Dawn M. Wolthuis" <dwolt (AT) iserv (DOT) net> wrote

[snip]
Quote:
That is because of the ability to make
significant changes to applications without ditching them completely
and THAT is because of the approach used for persisting data.

I suggest it is the tight integration between application and
database that exists in a (nowadays isolated) environment such as PICK that
allows 'significant changes to applications' to be made relatively easily.
It is not due to the 'approach used for persisting data' (or at least,
relational would be even better if only we had good relational systems that
tightly integrate with applications. Maybe Dataphor is the best example we
currently have).

In other words, the 'data sub-language' idea of SQL is arguably the biggest
hindrance to easy schema evolution in SQL systems. This is absolutely not a
problem of the Relational Model itself, however. In fact, the logical data
independence of the relational model provides ample support for schema
evolution; it's just that no one has done any kind of semi-decent
implementation, AFAIK.


[snip]
Quote:
In PICK, the same development environment
is used to encode integrity constraints as is used to build an
application, which means that a developer can change them to match new
application requirements.
[snip]
While this makes a typical DBA shudder,
there is simply no need for a typical DBA within the PICK environment
-- it runs lean and mean.
[snip]
If a new application will be using the same
persisted data, but not the same development environment, then
integrity constraints must be encoded in the new environment too.
That sounds terrible at the outset, but ends up acknowledging that
constraints mean nothing outside of the context of an application
anyway -- what one DOES with the data is part of what defines it.

All of the above are arguments about the pros and cons of particular
implementations; they say nothing about the inherent capabilities of
different logical models of data (which is what we are meant to be talking
about, yes?).

Having said that, I agree that relational implementations should exhibit the
kinds of things you mention above. In particular, not *requiring* a DBA would
be a huge step forward for implementations of the relational model.

[snip]
Quote:
Once it is clear that data is not useful in a vacuum, just as a human
memory is not useful outside of the I-O processing of the brain, then
the data and the applications can be seen as parts of the whole,
without such a separation as that introduced with (old-fashioned)
databases, such as network, hierarchical, or relational.
Just to be clear, the relational (or network or hierarchical..) models
themselves do not mandate any particular large or small separation between
applications and data. It is only that some implementations of them choose
to do so.


Regards
Paul Vernon
Business Intelligence, IBM Global Services




#6  Paul Vernon - 10-14-2003, 06:18 AM



"Albert D. Kallal" <NOOSSPAMkallal (AT) msn (DOT) com> wrote

[snip]
Quote:
For example, many old systems only needed one phone number field. today, we
have cell phones, pagers etc. (the list is huge). In a mv system, these
additional phone numbers can be added, BUT EXISTING REPORTS and even code
can still function. In a sql relational system, all of the reports, code and
even the data entry screens will have to be EXTENSIVELY modified to add
additional phone fields. I am assuming that those additional fields would of
course be moved into a child table and related back to the parent table.

Agreed, although that is a SQL problem (e.g., lack of updatable views, lack
of relation-valued attributes, etc.) rather than a problem of the relational
model.

Good post though.

Regards
Paul Vernon
Business Intelligence, IBM Global Services




#7  Mecki Foerthmann - 10-14-2003, 07:56 AM




"Jonathan Leffler" <jleffler (AT) earthlink (DOT) net> wrote

<snip>
Quote:
And if more than one application
needs to use the data (developed, for the purposes of this discussion,
by different teams of programmers), how do the developers in any one
team ensure that the rules that they apply agree with the rules
applied by all the other teams? And when the people in one team need
to apply new rules to the existing data, how do they ensure that the
same rules are applied by all the other teams?

Communication!




#8  Bob Badour - 10-14-2003, 08:17 AM



"Jonathan Leffler" <jleffler (AT) earthlink (DOT) net> wrote


[responses to dawn's idiotic nonsense snipped]

Quote:
One of the things that vexes me in the context of databases
(especially those where the DBMS is in a separate process from the
application) is how an application can remain up to date with the
validation that is performed by the DBMS. The DBMS *must* (IMO) check
the data that it is asked to store for validity according to the rules
it has been told to enforce. (Obviously, if the DBMS has no rules to
enforce, any data is valid, but that is seldom going to be the case.)
The application obtains information from the user and would like to
be able to validate before presenting it to the DBMS for storage.
Applying basic rules (like domain constraints) is not usually a
problem; but more complex rules such as referential integrity etc are
much harder to deal with if the rules change.
One need only write well-behaved applications that communicate database
errors to users. At worst, the user will experience a slight delay when the
user enters data that would corrupt the data according to some unanticipated
future requirement. One is left with the freedom to perform a simple
cost-benefit analysis: Does the benefit of removing the delay in these
exceptional circumstances outweigh the cost of updating the application?
Alphora addresses this problem by reducing the cost of updating the
application through automatic application generation.

Contrast the cost-benefit analysis when applications enforce constraints
instead: Does the benefit of trusting one's data exceed the cost of updating
all affected applications? Does the benefit of a new application exceed
either the cost of corrupt data or the cost of updating all existing
applications?
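
A minimal Python sketch of the "well-behaved application" idea, using an in-memory SQLite database as a stand-in DBMS. The `contact` table, its CHECK rule, and the `save_phone` helper are hypothetical, chosen only to show the application relaying a database error instead of duplicating the rule.

```python
import sqlite3


def save_phone(conn, name, phone):
    """Try to store a row; return None on success or a user-facing message."""
    try:
        with conn:  # commits on success, rolls back if the statement fails
            conn.execute(
                "INSERT INTO contact (name, phone) VALUES (?, ?)", (name, phone)
            )
        return None
    except sqlite3.IntegrityError as exc:
        # The application does not know the rule; it only reports the refusal.
        return f"Could not save {name}: {exc}"


conn = sqlite3.connect(":memory:")
# The constraint is declared once, in the database.
conn.execute(
    "CREATE TABLE contact ("
    " name TEXT PRIMARY KEY,"
    " phone TEXT NOT NULL CHECK (length(phone) >= 7))"
)

ok = save_phone(conn, "Jones", "555-0100")
rejected = save_phone(conn, "Smith", "123")
stored_rows = conn.execute("SELECT COUNT(*) FROM contact").fetchone()[0]
```

If the rule later changes, only the table definition changes; `save_phone` keeps relaying whatever the database refuses, at the cost of the user learning about the problem after submitting rather than before.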




#9  Bob Badour - 10-14-2003, 08:18 AM



"Albert D. Kallal" <NOOSSPAMkallal (AT) msn (DOT) com> wrote

Quote:
"Jonathan Leffler" <jleffler (AT) earthlink (DOT) net> wrote in message
news:nRMib.566$7a4.75 (AT) newsread4 (DOT) news.pas.earthlink.net...
Dawn M. Wolthuis wrote:

[...snip...] but ends up acknowledging that
constraints mean nothing outside of the context of an application
anyway -- what one DOES with the data is part of what defines it.
[...snip...]

Can you clarify what you mean by application? Is it a single program
executable, or a suite of programs forming an integrated unit? Or
something else...

So, from your point of view, each application (under any meaning
defined in answer to the previous question) defines and enforces its
own set of constraints on the data? And if more than one application
needs to use the data (developed, for the purposes of this discussion,
by different teams of programmers), how do the developers in any one
team ensure that the rules that they apply agree with the rules
applied by all the other teams?

Good question. When you look at the old dbase programs of the 1980's, they
all ran fine and of course any constraints in the data had to be in the
code. However, who wants to go back 20 years in the IT industry?

In the case of MV systems there is no question that some data constraints do
exist in the code. Thus, there is for sure the problem of communicating the
rules to the developers. And, of course the obvious problem of discovering
those rules existing in the code. Thus, each application will, and often
does have some constraints in the application. We would all agree this is
not good.

Enough said.

[superfluous and contradictory rationalizations snipped]


Quote:
Lets me give you an example of what I mean by constraints for free:

Nothing is free. We have already established that pick imposes large
unnecessary costs just to express simple concepts.

[wordy elaboration predicated on false assumption snipped]




#10  Bob Badour - 10-14-2003, 08:27 AM



"Mecki Foerthmann" <meckif (AT) gmx (DOT) net> wrote

Quote:
"Jonathan Leffler" <jleffler (AT) earthlink (DOT) net> wrote in message
news:nRMib.566$7a4.75 (AT) newsread4 (DOT) news.pas.earthlink.net...
snip
And if more than one application
needs to use the data (developed, for the purposes of this discussion,
by different teams of programmers), how do the developers in any one
team ensure that the rules that they apply agree with the rules
applied by all the other teams? And when the people in one team need
to apply new rules to the existing data, how do they ensure that the
same rules are applied by all the other teams?

Communication!

Have you read Fred Brooks' _The Mythical Man-Month_ ? As the number of
applications grows linearly, the communication problem grows quadratically.
However, there remains a single database.
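
The quadratic growth can be made concrete with the pairwise count n(n-1)/2 that Brooks uses for coordination among n parties. The sketch below applies it to application teams, which is an assumption about how the argument carries over; the function names are hypothetical.

```python
def pairwise_channels(n):
    """Number of distinct pairs among n teams that must agree with each other."""
    return n * (n - 1) // 2


def links_to_single_database(n):
    """Number of links if every team coordinates only with one shared database."""
    return n


# Compare the two growth rates for a few team counts.
comparison = {
    n: (pairwise_channels(n), links_to_single_database(n))
    for n in (2, 5, 10, 20)
}
```

For 20 applications that is 190 team-to-team agreements against 20 links to one database holding the rules.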



