dbTalk Databases Forums  

Re: Dreaming About Redesigning SQL

comp.databases.object comp.databases.object


Discuss Re: Dreaming About Redesigning SQL in the comp.databases.object forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Seun Osewa
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-07-2003 , 01:07 AM






One question that I think must be looked into is this: If SQL
databases are successful today, is it because:
** of the relational _model_ they are based on?

** of the ease with which SQL can be used from within all programming
languages and as an interactive query language?

** The failure or earlier models and the support of major SQL database
vendors once it reached critical mass of adoption?

In other words do we have the model, the language, or standardisation
to blame/praise for the popularity of the relational model?

I would also like to know the classical arguments against the network
model or other "pointer based" models. The only things I know are
that:

** using pointers to positions in memory or disk can be messy when
data has to be moved around. But then is seems there are several
simple ways to solve this, e.g. what I can only call "logical
pointers".

** The difficulty of performing adhoc queries. but I want to think
that if there is a procedural (query?) language many of the advanced
features of SQL e.g. group by and sorting of the data can be done with
user-generated procedural code. Then I observe that most databases in
the world today are being accessed by asp, php or perl web scripts and
only recieve adhoc queries during the development stage. These
queries are written by developers who are skilled enough in procedural
programming not to have problems if they needed such code to access
the data.

Other than that, what's wrong with the network (or similar) models? I
have not found a good link to such a discussion though I hear it
repeated that there are certain classical arguments against them.

Seun Osewa

Lee Fesperman <firstsql (AT) ix (DOT) netcom.com> wrote

Quote:
Seun Osewa wrote:
Sometimes I wonder why its so important to model data in the "rela-
tional way", to think of data in form of sets of tuples rather than
tables or lists or whatever. I mean, though its elegant and based
on mathematical principles I would like to know why its the _right_
model to follow in designing a DBMS (or database). The way my mind
sees it, should we not rather be interested in what works?

Relational is the _right_ model because 'it works'. It's the only truly comprehensive
data model and subject of decades of research. All other data models have been found to
be flawed and (nearly) discarded.

Reply With Quote
  #2  
Old   
Seun Osewa
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-07-2003 , 04:19 AM






What do you think about the so-called "Associative Model of Data"

"The associative model divides the real-world things about which data
is to be recorded into two sorts: Entities are things that have
discrete, independent existence. An entity's existence does not depend
on any other thing. Associations are things whose existence depends on
one or more other things, such that if any of those things ceases to
exist, then the thing itself ceases to exist or becomes meaningless."

Whitepaper(its ok to skip over the hype):
http://www.lazysoft.com/docs/other_d...whitepaper.pdf

==Unrelated to the above, but somewhat motivated by it==
I am beginning to think there may a place for a database model that
consists of the most commonly used subset of relational, SQL or
aiming-to-be-relational databases. For example, I may want to make a
generalization of the observation that so many tables begin with
CREATE TABLE XXX (ID INT AUTO_INCREMENT PRIMARY KEY ... ) unless there
is a natural unique key, and that those that don't are often used
merely to indicate many-to-many relationships: CREATE TABLE COUPLES
(MAN_ID INT REFERENCES MEN, WOMAN_ID INT REFERENCES WOMEN).

Now even in the case where there is a natural primary key for a table
(hope my terminology is right) e.g. a social security number for a
PERSONS table, a product part number (like METPLAS012A) for a
PRODUCT_PARTS table, etc, if we can figure a way to generate an
integer primary key (from 0 to 2^32 or 2^64 or as the case may be),
then its easy to think of a technique related to hashing that
eliminates the need for another index on the "natural primary key" and
makes lookups/joins for the table using the "natural" key as fast as
operations using the "meaningless" integer key.

Sometimes I think that, while the the separation of logical from
physical model is a nice concept, it should not prevent those who
understand the underlying physical implementation and the data access
patterns of their target application from tweaking things to achieve
acceptable performance... things they would not have to tweak if their
performance was ok by default. I want to compare it to things like
C++ inline keyword, which help to make things faster in the case where
the programmer knows a little bit more than the compiler. I want to
say if the application designer knows a lower-level way to get his
applications to run faster, give him the chance, that a good language
should give power to the user who wants to go the extra mile.

So pls help me point out any flaws there might be in my logic!

A million thanks,

Seun Osewa.

Reply With Quote
  #3  
Old   
Jan Hidders
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-07-2003 , 02:37 PM



Seun Osewa wrote:
Quote:
What do you think about the so-called "Associative Model of Data"
Looks like a slightly scrambled and rather limited version of ORM.
Certainly nothing new there.

Quote:
==Unrelated to the above, but somewhat motivated by it==
I am beginning to think there may a place for a database model that
consists of the most commonly used subset of relational, SQL or
aiming-to-be-relational databases.
Has already been done. See RM/T by Codd.

Quote:
Sometimes I think that, while the the separation of logical from
physical model is a nice concept, it should not prevent those who
understand the underlying physical implementation and the data access
patterns of their target application from tweaking things to achieve
acceptable performance...
That depends on your situation. If we are talking about a large database
with many users and/or applications that access it then it should be the
DBA that decides what is optimized for whom so he or she gets to decide
what exactly the underlying physical implementation is. In that case the
programmer does not and cannot know what the underlying physical
implementation is. The whole idea of data independence is that the DBA
is able to change this without the application noticing this, except for
a difference in performance, of course.

-- Jan Hidders



Reply With Quote
  #4  
Old   
Jan Hidders
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-07-2003 , 02:51 PM



Seun Osewa wrote:
Quote:
In other words do we have the model, the language, or standardisation
to blame/praise for the popularity of the relational model?
Don't forget market domination.

Quote:
I would also like to know the classical arguments against the network
model or other "pointer based" models. The only things I know are
that:

** using pointers to positions in memory or disk can be messy when
data has to be moved around. But then is seems there are several
simple ways to solve this, e.g. what I can only call "logical
pointers".
Correct. There's absolutely no reason to believe that you cannot have
data independence with logical pointer or references. I would however
argue that allowing entities without representable keys is not a good idea.

Quote:
** The difficulty of performing adhoc queries. but I want to think
that if there is a procedural (query?) language many of the advanced
features of SQL e.g. group by and sorting of the data can be done with
user-generated procedural code.
No. Procedural query language are a very big no no, because you should
let the database choose how to optimize the queries and not let the
users choose some optimizations on assumptions about how they think the
data is stored.

-- Jan Hidders




Reply With Quote
  #5  
Old   
Anthony W. Youngman
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-07-2003 , 04:27 PM



In article <ba87a3cf.0310062207.1d09cff6 (AT) posting (DOT) google.com>, Seun Osewa
<seunosewa (AT) inaira (DOT) com> writes
Quote:
One question that I think must be looked into is this: If SQL
databases are successful today, is it because:
** of the relational _model_ they are based on?
This model, because it's mathematically provable, gives Computer
Scientists the warm fuzzies ...
Quote:
** of the ease with which SQL can be used from within all programming
languages and as an interactive query language?
This is an effect, not a cause. SQL is a naff query language - it's a
mathematical notation (which is why computer scientists like it) but an
absolute bummer for normal people.
Quote:
** The failure or earlier models and the support of major SQL database
vendors once it reached critical mass of adoption?
Well, with IBM behind it, it wasn't hard to reach critical mass ...

Like most things, once the ball starts rolling, it's hard to stop. The
fact that it may (or may not) be crap tends to be irrelevant :-)
Quote:
In other words do we have the model, the language, or standardisation
to blame/praise for the popularity of the relational model?
I think we have what is called the "network effect" - in other words,
it's popular because it's popular. I know that's a circular argument,
but in the real world that sort of argument works.
Quote:
I would also like to know the classical arguments against the network
model or other "pointer based" models. The only things I know are
that:
The relational advocates believe that because their model is
"mathematical" it must be better than anything else. And all too often
they convince themselves that theirs is the only model based on maths -
when in fact most of the others are too ...
Quote:
** using pointers to positions in memory or disk can be messy when
data has to be moved around. But then is seems there are several
simple ways to solve this, e.g. what I can only call "logical
pointers".

** The difficulty of performing adhoc queries. but I want to think
that if there is a procedural (query?) language many of the advanced
features of SQL e.g. group by and sorting of the data can be done with
user-generated procedural code. Then I observe that most databases in
the world today are being accessed by asp, php or perl web scripts and
only recieve adhoc queries during the development stage. These
queries are written by developers who are skilled enough in procedural
programming not to have problems if they needed such code to access
the data.
Jan thinks the DBA should determine how the database should be
optimised. As a Pickie, I'm used to believing that a DBA shouldn't be
necessary. Pick *doesn't* *have* a query optimiser - because it doesn't
*need* one. (It's hard to optimise something that's 97% efficient before
optimisation.)

Let's design a Pick database. Let's normalise it. And store it in the
Pick data structure. I can declare one "table" per entity (with no
attribute or relationship tables - I don't need them). Each Pick FILE is
equivalent to a relational entity-view.

So, given that I know the primary key of the entity I want to access, it
takes me just ONE disk seek to retrieve EVERYTHING about that entity.
Its attributes. Its relationships. THE LOT. Now do you see why I say a
query optimiser is a waste of time?
Quote:
Other than that, what's wrong with the network (or similar) models? I
have not found a good link to such a discussion though I hear it
repeated that there are certain classical arguments against them.

There's nothing wrong with other models. The thing about relational is
that it is designed to make "ad hoc" "easy". No one query is harder or
easier than any other. The problem is that by reducing all queries to
the same difficulty, the "easy" queries are made far harder than they
need to be.

Things like hierarchical databases make easy queries easy. The snag is
they leave hard queries hard.

But let me leave you with this - why are DB2, MS SQL-server, Oracle all
adding multi-dimensional features to their relational databases? I'll
tell you. It's because they work - and Pick has had exactly that from
the day it was designed nearly 40 years ago. And the typical company
that runs Pick as its database spends roughly half (as a percentage of
turnover) what a relational-based company spends on its database.

Cheers,
Wol
--
Anthony W. Youngman <pixie (AT) thewolery (DOT) demon.co.uk>
'Yings, yow graley yin! Suz ae rikt dheu,' said the blue man, taking the
thimble. 'What *is* he?' said Magrat. 'They're gnomes,' said Nanny. The man
lowered the thimble. 'Pictsies!' Carpe Jugulum, Terry Pratchett 1998
Visit the MaVerick web-site - <http://www.maverick-dbms.org> Open Source Pick


Reply With Quote
  #6  
Old   
Lee Fesperman
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-07-2003 , 04:32 PM



Jan Hidders wrote:
Quote:
Seun Osewa wrote:
I would also like to know the classical arguments against the network
model or other "pointer based" models. The only things I know are
that:

** using pointers to positions in memory or disk can be messy when
data has to be moved around. But then is seems there are several
simple ways to solve this, e.g. what I can only call "logical
pointers".

Correct. There's absolutely no reason to believe that you cannot have
data independence with logical pointer or references. I would however
argue that allowing entities without representable keys is not a good idea.
Incorrect. Even though you call them 'logical' pointers, they are still physical
artifacts and have no place in a truly logical view of the database. Databases are about
data, and pointers are not data (or meta-data).

This is very old news.

The OP needs to increase his knowledge of database concepts before he tries to 'fix'
things. A newsgroup is the wrong place to get any real depth of understanding.

--
Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
================================================== ============
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS (http://www.firstsql.com)


Reply With Quote
  #7  
Old   
Lauri Pietarinen
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-07-2003 , 05:44 PM



seunosewa (AT) inaira (DOT) com (Seun Osewa) wrote in message news:<ba87a3cf.0310062207.1d09cff6 (AT) posting (DOT) google.com>...
Quote:
One question that I think must be looked into is this: If SQL
databases are successful today, is it because:
** of the relational _model_ they are based on?
This is the main reason.

Quote:
** of the ease with which SQL can be used from within all programming
languages and as an interactive query language?
I think this "dual mode" concept was proposed by Codd, so in a sense
it is part of the "Relational Model"

Quote:
** The failure or earlier models and the support of major SQL database
vendors once it reached critical mass of adoption?
IMS (a hierarchical DBMS) and IDMS (a network, or CODASYL DBMS) were
VERY successful in the 70's and 80's. SQL databases had an uphill
battle.

Quote:
In other words do we have the model, the language, or standardisation
to blame/praise for the popularity of the relational model?
I think the model is the biggest issue. Standardisation has also
helped. Many say that QUEL was better than SQL, so maybe we could say
"inspite of SQL"

Quote:
I would also like to know the classical arguments against the network
model or other "pointer based" models. The only things I know are
that:

** using pointers to positions in memory or disk can be messy when
data has to be moved around. But then is seems there are several
simple ways to solve this, e.g. what I can only call "logical
pointers".
Well, that's what the relational model does: it uses "logical
pointers".

Quote:
** The difficulty of performing adhoc queries. but I want to think
that if there is a procedural (query?) language many of the advanced
features of SQL e.g. group by and sorting of the data can be done with
user-generated procedural code. Then I observe that most databases in
the world today are being accessed by asp, php or perl web scripts and
only recieve adhoc queries during the development stage. These
queries are written by developers who are skilled enough in procedural
programming not to have problems if they needed such code to access
the data.
If you take a standard SQL-query with sorting and grouping and some
joins and compare it with hand written "navigational" code you will
notice that what can be expressed in SQL in 5-10 lines of code will
require several pages of hand written code for equivalent result. It
is also very helpfull to pretest your SQL using a query editor, and
when you are convinced of the correctness of your SQL query you just
drop it in the program.

The third issue is that the SQL query can be optimised to a much
higher degree than procedural code. Say you add a new index. The
DBMS will/can immediately start using it without any user intervention
(if it decides to). In the procedural alternative you would have to
recode your query. The optimiser can also take into account the
cardinality of tables (=number of rows) so that it will produce a
different plan for a small database and a big database. An optimiser
can even be sensitive to input from users at run time, say you have

select name, salary
from nurses
where sex = ?

If the parameter given at run time is 'M' then it could be
advantageous to use an index, but not if parameter is 'F'. This
decision could be made by the optimiser "on the fly". I hope you get
the picture...

Quote:
Other than that, what's wrong with the network (or similar) models? I
have not found a good link to such a discussion though I hear it
repeated that there are certain classical arguments against them.
In 1974 there was the BIG DEBATE where Codd defended the RM against
network guys. I wonder if there is a transcript available?

I think one of Codds arguments was that the number of manipulative
operators needed in the RM (i.e. insert, update, delete) was much
lower than in CODASYL (connect, disconnect, etc etc...)

regards,
Lauri Pietarinen


Reply With Quote
  #8  
Old   
Paul G. Brown
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-08-2003 , 03:06 AM



Lee Fesperman <firstsql (AT) ix (DOT) netcom.com> wrote

Quote:
Jan Hidders wrote:

Seun Osewa wrote:
I would also like to know the classical arguments against the network
model or other "pointer based" models. The only things I know are
that:

** using pointers to positions in memory or disk can be messy when
data has to be moved around. But then is seems there are several
simple ways to solve this, e.g. what I can only call "logical
pointers".

Correct. There's absolutely no reason to believe that you cannot have
data independence with logical pointer or references. I would however
argue that allowing entities without representable keys is not a good idea.

Incorrect. Even though you call them 'logical' pointers, they are still physical
artifacts and have no place in a truly logical view of the database. Databases are about
data, and pointers are not data (or meta-data).
Erm . .

What's 'physical' about this?

RELATION Dept ( Id Dept_Id KEY, Name String );
RELATION Emp ( Id Emp_Id KEY, Dept REF(Dept),
Name PersonName, Salary Money );

RETRIEVE E.Name, DEREF(E.Dept).Name FROM Emp E;

RETRIEVE E.Name FROM Emp E, Dept D
WHERE DEREF(E.Dept).Id = E.Id AND D.Name = 'shoe';

This schema assumes a strict two-value logic, so Emp.Dept cannot, under any
circumstances what-so-ever, have anything in it except a 'logical reference'
to exactly one Dept tuple. How is this achieved under the covers? Who
cares! It's about the logical model.

(I should note that this formulation kind of makes my skin crawl for
aesthetic reasons, but its a REF()/DEREF() model that is essentially
syntactic sugar around well understood key and foreign key semantics.)

Not trying to start a flame war, really (though that's an easy thing to
do when you're as ignorant as I am). I'm just hoping that we can all
agree that it's the semantics of the model, not the syntax, which we ought
to care about. 'Logical pointers' (by which I mean this simple REF/DEREF,
and mindful of a memory I have to the effect that SQL-3 REF/DEREF are not)
are mappable to PK/FK, and to a wide variety of rules stated in terms of the
existance of a tuple with certain properties.

The advantage of predicate logic is that there are various kinds of
rules which can't be stated using REF/DEREF. And when you have a choice
between two models take the one with the broader expressive power.

KR

P "twit filtered since 2003!" b


Reply With Quote
  #9  
Old   
Jan Hidders
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-08-2003 , 03:53 AM



Lee Fesperman wrote:
Quote:
Jan Hidders wrote:

Correct. There's absolutely no reason to believe that you cannot have
data independence with logical pointer or references. I would however
argue that allowing entities without representable keys is not a good idea.

Incorrect. Even though you call them 'logical' pointers, they are
still physical artifacts and have no place in a truly logical view of
the database.
Logical pointers can be defined at the logical level and implemented in
various ways. They are just as much physical artifacts as, say,
relations are.

Quote:
Databases are about
data, and pointers are not data (or meta-data).
They carry information. That makes them data. I see no good reason to
use a more restrictive definition.

Quote:
The OP needs to increase his knowledge of database concepts before he
tries to 'fix' things.
His motivation may be wrong, but he asks the right questions.

Quote:
A newsgroup is the wrong place to get any real
depth of understanding.
I fully agree. If only because there seem to be so many in these
newsgroups that confuse "knowledge of database concepts" with "knowing
what Chris Date et al. say about them". If you want to know what the
real experts in industry and the research community think, these
newsgroups are certainly not representative.

-- Jan Hidders



Reply With Quote
  #10  
Old   
Seun Osewa
 
Posts: n/a

Default Re: Dreaming About Redesigning SQL - 10-08-2003 , 03:56 AM



Lee Fesperman <firstsql (AT) ix (DOT) netcom.com> wrote
Quote:
The OP needs to increase his knowledge of database concepts before he tries to 'fix'
things. A newsgroup is the wrong place to get any real depth of understanding.
In that case, I would like to know what is the _right_ way to get the
depth of understanding. At the moment due to where I live I only have
access to internet resources, so what do you suggest? Besides, I find
that many people who have all followed the right way disagree on a
number of issues. I know that there are certain classic books that
would help (and I would like to be pointed to some more), but there
must be some web resources I've missed?

The observation is appreciated, a little assistance in this area would
be appreciated even more ;-). How else can I get in touch with very
smart people?

Seun Osewa


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.